Bug #15872 large NOT IN (a,b,c,d) lists consume huge amounts of memory in DML queries
Submitted: 19 Dec 2005 22:56 Modified: 27 Apr 2006 14:53
Reporter: Domas Mituzas Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S1 (Critical)
Version:5.0.16 OS:Linux (Linux)
Assigned to: Sergey Petrunya CPU Architecture:Any

[19 Dec 2005 22:56] Domas Mituzas
Description:
Large UPDATE and DELETE queries using big NOT IN (...) lists cause serious memory leaks, even on empty tables. (1,2,...,3000) consumes 250MB, (1,2,...,3500) consumes 350MB, etc. 

Table types, field types are irrelevant here. Memory usage seems to be exponential. It does not happen on SELECTs. 

How to repeat:
 CREATE TABLE `test` (
  `id` int(11) NOT NULL auto_increment,
  `summa` tinyint(4) NOT NULL default '0',
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

UPDATE test SET summa=1 WHERE id NOT IN (1,2,3,...,10000); 
// mysqld should go out of memory here
[2 Feb 2006 16:23] Guy Baconnière
When do plan to resolve this critical issue ?

PHPBB a very popular forum can DoS any MySQL 5.0 server
using NOT IN (... 400k of IDs ...). By the way anyone flooding
a PHPBB sessions table online can make MySQL 5.0 use all
memory / resources... 

I have proposed to PHPBB Team a SQL alternative to NOT IN ()
see http://www.infomaniak.ch/phpbb2/phpBB2-MySQL5.txt
http://www.infomaniak.ch/phpbb2/phpBB2-MySQL5.patch

I hope you will fix this as soon as possible
[3 Feb 2006 9:46] Valeriy Kravchuk
Bug #17023 was marked as a duplicate of this one.
[9 Feb 2006 8:54] Valeriy Kravchuk
Bug #16041 was marked as a duplicate of this one. Time to increase priority, I think!
[1 Mar 2006 12:00] Valeriy Kravchuk
Bug #17790 and Bug #17992 were marked as duplicate of this one.
[14 Mar 2006 10:26] Dimitrij HIlt
This bug is present in mysql-5.0.19 too and is very critical, because lot of people has phpBB without any patches.

Dimi
[21 Mar 2006 14:29] Hartmut Holzgraefe
Duplicate report in bug #18387, there even leading to "out of memory" shutdowns
[3 Apr 2006 17:28] Bill Marrs
This bug took my websites down for about a day.  I upgraded to Fedora Core 5, which has MySQL 5.0 included in it (5.0.18, I think).  I upgraded from MySQL 4.1 to 5.0.  But, mysqld was crashing every few hours, usually with "Out of Memory" errors and causing many tables to need repairs.  I had to go back to MySQL 4.1 (which does not have this bug).  

I tracked down at least one of the crashes I was having to a large "DELETE ...NOT IN (...)" query.  When I run the DELETE query (which has about 20,000 UIDs in a NOT IN () list), I can watch the mysqld's memory usage grow and grow, until the machine starts paging its brains out, then processes fail to start because of lack or memory. Eventually mysqld crashes and restarts and table repairs begin.

I'm working on a perl script that will reproduce this.  Though, it's a bit tricky since a machine with enough memory may not crash.  But, the mysqld memory usage (a leak, I assume) is apparent regardless.

This bug is blocking me from upgrading to MySQL 5.0.
[17 Apr 2006 2:15] Morgan Tocker
Guy Baconnière commented on a workaround in his patch; use a left join.

DELETE FROM r 
USING phpbb_search_results AS r 
LEFT JOIN phpbb_sessions AS s 
   ON s.session_id = r.session_id  
WHERE 
   s.session_id IS NULL;
[25 Apr 2006 9:55] Dimitrij HIlt
Hi,

workaround is greate, but whats about a fix?

on Systems with lot of customer databases it is very difficult to use any workarounds.

Dimi
[25 Apr 2006 11:33] Heikki Hannikainen
Here's a script to reproduce the bug. Verified to crash 5.0.20a on Linux and 5.0.20 on FreeBSD. Doesn't crash 4.0 or 4.1.

http://he.fi/misc/incrash-mysql50.pl (only bug owner can attach files...)

"DELETE FROM $table WHERE col NOT IN (" . join(',', (1..10000)) . ")" (perl) makes mysqld eat over 1GB of memory, if there is an index (primary, unique or normal key) for the column, even if there are no rows in the table at all. If there is no index, it doesn't blow up.

This bug keeps us from upgrading to 5.0. It's hard to work around with a join, since the value set does not come from the database. "NOT IN (SELECT val FROM another_table)" would not blow the thing up.
[25 Apr 2006 17:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/5493
[25 Apr 2006 19:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/5505
[25 Apr 2006 20:32] Sergey Petrunya
Patch approved by Igor on irc
[25 Apr 2006 20:33] Sergey Petrunya
The fix has been pushed into 5.0.21 tree
[25 Apr 2006 20:44] Sergey Petrunya
Notes for the changelog:
The server used O(N^2) memory for queries that had "t.keypart NOT IN (const1, ... constN)"  in the WHERE clause. 
Now the server doesn't use O(N^2) memory anymore. Also, if the IN list has more than 10,000 elements, the "range" optimizer doesn't try to build a range access plan candidate for the entire sequence of intervals
(-inf < t.key < const_i1) OR (const_i1 < t.key < const_i2) OR  ... OR
   (const_iN < t.key < +inf)

but uses a short interval list instead:
 (-inf < t.key < const_i1) OR  (const_iN < t.key < +inf)
[27 Apr 2006 14:53] Paul DuBois
Noted in 5.0.21, 5.1.10 changelogs.

<literal>DELETE</literal> and <literal>UPDATE</literal>
statements that used large <literal>NOT IN
(<replaceable>value_list</replaceable>)</literal> clauses
could use large amounts of memory. (Bug #15872)
[15 Aug 2006 12:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/10472

ChangeSet@1.2274, 2006-08-15 16:54:02+04:00, sergefp@mysql.com +1 -0
  BUG#21282: Incorrect query results for "t.key NOT IN (<big const list>) 
  * In fix for BUG#15872, a condition of type "t.key NOT IN (c1, .... cN)"
    where N>1000, was incorrectly converted to
      (-inf < X < c_min) OR (c_max < X)
    which was not correct. Now this conversion is removed, we dont produce
    any range lists for such conditions.
[15 Aug 2006 17:08] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/10495

ChangeSet@1.2257, 2006-08-15 21:08:22+04:00, sergefp@mysql.com +3 -0
  BUG#21282: Incorrect query results for "t.key NOT IN (<big const list>) 
  In fix for BUG#15872, a condition of type "t.key NOT IN (c1, .... cN)"
  where N>1000, was incorrectly converted to
    (-inf < X < c_min) OR (c_max < X)
  Now this conversion is removed, we dont produce any range lists for such
  conditions.
[29 Aug 2006 13:23] Evgeny Potemkin
Fixed in 5.0.25
[4 Sep 2006 11:41] Evgeny Potemkin
Fixed in 5.1.12