Bug #42037 Queries containing a subquery with DISTINCT and ORDER BY could cause a server c
Submitted: 12 Jan 2009 1:08 Modified: 16 Feb 2009 2:12
Reporter: Sean Jenkins Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S1 (Critical)
Version:5.0.75, 4.1, 5.0, 5.1, 6.0 bzr OS:Linux (CentOS 4.7)
Assigned to: Gleb Shchepa CPU Architecture:Any
Tags: Bug #38191, distinct, memory leak, order by

[12 Jan 2009 1:08] Sean Jenkins
Description:
In reference to Bug #38191, this bug still occurs in 5.0.75 despite release notes indicating it resolved and verifying sources patched.  A standard install of Pligg (pligg.com) running the query "SELECT count(*) FROM pligg_pageviews,pligg_links WHERE `pv_type`='story' AND link_id IN(SELECT
DISTINCT link_id FROM pligg_links WHERE link_id=pv_page_id);" causes MySQL's memory to bloat leading to a server crash.

I realize the issue is not entirely similar, nor dissimilar, to the aforementioned bug, with the distinguishing difference being memory bloat leading to the server running out of memory and significant swapping.  Using a non-production server, I was able to trigger a 1 GB increase in memory by MySQL in under 2 seconds.

Linux box231.bluehost.com 2.6.27-6_1.intel.BHsmp #1 SMP Thu Nov 13 15:27:29 MST 2008 x86_64 x86_64 x86_64 GNU/Linux
QuadCore Intel(R) Xeon(R) CPU            5140  @ 2.33GHz
MemTotal:      8202080 kB
SwapTotal:     4192956 kB

[root@box231 root]# df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sdb1            140038192  42652604  95288436  31% /
none                   4101040         0   4101040   0% /dev/shm
/dev/sdc1            720911148   7786088 711027908   2% /home
/dev/sdd1            720918676    540272 718281252   1% /backup
/dev/sde1            720918676    510412 718311112   1% /backup/cpbackup/weekly
/ramdisk             140038192  42652604  95288436  31% /ramdisk._
/dev/ram0               253871     51603    202268  21% /ramdisk
/var/tmp             140038192  42652604  95288436  31% /var/tmp

/etc/my.cnf
[mysqld_safe]
err-log=/var/log/mysqld.log
log-error=/var/log/mysqld.log

[mysqld]
tmpdir=/ramdisk/mysql/tmp
table_cache=16384
query_cache_limit=32M
query_cache_size=256M
query_cache_type=1
max_connections=600
interactive_timeout=100
wait_timeout=60
connect_timeout=10
thread_cache_size=128
key_buffer=256M
join_buffer=4M
max_allowed_packet=32M
sort_buffer_size=4M
read_buffer_size=4M
read_rnd_buffer_size=4M
max_connect_errors=10
# Try number of CPU's*2 for thread_concurrency
thread_concurrency=8
innodb_thread_concurrency=8
myisam_sort_buffer_size=64M
safe-show-database
set-variable=max_user_connections=60
max_join_size=100000000
myisam_block_size=4096

log_slow_queries=/var/log/slow-queries.log
long_query_time=1
innodb_file_per_table=1
log-warnings=2
innodb_flush_log_at_trx_commit=0
innodb_max_purge_lag=100000
innodb_buffer_pool_size=256M
innodb_flush_method=O_DIRECT
max_heap_table_size=268435456
tmp_table_size=268435456
max_tmp_tables=1024

How to repeat:
A standard install of Pligg (pligg.com) running the query "SELECT count(*) FROM pligg_pageviews,pligg_links WHERE `pv_type`='story' AND link_id IN(SELECT
DISTINCT link_id FROM pligg_links WHERE link_id=pv_page_id);" causes MySQL's memory to bloat leading to a server crash.  A dump of a test DB can be provided as needed.
[12 Jan 2009 6:30] Sveta Smirnova
Thank you for the report.

Please send dump of the test database.
[12 Jan 2009 6:55] Sean Jenkins
I've uploaded the requested dump file bug-data-42037.zip along with README.  Please let me know if there's anything further I can provide.
[12 Jan 2009 9:30] Sveta Smirnova
Thank you for the report.

Verified mostly as described. In my case server does not crash, but query fails with "ERROR 5 (HY000): Out of memory at line 201, 'my_alloc.c'" (error message from version 5.1)

Workaround for 4.1 and 5.0:

SELECT COUNT( DISTINCT link_id )  FROM pligg_pageviews,pligg_links WHERE `pv_type`='story' ;
[12 Jan 2009 9:51] Sveta Smirnova
Sorry for typo: workaround from previous comment works with other versions as well.
[12 Jan 2009 10:45] Sveta Smirnova
reduced data for the bug

Attachment: bug42037.sql.tar.gz (application/x-gzip, text), 160.31 KiB.

[12 Jan 2009 18:05] Sean Jenkins
Although the work-around does happen to resolve the issue, and I can submit the suggested work-around to the developers of Pligg, because we are a shared hosting company with thousands of users, asking them all to update their code might prove to be cumbersome.

You stated you were able to recreate the problem in part on 5.1.  If I allow the query to run long enough, I can trigger the malloc.c errors as well, however it would be a while before the box itself is able to recover from all the swapping.  I apologize that I'm personally unable to provide a patch, but will there be a patch forthcoming for this issue for 5.0 and 5.1+?
[13 Jan 2009 1:24] Igor Babaev
Gleb, please analyse this bug and put the result of your analysis in bugsdb.
[13 Jan 2009 16:23] Gleb Shchepa
Minimized test case for "ulimit -v 50000" is:

--disable_warnings
DROP TABLE IF EXISTS t1, t2;
--enable_warnings

CREATE TABLE t1 (a INT);
CREATE TABLE t2 (b INT AUTO_INCREMENT PRIMARY KEY);
--disable_query_log
# 100*10 is enough to fail this test with "ulimit -v 50000"
let $i = 100;
while($i) {
  INSERT INTO t2 VALUES (), (), (), (), (), (), (), (), (), ();
  dec $i;
}
--enable_query_log
INSERT INTO t1 SELECT * FROM t2;

# Next query fails with error 5 (out of memory)
#--error 5 
SELECT count(*) FROM t1,t2 WHERE t2.b IN(SELECT DISTINCT t2.b FROM t2 WHERE t2.b=t1.a);
[13 Jan 2009 18:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/63158

2728 Gleb Shchepa	2009-01-13
      Bug #42037: Queries containing a subquery with DISTINCT and
                  ORDER BY could cause a server crash
      
      Dependent subqueries like
      
        SELECT COUNT(*) FROM t1, t2 WHERE t2.b
         IN (SELECT DISTINCT t2.b FROM t2 WHERE t2.b = t1.a)
      
      hogged memory or even caused server crash on tables with
      significant number of rows (for example, this query failed 
      on 1000 of records in every table and 50M of free
      virtual memory).
      
      Subselect engine resets JOIN structure of dependent inner
      query on every row of outer table. In this case restore_tmp()
      reset tmp_join->join_tab_reexec and tmp_join->table_reexec
      methods to NULL pointers and the next call to make_simple_join()
      reallocated them again and again.
      
      The JOIN::restore_tmp method has been modified to reuse
      previously allocated memory of join_tab_reexec and table_reexec.
      
      NOTE: this patch doesn't include standard test case (this is
      "out of memory" bug). See bug #42037 page for test cases.
[16 Jan 2009 23:19] Sean Jenkins
Confirmed patch resolves issue on 5.0.67 and 5.0.75.  Marking closed.
[19 Jan 2009 7:07] Sveta Smirnova
Bug #42174 was marked as duplicate of this one.
[19 Jan 2009 9:08] Sveta Smirnova
Sean,

thank you for the testing. Please, don't "close" this bug, because we need to do additional actions: test patch if it is of high quality, push to all related trees and document this change before really close the report.
[29 Jan 2009 10:35] Gleb Shchepa
Memory hog lives at make_simple_join(), see 2 allocations below:

  if (!join->table_reexec)
  {
    if (!(join->table_reexec= (TABLE**) join->thd->alloc(sizeof(TABLE*))))
      DBUG_RETURN(TRUE);                        /* purecov: inspected */
    if (join->tmp_join)
      join->tmp_join->table_reexec= join->table_reexec;
  }
  if (!join->join_tab_reexec)
  {
    if (!(join->join_tab_reexec=
          (JOIN_TAB*) join->thd->alloc(sizeof(JOIN_TAB))))
      DBUG_RETURN(TRUE);                        /* purecov: inspected */
    if (join->tmp_join)
      join->tmp_join->join_tab_reexec= join->join_tab_reexec;
  }

Stack dump is:

#0  make_simple_join (join=0x8781928, tmp_table=0x874c438) at sql_select.cc:5430
#1  0x0826767e in JOIN::exec (this=0x8757358) at sql_select.cc:1927
#2  0x0819f138 in subselect_single_select_engine::exec (this=0x8750c30) at item_subselect.cc:1876
#3  0x0819ccaa in Item_subselect::exec (this=0x8750b80) at item_subselect.cc:212
#4  0x0819d5c6 in Item_in_subselect::val_bool (this=0x8750b80) at item_subselect.cc:775
#5  0x08141568 in Item::val_bool_result (this=0x8750b80) at item.h:654
#6  0x0817045e in Item_in_optimizer::val_int (this=0x8751be0) at item_cmpfunc.cc:1536
#7  0x082689dc in SQL_SELECT::skip_record (this=0x8758cf0) at opt_range.h:692
#8  0x08248f52 in flush_cached_records (join=0x8750cd8, join_tab=0x8758a68, skip_last=false) at sql_select.cc:10946
#9  0x0824906e in sub_select_cache (join=0x8750cd8, join_tab=0x8758a68, end_of_records=true) at sql_select.cc:10509
#10 0x0824852b in sub_select (join=0x8750cd8, join_tab=0x87588e8, end_of_records=true) at sql_select.cc:10648
#11 0x082508a2 in do_select (join=0x8750cd8, fields=0x8751a60, table=0x0, procedure=0x0) at sql_select.cc:10449
#12 0x08268165 in JOIN::exec (this=0x8750cd8) at sql_select.cc:2109
#13 0x08263ab5 in mysql_select (thd=0x8717398, rref_pointer_array=0x8718458, tables=0x874fee8, wild_num=0, fields=@0x87183c8, conds=0x8750b80, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2156153344, result=0x8750cc0, unit=0x8718104, select_lex=0x8718334) at sql_select.cc:2288
#14 0x0826848a in handle_select (thd=0x8717398, lex=0x87180ac, result=0x8750cc0, setup_tables_done_option=0) at sql_select.cc:257
#15 0x081f548f in mysql_execute_command (thd=0x8717398) at sql_parse.cc:2789
#16 0x081fdaeb in mysql_parse (thd=0x8717398, inBuf=0x874fca0 "SELECT count(*) FROM t1,t2 WHERE t2.b IN(SELECT DISTINCT t2.b FROM t2 WHERE t2.b=t1.a)", length=86, found_semicolon=0xb73d01b0) at sql_parse.cc:6266

JOIN::restore_tmp() overwrites JOIN::tmp_join->table_reexec and JOIN::join->join_tab_reexec with empty pointers from this->table_reexec and this->join_tab_reexec:

void JOIN::restore_tmp()
{
  memcpy(tmp_join, this, (size_t) sizeof(JOIN));
}

Stack trace is:

#0  JOIN::restore_tmp (this=0x8757358) at sql_select.cc:1487
#1  0x0824fea4 in JOIN::reinit (this=0x8757358) at sql_select.cc:1523
#2  0x0819ef37 in subselect_single_select_engine::exec (this=0x8750c30) at item_subselect.cc:1829
#3  0x0819ccaa in Item_subselect::exec (this=0x8750b80) at item_subselect.cc:212
#4  0x0819d5c6 in Item_in_subselect::val_bool (this=0x8750b80) at item_subselect.cc:775
#5  0x08141568 in Item::val_bool_result (this=0x8750b80) at item.h:654
#6  0x0817045e in Item_in_optimizer::val_int (this=0x8751be0) at item_cmpfunc.cc:1536
#7  0x082689dc in SQL_SELECT::skip_record (this=0x8758cf0) at opt_range.h:692
#8  0x08248f52 in flush_cached_records (join=0x8750cd8, join_tab=0x8758a68, skip_last=false) at sql_select.cc:10946
#9  0x0824906e in sub_select_cache (join=0x8750cd8, join_tab=0x8758a68, end_of_records=true) at sql_select.cc:10509
#10 0x0824852b in sub_select (join=0x8750cd8, join_tab=0x87588e8, end_of_records=true) at sql_select.cc:10648
#11 0x082508a2 in do_select (join=0x8750cd8, fields=0x8751a60, table=0x0, procedure=0x0) at sql_select.cc:10449
#12 0x08268165 in JOIN::exec (this=0x8750cd8) at sql_select.cc:2109
#13 0x08263ab5 in mysql_select (thd=0x8717398, rref_pointer_array=0x8718458, tables=0x874fee8, wild_num=0, fields=@0x87183c8, conds=0x8750b80, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2156153344, result=0x8750cc0, unit=0x8718104, select_lex=0x8718334) at sql_select.cc:2288
#14 0x0826848a in handle_select (thd=0x8717398, lex=0x87180ac, result=0x8750cc0, setup_tables_done_option=0) at sql_select.cc:257
#15 0x081f548f in mysql_execute_command (thd=0x8717398) at sql_parse.cc:2789
#16 0x081fdaeb in mysql_parse (thd=0x8717398, inBuf=0x874fca0 "SELECT count(*) FROM t1,t2 WHERE t2.b IN(SELECT DISTINCT t2.b FROM t2 WHERE t2.b=t1.a)", length=86, found_semicolon=0xb73d01b0) at sql_parse.cc:6266
[29 Jan 2009 17:08] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/64545

2728 Gleb Shchepa	2009-01-29
      Bug #42037: Queries containing a subquery with DISTINCT and
                  ORDER BY could cause a server crash
      
      Dependent subqueries like
      
        SELECT COUNT(*) FROM t1, t2 WHERE t2.b
         IN (SELECT DISTINCT t2.b FROM t2 WHERE t2.b = t1.a)
      
      hogged memory or even caused server crash on tables with
      significant number of rows (for example, this query failed
      on 1000 of records in every table and 50M of free
      virtual memory).
      
      Subselect engine resets JOIN structure of dependent inner
      query on every row of outer table. In this case restore_tmp()
      reset tmp_join->join_tab_reexec and tmp_join->table_reexec
      methods to NULL pointers and the next call to make_simple_join()
      reallocated them again and again.
      
      The make_simple_join() function has been modified to
      JOIN class method to store join_tab_reexec and
      table_reexec values in the parent join only
      (make_simple_join of tmp_join may access these values
      via 'this' pointer of the parent JOIN).
      
      NOTE: this patch doesn't include standard test case (this is
      "out of memory" bug). See bug #42037 page for test cases.
[30 Jan 2009 11:17] Timour Katchaounov
Review:
The patch looks good, however I have few more suggestions:

1. Instead of treating the new method as being applied on the parent
   JOIN, think of it as applied on the join that is currently passed as
   a parameter. In this way, the only information you need from the parent
   JOIN is the two pointers being allocated in the method. In this way you
   can pass either each of the pointers as a parameter, or change the JOIN
   parameter to "JOIN * parent_join", and pick the two pointers from it.

2. With this inversion of the role of the method, it will be possible to
   directly access all the members being initialized.

3. If you agree with me, change the top comment of the method to:
"Initialize a JOIN as a query execution plan that accesses a single table
via a table scan."

4. If you accept my suggestion above, please change you commit comment
   accordingly.
[30 Jan 2009 15:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/64666

2728 Gleb Shchepa	2009-01-30
      Bug #42037: Queries containing a subquery with DISTINCT and
                  ORDER BY could cause a server crash
      
      Dependent subqueries like
      
        SELECT COUNT(*) FROM t1, t2 WHERE t2.b
         IN (SELECT DISTINCT t2.b FROM t2 WHERE t2.b = t1.a)
      
      hogged memory or even caused server crash on tables with
      significant number of rows (for example, this query failed
      on 1000 of records in every table and 50M of free
      virtual memory).
      
      Subselect engine resets JOIN structure of dependent inner
      query on every row of outer table. In this case restore_tmp()
      reset tmp_join->join_tab_reexec and tmp_join->table_reexec
      methods to NULL pointers and the next call to make_simple_join()
      reallocated them again and again.
      
      The make_simple_join() function has been modified to
      JOIN class method to store join_tab_reexec and
      table_reexec values in the parent join only
      (make_simple_join of tmp_join may access these values
      via 'this' pointer of the parent JOIN).
      
      NOTE: this patch doesn't include standard test case (this is
      "out of memory" bug). See bug #42037 page for test cases.
[2 Feb 2009 8:47] Timour Katchaounov
Patch approved after you fix the following items related to the
patch comments:

* in the patch comment:
Change the two paragraphs: "Dependent subqueries like ... hogged memory ..."
with:
"Dependent subqueries like ... caused a memory leak proportional to the
number of outer rows." This is much shorter.

* In the method comment for "JOIN::make_simple_join()" explain that
  the "parent" parameter is the JOIN that contains the buffers for the
  JOIN_TAB and TABLE objects needed for this join.

* Change the comment for JOIN::table_reexec to explain that make_simple_join
  always creates an execution plan that accesses a single table, thus it is
  sufficient to have a one-element array.
  (Of course try to shorten my explanation).
[2 Feb 2009 16:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/64898

2728 Gleb Shchepa	2009-02-02
      Bug #42037: Queries containing a subquery with DISTINCT and
                  ORDER BY could cause a server crash
      
      Dependent subqueries like
      
        SELECT COUNT(*) FROM t1, t2 WHERE t2.b
         IN (SELECT DISTINCT t2.b FROM t2 WHERE t2.b = t1.a)
      
      caused a memory leak proportional to the
      number of outer rows.
      
      
      The make_simple_join() function has been modified to
      JOIN class method to store join_tab_reexec and
      table_reexec values in the parent join only
      (make_simple_join of tmp_join may access these values
      via 'this' pointer of the parent JOIN).
      
      NOTE: this patch doesn't include standard test case (this is
      "out of memory" bug). See bug #42037 page for test cases.
[6 Feb 2009 9:51] Georgi Kodinov
Updating the target version to 5.0+ : bug is repeatable in 5.0 as well
[6 Feb 2009 13:42] Bugs System
Pushed into 5.0.78 (revid:joro@sun.com-20090206133929-ybpbkkox9wkptopb) (version source revid:gshchepa@mysql.com-20090205093039-ln13nkrg48bu7kc3) (merge vers: 5.0.78) (pib:6)
[9 Feb 2009 22:33] Bugs System
Pushed into 5.1.32 (revid:davi.arnaut@sun.com-20090209214102-gj3sb3ujpnvpiy4c) (version source revid:davi.arnaut@sun.com-20090209214102-gj3sb3ujpnvpiy4c) (merge vers: 5.1.32) (pib:6)
[14 Feb 2009 13:00] Bugs System
Pushed into 6.0.10-alpha (revid:matthias.leich@sun.com-20090212211028-y72faag15q3z3szy) (version source revid:sergey.glukhov@sun.com-20090205111148-9s8861sdpj8c7gqn) (merge vers: 6.0.10-alpha) (pib:6)
[16 Feb 2009 2:12] Paul DuBois
Noted in 5.0.78, 5.1.32, 6.0.10 changelogs.
[17 Feb 2009 14:58] Bugs System
Pushed into 5.1.32-ndb-6.3.23 (revid:tomas.ulin@sun.com-20090217131017-6u8qz1edkjfiobef) (version source revid:tomas.ulin@sun.com-20090216083408-rmvyaxjt6mk8sg1y) (merge vers: 5.1.32-ndb-6.3.23) (pib:6)
[17 Feb 2009 16:46] Bugs System
Pushed into 5.1.32-ndb-6.4.3 (revid:tomas.ulin@sun.com-20090217134419-5ha6xg4dpedrbmau) (version source revid:tomas.ulin@sun.com-20090216083646-m8st11oj1hhfuuh5) (merge vers: 5.1.32-ndb-6.4.3) (pib:6)
[17 Feb 2009 18:22] Bugs System
Pushed into 5.1.32-ndb-6.2.17 (revid:tomas.ulin@sun.com-20090217134216-5699eq74ws4oxa0j) (version source revid:tomas.ulin@sun.com-20090211111208-wf0acl7c1vl5653e) (merge vers: 5.1.32-ndb-6.2.17) (pib:6)
[24 Jun 2009 4:10] James Day
Gleb, Does the "Dependent subqueries [of certain form] caused a memory leak proportional to the number of outer rows" leak persist after the query is finished running and the connection is closed?

I'm trying to find out whether this bug could cause server memory usage to continue to increase over time as more and more queries are run or whether it's just temporary.
[24 Jun 2009 15:06] Gleb Shchepa
[24 Jun 6:10] James Day

> Gleb, Does the "Dependent subqueries [of certain form] caused
> a memory leak proportional to the number of outer rows" leak
> persist after the query is finished running and the connection
> is closed?

James, it was a memory leak at the query memory, i.e. it was freed
after disconnect.
[24 Jun 2009 18:49] James Day
Gleb, thanks. Good news.