| Bug #52540 | Crash in JOIN_CACHE::set_match_flag_if_none () at sql_join_cache.cc:1883 | ||
|---|---|---|---|
| Submitted: | 1 Apr 2010 20:27 | Modified: | 21 Nov 2010 19:16 |
| Reporter: | Patrick Crews | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
| Version: | mysql-6.0-codebase-bugfixing | OS: | Any |
| Assigned to: | Guilhem Bichot | CPU Architecture: | Any |
| Tags: | assertion, crashing bug, optimizer_join_cache_level, optimizer_switch | ||
[1 Apr 2010 20:30]
Patrick Crews
Full crash output
Attachment: bug52540_crash_output.txt (text/plain), 12.34 KiB.
[1 Apr 2010 20:30]
Patrick Crews
Full MTR test case with original + simplified queries (simplified=query0)
Attachment: bug52540_test.txt (text/plain), 20.13 KiB.
[1 Apr 2010 22:12]
Sveta Smirnova
Thank you for the report. Verified as described.
[3 May 2010 8:19]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/107162 3846 Guilhem Bichot 2010-05-03 Fix for BUG#52540 "Crash in JOIN_CACHE::set_match_flag_if_none () at sql_join_cache.cc:1883" @ mysql-test/t/join_cache.test minimal test case for bug @ sql/sql_select.cc As said in a comment in this file: "For a nested outer join/semi-join, currently, we either use join buffers for all inner tables or for none of them.". This is enforced by check_join_cache_usage(): for (JOIN_TAB *first_inner= tab->first_inner; first_inner; first_inner= first_inner->first_upper) { if (first_inner != tab && !first_inner->use_join_cache) goto no_join_cache; } i.e. if the first table in this join nest doesn't use join cache, or the first table of the upper containing join nest doesn't, go to label "no_join_cache" and thus: 1) don't use join cache for the current table 2) disable join cache for all previous tables of this nest and upper containing nests (via revise_cache_usage()). In the bug's scenario, before the fix, EXPLAIN showed this plan: t2, t1(with join buffering), t3, t4(with join buffering). * join buffering was decided for t1 * it was rejected for t3, because t3 is accessed with EQ_REF (join buffering with EQ_REF requires BKA, which we don't have at levels <=4): check_join_cache_usage() did "return 0", forgetting to disable join buffering for previous tables of the join nest i.e. t1 * join buffering was decided for t4, and allowed as it was observed that the first table of the upper nest (t1) had join buffering (so it was believed that all tables in between had join buffering). * at execution time, JOIN_CACHE:set_match_flag_if_none() would walk back the list of join buffers from t4 to t1: while (cache->join_tab != first_inner) { cache= cache->prev_cache; DBUG_ASSERT(cache); But t4 had prev_cache==0 (as previous table t1 wasn't doing join buffering), hence the assertion failure. This loop above assumed again that all previous tables should be doing join buffering. The fix is to make sure that when join buffering is rejected because of EQ_REF, we disable join buffering for previous tables, so that assumptions are true.
[10 May 2010 11:53]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/107834 3873 Guilhem Bichot 2010-05-10 Fix for BUG#52540 "Crash in JOIN_CACHE::set_match_flag_if_none () at sql_join_cache.cc:1883" @ mysql-test/t/join_cache.test minimal test case for bug @ sql/sql_select.cc As said in a comment in this file: "For a nested outer join/semi-join, currently, we either use join buffers for all inner tables or for none of them.". This is enforced by check_join_cache_usage(): for (JOIN_TAB *first_inner= tab->first_inner; first_inner; first_inner= first_inner->first_upper) { if (first_inner != tab && !first_inner->use_join_cache) goto no_join_cache; } i.e. if the first table in this join nest doesn't use join cache, or the first table of the upper containing join nest doesn't, go to label "no_join_cache" and thus: 1) don't use join cache for the current table 2) disable join cache for all previous tables of this nest and upper containing nests (via revise_cache_usage()). In the bug's scenario, before the fix, EXPLAIN showed this plan: t2, t1(with join buffering), t3, t4(with join buffering). * join buffering was decided for t1 * it was rejected for t3, because t3 is accessed with EQ_REF (join buffering with EQ_REF requires BKA, which we don't have at levels <=4): check_join_cache_usage() did "return 0", forgetting to disable join buffering for previous tables of the join nest i.e. t1 * join buffering was decided for t4, and allowed as it was observed that the first table of the upper nest (t1) had join buffering (so it was believed that all tables in between had join buffering). * at execution time, JOIN_CACHE:set_match_flag_if_none() would walk back the list of join buffers from t4 to t1: while (cache->join_tab != first_inner) { cache= cache->prev_cache; DBUG_ASSERT(cache); But t4 had prev_cache==0 (as previous table t1 wasn't doing join buffering), hence the assertion failure. This loop above assumed again that all previous tables should be doing join buffering. The fix is to make sure that when join buffering is rejected because of EQ_REF, we disable join buffering for previous tables, so that assumptions are true.
[10 May 2010 11:53]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/107835 3846 Guilhem Bichot 2010-05-10 Fix for BUG#52540 "Crash in JOIN_CACHE::set_match_flag_if_none () at sql_join_cache.cc:1883" @ mysql-test/t/join_cache.test minimal test case for bug @ sql/sql_select.cc As said in a comment in this file: "For a nested outer join/semi-join, currently, we either use join buffers for all inner tables or for none of them.". This is enforced by check_join_cache_usage(): for (JOIN_TAB *first_inner= tab->first_inner; first_inner; first_inner= first_inner->first_upper) { if (first_inner != tab && !first_inner->use_join_cache) goto no_join_cache; } i.e. if the first table in this join nest doesn't use join cache, or the first table of the upper containing join nest doesn't, go to label "no_join_cache" and thus: 1) don't use join cache for the current table 2) disable join cache for all previous tables of this nest and upper containing nests (via revise_cache_usage()). In the bug's scenario, before the fix, EXPLAIN showed this plan: t2, t1(with join buffering), t3, t4(with join buffering). * join buffering was decided for t1 * it was rejected for t3, because t3 is accessed with EQ_REF (join buffering with EQ_REF requires BKA, which we don't have at levels <=4): check_join_cache_usage() did "return 0", forgetting to disable join buffering for previous tables of the join nest i.e. t1 * join buffering was decided for t4, and allowed as it was observed that the first table of the upper nest (t1) had join buffering (so it was believed that all tables in between had join buffering). * at execution time, JOIN_CACHE:set_match_flag_if_none() would walk back the list of join buffers from t4 to t1: while (cache->join_tab != first_inner) { cache= cache->prev_cache; DBUG_ASSERT(cache); But t4 had prev_cache==0 (as previous table t1 wasn't doing join buffering), hence the assertion failure. This loop above assumed again that all previous tables should be doing join buffering. The fix is to make sure that when join buffering is rejected because of EQ_REF, we disable join buffering for previous tables, so that assumptions are true.
[11 May 2010 8:25]
Guilhem Bichot
queued to 6.0-codebase-bugfixing
[14 May 2010 15:58]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/108366 3160 Guilhem Bichot 2010-05-10 Fix for BUG#52540 "Crash in JOIN_CACHE::set_match_flag_if_none () at sql_join_cache.cc:1883" (Backporting of guilhem@mysql.com-20100510092844-9fjd9g9ivk7258oj ) @ mysql-test/t/join_cache.test minimal test case for bug @ sql/sql_select.cc As said in a comment in this file: "For a nested outer join/semi-join, currently, we either use join buffers for all inner tables or for none of them.". This is enforced by check_join_cache_usage(): for (JOIN_TAB *first_inner= tab->first_inner; first_inner; first_inner= first_inner->first_upper) { if (first_inner != tab && !first_inner->use_join_cache) goto no_join_cache; } i.e. if the first table in this join nest doesn't use join cache, or the first table of the upper containing join nest doesn't, go to label "no_join_cache" and thus: 1) don't use join cache for the current table 2) disable join cache for all previous tables of this nest and upper containing nests (via revise_cache_usage()). In the bug's scenario, before the fix, EXPLAIN showed this plan: t2, t1(with join buffering), t3, t4(with join buffering). * join buffering was decided for t1 * it was rejected for t3, because t3 is accessed with EQ_REF (join buffering with EQ_REF requires BKA, which we don't have at levels <=4): check_join_cache_usage() did "return 0", forgetting to disable join buffering for previous tables of the join nest i.e. t1 * join buffering was decided for t4, and allowed as it was observed that the first table of the upper nest (t1) had join buffering (so it was believed that all tables in between had join buffering). * at execution time, JOIN_CACHE:set_match_flag_if_none() would walk back the list of join buffers from t4 to t1: while (cache->join_tab != first_inner) { cache= cache->prev_cache; DBUG_ASSERT(cache); But t4 had prev_cache==0 (as previous table t1 wasn't doing join buffering), hence the assertion failure. This loop above assumed again that all previous tables should be doing join buffering. The fix is to make sure that when join buffering is rejected because of EQ_REF, we disable join buffering for previous tables, so that assumptions are true.
[14 May 2010 18:17]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/108376 3161 Guilhem Bichot 2010-05-10 Fix for BUG#52540 "Crash in JOIN_CACHE::set_match_flag_if_none () at sql_join_cache.cc:1883" (Backporting of guilhem@mysql.com-20100510092844-9fjd9g9ivk7258oj ) @ mysql-test/t/join_cache.test minimal test case for bug @ sql/sql_select.cc As said in a comment in this file: "For a nested outer join/semi-join, currently, we either use join buffers for all inner tables or for none of them.". This is enforced by check_join_cache_usage(): for (JOIN_TAB *first_inner= tab->first_inner; first_inner; first_inner= first_inner->first_upper) { if (first_inner != tab && !first_inner->use_join_cache) goto no_join_cache; } i.e. if the first table in this join nest doesn't use join cache, or the first table of the upper containing join nest doesn't, go to label "no_join_cache" and thus: 1) don't use join cache for the current table 2) disable join cache for all previous tables of this nest and upper containing nests (via revise_cache_usage()). In the bug's scenario, before the fix, EXPLAIN showed this plan: t2, t1(with join buffering), t3, t4(with join buffering). * join buffering was decided for t1 * it was rejected for t3, because t3 is accessed with EQ_REF (join buffering with EQ_REF requires BKA, which we don't have at levels <=4): check_join_cache_usage() did "return 0", forgetting to disable join buffering for previous tables of the join nest i.e. t1 * join buffering was decided for t4, and allowed as it was observed that the first table of the upper nest (t1) had join buffering (so it was believed that all tables in between had join buffering). * at execution time, JOIN_CACHE:set_match_flag_if_none() would walk back the list of join buffers from t4 to t1: while (cache->join_tab != first_inner) { cache= cache->prev_cache; DBUG_ASSERT(cache); But t4 had prev_cache==0 (as previous table t1 wasn't doing join buffering), hence the assertion failure. This loop above assumed again that all previous tables should be doing join buffering. The fix is to make sure that when join buffering is rejected because of EQ_REF, we disable join buffering for previous tables, so that assumptions are true.
[20 May 2010 10:04]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100520100225-oe4iuu5kuzsx0knq) (version source revid:alik@sun.com-20100520100057-rmn5y3o3ij726bm7) (merge vers: 6.0.14-alpha) (pib:16)
[20 May 2010 20:58]
Guilhem Bichot
"for values of optimizer_join_cache_level smaller or equal to 4, MySQL could crash in queries with outer joins and indexed columns".
[20 May 2010 22:46]
Paul DuBois
Noted in 6.0.14 changelog.
[16 Aug 2010 6:33]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:17]
Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[21 Nov 2010 19:16]
Paul DuBois
Noted in 5.6.1 changelog.
[23 Nov 2010 2:16]
Paul DuBois
Correction: No 5.6.1 changelog entry. Bug does not appear in any released 5.6.x version.

Description: Crashing bug in 6.0-codebase-bugfixing tree with optimizer_join_cache_level=4 (not repeatable with other values): Query: Using a mix of Innodb and MyISAM tables SELECT table2 .`col_varchar_1024_utf8_key` FROM H table2 LEFT JOIN F table3 JOIN J table4 ON table3 .`col_int_key` = table4 .`pk` LEFT JOIN O table5 ON table5 .`col_varchar_10_latin1_key` ON table2 .`col_varchar_10_latin1_key` > table3 .`col_varchar_10_latin1_key` ; Crash output: Thread 1 (Thread 19830): #0 0x002fa422 in __kernel_vsyscall () #1 0x004ace93 in __pthread_kill (threadid=2929657712, signo=6) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:64 #2 0x08d1f00f in my_write_core (sig=6) at stacktrace.c:328 #3 0x08422222 in handle_segfault (sig=6) at mysqld.cc:2843 #4 <signal handler called> #5 0x002fa422 in __kernel_vsyscall () #6 0x0013a4d1 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #7 0x0013d932 in *__GI_abort () at abort.c:92 #8 0x00133648 in *__GI___assert_fail (assertion=0x9052310 "cache", file=0x9052270 "sql_join_cache.cc", line=1883, function=0x9052760 "bool JOIN_CACHE::set_match_flag_if_none(JOIN_TAB*, uchar*)") at assert.c:81 #9 0x084c8b8f in JOIN_CACHE::set_match_flag_if_none (this=0xab89228, first_inner=0xab831d4, rec_ptr=0xae97e022 "") at sql_join_cache.cc:1883 #10 0x084c9314 in JOIN_CACHE::join_null_complements (this=0xab89228, skip_last=false) at sql_join_cache.cc:2079 #11 0x084c7fc6 in JOIN_CACHE::join_records (this=0xab89228, skip_last=false) at sql_join_cache.cc:1661 #12 0x0858303f in sub_select_cache (join=0xab83990, join_tab=0xab8354c, end_of_records=true) at sql_select.cc:16405 #13 0x085834ad in sub_select (join=0xab83990, join_tab=0xab83390, end_of_records=true) at sql_select.cc:16568 #14 0x085834ad in sub_select (join=0xab83990, join_tab=0xab831d4, end_of_records=true) at sql_select.cc:16568 #15 0x085830c3 in sub_select_cache (join=0xab83990, join_tab=0xab831d4, end_of_records=true) at sql_select.cc:16407 #16 0x085834ad in sub_select (join=0xab83990, join_tab=0xab83018, end_of_records=true) at sql_select.cc:16568 #17 0x08582127 in do_select (join=0xab83990, fields=0xa9d0900, table=0x0, procedure=0x0) at sql_select.cc:16159 #18 0x085483c7 in JOIN::exec (this=0xab83990) at sql_select.cc:2987 #19 0x085490d1 in mysql_select (thd=0xa9cf360, rref_pointer_array=0xa9d0970, tables=0xa9018f8, wild_num=0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=0xa903500, unit=0xa9d03ac, select_lex=0xa9d086c) at sql_select.cc:3184 #20 0x0853944d in handle_select (thd=0xa9cf360, lex=0xa9d0350, result=0xa903500, setup_tables_done_option=0) at sql_select.cc:304 #21 0x08459ca5 in execute_sqlcom_select (thd=0xa9cf360, all_tables=0xa9018f8) at sql_parse.cc:5032 #22 0x08448b0c in mysql_execute_command (thd=0xa9cf360) at sql_parse.cc:2295 #23 0x0845dd5a in mysql_parse (thd=0xa9cf360, inBuf=0xa901538 "SELECT table2 .`col_varchar_1024_utf8_key` \nFROM H table2 LEFT JOIN F table3 JOIN J table4 ON table3 .`col_int_key` = table4 .`pk` LEFT JOIN O table5 ON table5 .`col_varchar_10_latin1_key` O"..., length=276, found_semicolon=0xae9ef910) at sql_parse.cc:6060 #24 0x084439ad in dispatch_command (command=COM_QUERY, thd=0xa9cf360, packet=0xa99d841 "", packet_length=279) at sql_parse.cc:1091 #25 0x084421c7 in do_command (thd=0xa9cf360) at sql_parse.cc:775 #26 0x0843e551 in do_handle_one_connection (thd_arg=0xa9cf360) at sql_connect.cc:1173 #27 0x0843e238 in handle_one_connection (arg=0xa9cf360) at sql_connect.cc:1113 #28 0x08e7a303 in pfs_spawn_thread (arg=0xaa1a6a8) at pfs.cc:1011 #29 0x004a780e in start_thread (arg=0xae9f0770) at pthread_create.c:300 #30 0x001dc8de in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130 How to repeat: MTR test case: Change optimizer_join_cache_level to other values and observe the crash going away. Test case attached as separate file due to space limitations, will include original + simplified queries