| Bug #47123 | Endless 100% CPU loop with STRAIGHT_JOIN | ||
|---|---|---|---|
| Submitted: | 4 Sep 2009 7:13 | Modified: | 12 Mar 2010 17:32 |
| Reporter: | Philip Stoev | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
| Version: | 5.0, 5.1, 5.4 | OS: | Any |
| Assigned to: | Alexey Kopytov | CPU Architecture: | Any |
| Tags: | join_cache_level | ||
[4 Sep 2009 8:14]
Sveta Smirnova
Thank you for the report. Verified as described.
[13 Oct 2009 8:52]
Alexey Kopytov
A shorter testcase: create table t1(a int, key(a)); insert into t1 values (1), (NULL); select * from t1 where (a <> NULL) and (a <> NULL or a <= NULL);
[15 Oct 2009 11:49]
Bugs System
Pushed into 5.1.41 (revid:joro@sun.com-20091015114812-i3gm8km6gfruny5x) (version source revid:alexey.kopytov@sun.com-20091015104251-0sjkte1joim4pfns) (merge vers: 5.1.41) (pib:13)
[15 Oct 2009 23:38]
Paul DuBois
Noted in 5.1.41 changelog. Incorrect handling of predicates involving NULL by the range optimizer could lead to to an infinite loop during query execution. Setting report to NDI pending push into 5.5.x+.
[16 Oct 2009 8:45]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/87079 3175 Georgi Kodinov 2009-10-16 Revert the fix for bug #47123 until test suite failures are resolved.
[16 Oct 2009 20:20]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/87197 3181 Alexey Kopytov 2009-10-17 Bug #47123: Endless 100% CPU loop with STRAIGHT_JOIN The problem was in incorrect handling of predicates involving NULL as a constant value by the range optimizer. For example, when creating a SEL_ARG node from a condition of the form "field < const" (which would normally result in the "NULL < field < const" SEL_ARG), the special case when "const" is NULL was not taken into account, so "NULL < field < NULL" was produced for the "field < NULL" condition. As a result, SEL_ARG structures of this form could not be further optimized which in turn could lead to incorrectly constructed SEL_ARG trees. In particular, code assuming SEL_ARG structures to always form a sequence of ordered disjoint intervals could enter an infinite loop under some circumstances. Fixed by changing get_mm_leaf() so that for any sargable predicate except "<=>" involving NULL as a constant, "empty" SEL_ARG is returned, since such a predicate is always false. @ mysql-test/r/partition_pruning.result Fixed a broken test case. @ mysql-test/r/range.result Added a test case for bug #47123. @ mysql-test/r/subselect.result Fixed a broken test cases. @ mysql-test/t/range.test Added a test case for bug #47123. @ sql/opt_range.cc Fixed get_mm_leaf() so that for any sargable predicate except "<=>" involving NULL as a constant, "empty" SEL_ARG is returned, since such a predicate is always false.
[22 Oct 2009 6:32]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091022063126-l0qzirh9xyhp0bpc) (version source revid:alik@sun.com-20091019135554-s1pvptt6i750lfhv) (merge vers: 6.0.14-alpha) (pib:13)
[22 Oct 2009 7:05]
Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091022060553-znkmxm0g0gm6ckvw) (version source revid:alik@sun.com-20091019132831-5j6xs19ld16m7r8j) (merge vers: 5.5.0-beta) (pib:13)
[22 Oct 2009 19:18]
Paul DuBois
Noted in 5.5.0, 6.0.14 changelogs.
[4 Nov 2009 9:24]
Bugs System
Pushed into 5.1.41 (revid:joro@sun.com-20091104092152-qz96bzlf2o1japwc) (version source revid:alexey.kopytov@sun.com-20091016201951-fsht0wm8xn8vkzsx) (merge vers: 5.1.41) (pib:13)
[11 Nov 2009 6:48]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091110093407-rw5g8dys2baqkt67) (version source revid:alik@sun.com-20091109080109-7dxapd5y5pxlu08w) (merge vers: 6.0.14-alpha) (pib:13)
[11 Nov 2009 6:56]
Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091109115615-nuohp02h8mdrz8m2) (version source revid:alik@sun.com-20091105090203-cls5j6k3ohu04xpt) (merge vers: 5.5.0-beta) (pib:13)
[1 Dec 2009 5:27]
Igor Babaev
Alexey, After your patch, for your test case, we have: mysql> EXPLAIN SELECT * FROM t1 WHERE a <> NULL; +----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+ | 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE noticed after reading const tables | +----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+ 1 row in set (10.68 sec) Yet, if we drop the index we get: mysql > drop index a on t1; Query OK, 4 rows affected (0.03 sec) Records: 4 Duplicates: 0 Warnings: 0 (I've added two extra records to t1 with values '2' and '3'). mysql> EXPLAIN SELECT * FROM t1 WHERE a <> NULL; +----+-------------+-------+------+---------------+------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+-------------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 4 | Using where | +----+-------------+-------+------+---------------+------+---------+------+------+-------------+ This seems to me to be counter-intuitive. I would expect to get "Impossible where" in this case too. I think it happens because the null-rejecting predicates returning always UNKNOWN must be caught much earlier, somewhere in optimize_cond(). In this case for the following query: mysql> EXPLAIN SELECT * FROM t1, t1 t2 WHERE t1.a <> NULL or t2.a <> NULL; +----+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------+ | 1 | SIMPLE | t1 | index | a | a | 5 | NULL | 4 | Using index | | 1 | SIMPLE | t2 | index | a | a | 5 | NULL | 4 | Using where; Using index; Using join buffer | +----+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------+ 2 rows in set (9.85 sec) you also would get "Impossible where".
[3 Dec 2009 13:19]
Alexey Kopytov
Igor, I agree that the inconsistency with indexed vs. non-indexed column must be fixed. However, I think this should be done in addition to the patch for this bug, not instead of it. If we catch null-rejecting predicates only in optimize_cond(), then the following query will still trigger the bug (i.e. the endless loop in the range optimizer): SELECT * FROM t1, t1 t2 WHERE (t2.a <> t1.a) AND (t2.a <> t1.a or t2.a <= t1.a); So I think we still must catch such predicates in get_mm_leaf() as done in my patch, and do the same in optimize_cond() to be consistent in the non-indexed column case. I have a draft version of the patch for optimize_cond() ready. Please let me know if you agree, I will then apply it as a fix for a separate bug.
[7 Dec 2009 12:55]
Alexey Kopytov
Reported the inconsistency with indexed vs. non-indexed columns separately, bug #49504.
[7 Dec 2009 16:41]
Paul DuBois
Noted in 5.1.40sp1 changelog.
[8 Dec 2009 9:30]
Bugs System
Pushed into 5.1.43 (revid:build@mysql.com-20091208092611-pbno5awyb0v38hs7) (version source revid:build@mysql.com-20091208092611-pbno5awyb0v38hs7) (merge vers: 5.1.43) (pib:13)
[16 Dec 2009 8:34]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091216083311-xorsasf5kopjxshf) (version source revid:alik@sun.com-20091215065750-5m04ogppd5l0pol5) (merge vers: 6.0.14-alpha) (pib:14)
[16 Dec 2009 8:42]
Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091216082430-s0gtzibcgkv4pqul) (version source revid:alik@sun.com-20091211070127-kl8uvlrv9cr11kva) (merge vers: 5.5.0-beta) (pib:14)
[16 Dec 2009 8:48]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20091216083231-rp8ecpnvkkbhtb27) (version source revid:alik@sun.com-20091212203859-fx4rx5uab47wwuzd) (merge vers: 5.6.0-beta) (pib:14)
[18 Dec 2009 10:26]
Bugs System
Pushed into 5.1.41-ndb-7.1.0 (revid:jonas@mysql.com-20091218102229-64tk47xonu3dv6r6) (version source revid:jonas@mysql.com-20091218095730-26gwjidfsdw45dto) (merge vers: 5.1.41-ndb-7.1.0) (pib:15)
[18 Dec 2009 10:42]
Bugs System
Pushed into 5.1.41-ndb-6.2.19 (revid:jonas@mysql.com-20091218100224-vtzr0fahhsuhjsmt) (version source revid:jonas@mysql.com-20091217101452-qwzyaig50w74xmye) (merge vers: 5.1.41-ndb-6.2.19) (pib:15)
[18 Dec 2009 10:57]
Bugs System
Pushed into 5.1.41-ndb-6.3.31 (revid:jonas@mysql.com-20091218100616-75d9tek96o6ob6k0) (version source revid:jonas@mysql.com-20091217154335-290no45qdins5bwo) (merge vers: 5.1.41-ndb-6.3.31) (pib:15)
[18 Dec 2009 11:12]
Bugs System
Pushed into 5.1.41-ndb-7.0.11 (revid:jonas@mysql.com-20091218101303-ga32mrnr15jsa606) (version source revid:jonas@mysql.com-20091218064304-ezreonykd9f4kelk) (merge vers: 5.1.41-ndb-7.0.11) (pib:15)
[12 Mar 2010 14:06]
Bugs System
Pushed into 5.1.44-ndb-7.0.14 (revid:jonas@mysql.com-20100312135944-t0z8s1da2orvl66x) (version source revid:jonas@mysql.com-20100312115609-woou0te4a6s4ae9y) (merge vers: 5.1.44-ndb-7.0.14) (pib:16)
[12 Mar 2010 14:22]
Bugs System
Pushed into 5.1.44-ndb-6.2.19 (revid:jonas@mysql.com-20100312134846-tuqhd9w3tv4xgl3d) (version source revid:jonas@mysql.com-20100312060623-mx6407w2vx76h3by) (merge vers: 5.1.44-ndb-6.2.19) (pib:16)
[12 Mar 2010 14:36]
Bugs System
Pushed into 5.1.44-ndb-6.3.33 (revid:jonas@mysql.com-20100312135724-xcw8vw2lu3mijrhn) (version source revid:jonas@mysql.com-20100312103652-snkltsd197l7q2yg) (merge vers: 5.1.44-ndb-6.3.33) (pib:16)

Description: Under default settings, the following query: SELECT table1 . `pk` AS field1 , table1 . `pk` AS field2 , table2 . `int_key` AS field3 FROM ( CC AS table1 STRAIGHT_JOIN ( ( C AS table2 STRAIGHT_JOIN A AS table3 ON (( table3 . `varchar_key` <> table2 . `varchar_key` ) OR ( table3 . `varchar_key` <= table2 . `varchar_key` ) ) ) ) ON (table3 . `varchar_key` <> table2 . `varchar_nokey` ) ); Causes a 100% CPU loop with the following backtrace: #0 0x00000000007ae8d5 in test_rb_tree (element=0x2249490, parent=0x2249550) at opt_range.cc:7132 #1 0x00000000007ae8c8 in test_rb_tree (element=0x2249550, parent=0x22493d0) at opt_range.cc:7131 #2 0x00000000007ae8b4 in test_rb_tree (element=0x22493d0, parent=0x22490d0) at opt_range.cc:7130 #3 0x00000000007ae8b4 in test_rb_tree (element=0x22490d0, parent=0x22496d0) at opt_range.cc:7130 #4 0x00000000007ae8c8 in test_rb_tree (element=0x22496d0, parent=0x224a2d0) at opt_range.cc:7131 #5 0x00000000007ae8c8 in test_rb_tree (element=0x224a2d0, parent=0x224bad0) at opt_range.cc:7131 #6 0x00000000007ae8c8 in test_rb_tree (element=0x224bad0, parent=0x2248ad0) at opt_range.cc:7131 #7 0x00000000007ae8b4 in test_rb_tree (element=0x2248ad0, parent=0x2242a80) at opt_range.cc:7130 #8 0x00000000007ae8b4 in test_rb_tree (element=0x2242a80, parent=0x2236a80) at opt_range.cc:7130 #9 0x00000000007ae8b4 in test_rb_tree (element=0x2236a80, parent=0x224ead0) at opt_range.cc:7130 #10 0x00000000007ae8c8 in test_rb_tree (element=0x224ead0, parent=0x227ebc0) at opt_range.cc:7131 #11 0x00000000007ae8c8 in test_rb_tree (element=0x227ebc0, parent=0x221e9e0) at opt_range.cc:7131 #12 0x00000000007ae8b4 in test_rb_tree (element=0x221e9e0, parent=0x215e580) at opt_range.cc:7130 #13 0x00000000007ae8b4 in test_rb_tree (element=0x215e580, parent=0x209de70) at opt_range.cc:7130 #14 0x00000000007ae8b4 in test_rb_tree (element=0x209de70, parent=0x0) at opt_range.cc:7130 #15 0x00000000007aeae9 in SEL_ARG::rb_insert (this=0x209de70, leaf=0x23528a0) at opt_range.cc:7020 #16 0x00000000007aec58 in SEL_ARG::insert (this=0x209de70, key=0x23528a0) at opt_range.cc:6815 #17 0x00000000007b0dbc in key_and (param=0x7f9b78015bb0, key1=0x1fc1cc8, key2=0x1fc24d8, clone_flag=3) at opt_range.cc:6460 #18 0x00000000007b1067 in tree_and (param=0x7f9b78015bb0, tree1=0x1fc1a78, tree2=0x1fc1fd8) at opt_range.cc:6071 #19 0x00000000007b32c2 in get_mm_tree (param=0x7f9b78015bb0, cond=0x1f6bee0) at opt_range.cc:5431 #20 0x00000000007b8cd4 in SQL_SELECT::test_quick_select (this=0x1f6c0b8, thd=0x7f9b7cc67178, keys_to_use={map = 32}, prev_tables=0, limit=18446744073709551615, force_quick_range=false, ordered_output=false) at opt_range.cc:2407 #21 0x00000000006f8498 in test_if_quick_select (tab=0x1f6b990) at sql_select.cc:17047 #22 0x00000000006f8844 in join_init_quick_read_record (tab=0x1f6b990) at sql_select.cc:17027 #23 0x00000000006fc021 in sub_select (join=0x1f653c0, join_tab=0x1f6b990, end_of_records=false) at sql_select.cc:16257 #24 0x00000000006c832c in JOIN_CACHE::generate_full_extensions (this=0x1f6c470, rec_ptr=0x1f8daf8 "\fЭ\n") at sql_join_cache.cc:1908 #25 0x00000000006c8d1a in JOIN_CACHE_BNL::join_matching_records (this=0x1f6c470, skip_last=false) at sql_join_cache.cc:1801 #26 0x00000000006c70e7 in JOIN_CACHE::join_records (this=0x1f6c470, skip_last=false) at sql_join_cache.cc:1617 #27 0x00000000006fc28f in sub_select_cache (join=0x1f653c0, join_tab=0x1f6b6f0, end_of_records=true) at sql_select.cc:16060 #28 0x00000000006fbf1a in sub_select (join=0x1f653c0, join_tab=0x1f6b450, end_of_records=true) at sql_select.cc:16218 #29 0x0000000000709b4c in do_select (join=0x1f653c0, fields=0x0, table=0x1e4f250, procedure=0x0) at sql_select.cc:15823 #30 0x0000000000722593 in JOIN::exec (this=0x1f653c0) at sql_select.cc:2478 #31 0x000000000071e92a in mysql_select (thd=0x7f9b7cc67178, rref_pointer_array=0x7f9b7cc69358, tables=0x1ef1600, wild_num=0, fields=@0x7f9b7cc69278, conds=0x0, og_num=4, order=0x1f632b0, group=0x0, having=0x0, proc_param=0x0, select_options=2147764736, result=0x1f63e78, unit=0x7f9b7cc68ae0, select_lex=0x7f9b7cc69170) at sql_select.cc:3091 #32 0x00000000007241c1 in handle_select (thd=0x7f9b7cc67178, lex=0x7f9b7cc68a40, result=0x1f63e78, setup_tables_done_option=0) at sql_select.cc:306 #33 0x000000000067d4e4 in execute_sqlcom_select (thd=0x7f9b7cc67178, all_tables=0x1ef1600) at sql_parse.cc:4935 #34 0x000000000067ed43 in mysql_execute_command (thd=0x7f9b7cc67178) at sql_parse.cc:2112 #35 0x000000000068756a in mysql_parse (thd=0x7f9b7cc67178, inBuf=0x1ef08a0 "SELECT table1 . `pk` AS field1 , table1 . `pk` AS field2 , table2 . `int_key` AS field3 FROM ( CC AS table1 STRAIGHT_JOIN ( ( C AS table2 STRAIGHT_JOIN A AS table3 ON (( table3 . `varchar_key` <> ta"..., length=402, found_semicolon=0x7f9b7801af00) at sql_parse.cc:5950 #36 0x0000000000688185 in dispatch_command (command=COM_QUERY, thd=0x7f9b7cc67178, packet=0x7f9b7cca28c9 "SELECT table1 . `pk` AS field1 , table1 . `pk` AS field2 , table2 . `int_key` AS field3 FROM ( CC AS table1 STRAIGHT_JOIN ( ( C AS table2 STRAIGHT_JOIN A AS table3 ON (( table3 . `varchar_key` <> ta"..., packet_length=402) at sql_parse.cc:1062 #37 0x0000000000689651 in do_command (thd=0x7f9b7cc67178) at sql_parse.cc:744 ---Type <return> to continue, or q <return> to quit--- #38 0x000000000067688b in handle_one_connection (arg=0x7f9b7cc67178) at sql_connect.cc:1163 #39 0x000000315b0073da in start_thread () from /lib64/libpthread.so.0 #40 0x000000315a4e627d in clone () from /lib64/libc.so.6 The loop is in the following code section: 7132 if (count_l >= 0 && count_r >= 0) (gdb) 7134 if (count_l == count_r) (gdb) 7135 return count_l+(element->color == SEL_ARG::BLACK); The query could not be evaluated within 2 hours. Under 5.1, the same issue is observed, with a slightly different stack trace (no join cache). The EXPLAIN looks like this: +----+-------------+--------+-------+---------------+---------+---------+------+------+-------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+--------+-------+---------------+---------+---------+------+------+-------------------------------------------------+ | 1 | SIMPLE | table1 | index | NULL | PRIMARY | 4 | NULL | 20 | Using index | | 1 | SIMPLE | table2 | ALL | C_varchar_key | NULL | NULL | NULL | 21 | Using join buffer | | 1 | SIMPLE | table3 | ALL | A_varchar_key | NULL | NULL | NULL | 0 | Range checked for each record (index map: 0x20) | +----+-------------+--------+-------+---------------+---------+---------+------+------+-------------------------------------------------+ Replacing both STRAIGHT_JOINs with JOIN causes the query to return an empty result immediately and EXPLAIN shows "impossible where". Removing the ON conditions also causes swift evaluation. Unfortunately, due to the nature of this issue, automatic query simplification could not be applied. How to repeat: CREATE TABLE `A` ( `pk` int(11) NOT NULL AUTO_INCREMENT, `int_key` int(11) DEFAULT NULL, `varchar_key` varchar(1) DEFAULT NULL, `varchar_nokey` varchar(1) DEFAULT NULL, PRIMARY KEY (`pk`), KEY `A_int_key` (`int_key`), KEY `A_varchar_key` (`varchar_key`,`int_key`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1; CREATE TABLE `CC` ( `pk` int(11) NOT NULL AUTO_INCREMENT, `int_key` int(11) DEFAULT NULL, `varchar_key` varchar(1) DEFAULT NULL, `varchar_nokey` varchar(1) DEFAULT NULL, PRIMARY KEY (`pk`), KEY `CC_int_key` (`int_key`), KEY `CC_varchar_key` (`varchar_key`,`int_key`) ) ENGINE=MyISAM AUTO_INCREMENT=30 DEFAULT CHARSET=latin1; INSERT INTO `CC` VALUES (10,8,'v','v'),(11,9,'r','r'),(12,9,'a','a'),(13,186,'m','m'),(14,NULL,'y','y'),(15,2,'j','j'),(16,3,'d','d'),(17,0,'z','z'),(18,133,'e','e'),(19,1,'h','h'),(20,8,'b','b'),(21,5,'s','s'),(22,5,'e','e'),(23,8,'j','j'),(24,6,'e','e'),(25,51,'f','f'),(26,4,'v','v'),(27,7,'x','x'),(28,6,'m','m'),(29,4,'c','c'); CREATE TABLE `C` ( `pk` int(11) NOT NULL AUTO_INCREMENT, `int_key` int(11) DEFAULT NULL, `varchar_key` varchar(1) DEFAULT NULL, `varchar_nokey` varchar(1) DEFAULT NULL, PRIMARY KEY (`pk`), KEY `C_int_key` (`int_key`), KEY `C_varchar_key` (`varchar_key`,`int_key`) ) ENGINE=MyISAM AUTO_INCREMENT=22 DEFAULT CHARSET=latin1; INSERT INTO `C` VALUES (1,2,'w','w'),(2,9,'m','m'),(3,3,'m','m'),(4,9,'k','k'),(5,NULL,'r','r'),(6,9,'t','t'),(7,3,'j','j'),(8,8,'u','u'),(9,8,'h','h'),(10,53,'o','o'),(11,0,NULL,NULL),(12,5,'k','k'),(13,166,'e','e'),(14,3,'n','n'),(15,0,'t','t'),(16,1,'c','c'),(17,9,'m','m'),(18,5,'y','y'),(19,6,'f','f'),(20,2,'d','d'),(21,NULL,NULL,NULL); SELECT table1 . `pk` AS field1 , table1 . `pk` AS field2 , table2 . `int_key` AS field3 FROM ( CC AS table1 STRAIGHT_JOIN ( ( C AS table2 STRAIGHT_JOIN A AS table3 ON (( table3 . `varchar_key` <> table2 . `varchar_key` ) OR ( table3 . `varchar_key` <= table2 . `varchar_key` ) ) ) ) ON (table3 . `varchar_key` <> table2 . `varchar_nokey` ) );