Bug #54608 | Query using IN + OR + IS TRUE and IS NULL returns NULL when should be empty | ||
---|---|---|---|
Submitted: | 18 Jun 2010 9:28 | Modified: | 23 Nov 2010 4:02 |
Reporter: | John Embretsen | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | mysql-next-mr-opt-backporting, 6.0.14 | OS: | Any |
Assigned to: | Guilhem Bichot | CPU Architecture: | Any |
Tags: | regression |
[18 Jun 2010 9:28]
John Embretsen
[18 Jun 2010 9:45]
Valeriy Kravchuk
Verified just as described on Mac OS X: valeriy-kravchuks-macbook-pro:6.0-codebase openxs$ bin/mysql -uroot testReading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 6.0.14-alpha-debug Source distribution Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved. This software comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to modify and redistribute it under the GPL v2 license Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> DROP TABLE IF EXISTS t1; Query OK, 0 rows affected (0.05 sec) mysql> mysql> CREATE TABLE t1 ( -> a INT, -> KEY key_a (a) -> ); Query OK, 0 rows affected (0.11 sec) mysql> mysql> INSERT INTO t1 VALUES (1), (2), (NULL); Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> mysql> -- this is OK (Empty): mysql> SELECT a FROM t1 WHERE a IS TRUE AND a IS NULL; Empty set (0.02 sec) mysql> mysql> -- this is also OK (Empty): mysql> SELECT a FROM t1 WHERE a IN (42); Empty set (0.01 sec) mysql> mysql> -- this is strange (NULL): mysql> SELECT a FROM t1 WHERE a IN (42) OR (a IS TRUE AND a IS NULL); +------+ | a | +------+ | NULL | +------+ 1 row in set (0.00 sec) mysql> explain SELECT a FROM t1 WHERE a IN (42) OR (a IS TRUE AND a IS NULL); +----+-------------+-------+-------------+---------------+-------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------------+---------------+-------+---------+-------+------+-------------+ | 1 | SIMPLE | t1 | ref_or_null | key_a | key_a | 5 | const | 2 | Using index | +----+-------------+-------+-------------+---------------+-------+---------+-------+------+-------------+ 1 row in set (0.00 sec)
[24 Jun 2010 7:48]
Jørgen Løland
Verified in 6.0-cb as far back as 2008-03
[2 Jul 2010 9:18]
Jørgen Løland
Initial analysis: It seems that the conditions are wrongly removed in make_cond_for_table_from_pred() in the if() following comment "Remove equalities that are guaranteed to be true by use of 'ref' access method." In next-mr-opt-backporting, test_if_ref(root_cond, (Item_field*) left_item,right_item) returns true, while it returns false in next-mr. The reason for this difference is this part of the patch with revision sp1r-sergefp@mysql.com-20070309210824-13110 WL#2474 "Multi Range Read: Change the default MRR implementation to implement new MRR interface" WL#2475 "Batched range read functions for MyISAM/InnoDb" "Index condition pushdown for MyISAM/InnoDB" combined into one cset ------- Reverting offending part of patch -------- === modified file 'sql/sql_select.cc' --- sql/sql_select.cc 2010-06-25 12:30:30 +0000 +++ sql/sql_select.cc 2010-06-29 11:37:11 +0000 @@ -18690,7 +18690,7 @@ part_of_refkey(TABLE *table,Field *field for (part=0 ; part < ref_parts ; part++,key_part++) if (field->eq(key_part->field) && - !(key_part->key_part_flag & HA_PART_KEY_SEG)) + !(key_part->key_part_flag & (HA_PART_KEY_SEG | HA_NULL_PART))) return table->reginfo.join_tab->ref.items[part]; } return (Item*) 0; === modified file 'sql/table.cc' --- sql/table.cc 2010-06-25 10:53:36 +0000 +++ sql/table.cc 2010-06-29 11:15:16 +0000 @@ -1539,6 +1539,8 @@ static int open_binary_frm(THD *thd, TAB #endif key_part->key_part_flag|= HA_PART_KEY_SEG; } + if (field->real_maybe_null()) + key_part->key_part_flag|= HA_NULL_PART; } keyinfo->usable_key_parts= usable_parts; // Filesort --------------------------------------- Applying the diff above fixes the wrong result reported in this bug, but it also changes the query plan for a number of cases in our test suites. The code above is not a suggested fix, just initial findings.
[4 Jul 2010 15:08]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/112831 3207 Guilhem Bichot 2010-07-04 Fix for BUG#54608 "Query using IN + OR + IS TRUE and IS NULL returns NULL when should be empty" selection condition was wrongly assumed to be satisfied by ref_or_null @ mysql-test/include/null_key.inc test for bug; without the code fix, the two SELECTs would return NULL instead of empty set. @ sql/sql_select.cc In the test's scenario, the WHERE condition is a IN (42) OR (a IS TRUE AND a IS NULL) . If this condition is true then so is a IN (42) OR a IS NULL , thus ref_or_null access is correctly chosen to access t1. But then it is wrongly assumed that this ref_or_null access makes checking the entire condition unneeded. Here's how: - make_join_select() wants to attach conditions to tables (i.e. push as much as possible of the WHERE clause as deep as possible in the operator tree) - for that it calls make_cond_for_table_from_pred("a IN (42) OR (a IS TRUE AND a IS NULL)") ("make a condition to be attached to the table, from this predicate"). That function sees that the condition is "cond1 OR cond2"; it calls make_cond_for_table_from_pred() on cond1 and on cond2, to later recombine the resulting conditions with an OR. Here cond1 is "a IN (42)", cond2 is "a IS TRUE AND a IS NULL". - make_cond_for_table_from_pred(cond1) calls test_if_ref() - test_if_ref() sees that in the ref_or_null access, "a" is the field used for index lookup, 42 is the lookup value, and so any found record automatically satisfies "a IN (42)" - thus make_cond_for_table_from_pred(cond1) believes that cond1 needn't be checked (is guaranteed to be true by the use of ref_or_null) - thus the top-level make_cond_for_table_from_pred() believes that "cond1 OR cond2" == "TRUE or cond2" thus needn't be checked - what test_if_ref() forgot is that 42 is not the _only_ lookup value: at execution, ref_or_null will look up "42" _and_ also look up NULL. ref_or_null does not implement "a IN (42)" but "a IN (42) OR a IS NULL". So a record returned by index lookup could possibly have a being NULL and not satisfy "a IN (42)". Thus checking "a IN (42)" after ref_or_null is still needed! The fix: test_if_ref() should not say that x=y can be replaced by ref_or_null access. Note: this does cause plan changes, where many "Using where" now appear in EXPLAIN for "ref_or_null" access. This actually makes plans closer to what they are in next-mr. next-mr-opt-backporting had lost the "Using where" for many tests (and the bug was introduced) in that commit: sp1r-sergefp@mysql.com-20070309210824-13110 The intention of that commit looks correct: there was some code which made part_of_refkey() pretend that the ref/ref_or_null access referred to no item if the index was on a nullable column; that commit removed this code. Apparently it accidentally exposed a bug in the logic of test_if_ref(). Undoing this Sergey's commit, though it would restore all next-mr plans, would have stronger effects than the present patch: for example it would add "Using where" for a ref access using an index on a nullable column: innodb.innodb_mysql w8 [ fail ] # query sourced from mix1.inc: EXPLAIN SELECT COUNT(*) FROM t2 LEFT JOIN t1 ON t2.fkey = t1.id WHERE t1.name LIKE 'A%'; id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE t1 index PRIMARY,name name 23 NULL 3 Using where; Using index -1 SIMPLE t2 ref fkey fkey 5 test.t1.id 1 Using index +1 SIMPLE t2 ref fkey fkey 5 test.t1.id 1 Using where; Using index and that would be a superfluous "using where". Whereas the current patch affects only ref_or_null. Note2: the current patch restores corrects results, but there would be room for future improvements: when the condition is as simple as key_col=constant OR key_col IS NULL, and access method is "ref_or_null", the Optimizer, when attaching this condition to the table, asks itself two questions (one call to make_cond_for_table_from_pred() for each): - does the access method guarantee key_col=constant? - does the access method guarantee key_col IS NULL? The correct answer is "no" for both, thus the plan has "using where". But the access method does guarantee the OR-combined condition, and so in reality "using where" is superfluous and could be dropped. The Optimizer does not see it because it treats the above two questions independently.
[8 Jul 2010 12:48]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/113128 3210 Guilhem Bichot 2010-07-04 Fix for BUG#54608 "Query using IN + OR + IS TRUE and IS NULL returns NULL when should be empty" selection condition was wrongly assumed to be satisfied by ref_or_null @ mysql-test/include/null_key.inc test for bug; without the code fix, the two SELECTs would return NULL instead of empty set. @ sql/sql_select.cc In the test's scenario, the WHERE condition is a IN (42) OR (a IS TRUE AND a IS NULL) . If this condition is true then so is a IN (42) OR a IS NULL , thus ref_or_null access is correctly chosen to access t1. But then it is wrongly assumed that this ref_or_null access makes checking the entire condition unneeded. Here's how: - make_join_select() wants to attach conditions to tables (i.e. push as much as possible of the WHERE clause as deep as possible in the operator tree) - for that it calls make_cond_for_table_from_pred("a IN (42) OR (a IS TRUE AND a IS NULL)") ("make a condition to be attached to the table, from this predicate"). That function sees that the condition is "cond1 OR cond2"; it calls make_cond_for_table_from_pred() on cond1 and on cond2, to later recombine the resulting conditions with an OR. Here cond1 is "a IN (42)", cond2 is "a IS TRUE AND a IS NULL". - make_cond_for_table_from_pred(cond1) calls test_if_ref() - test_if_ref() sees that in the ref_or_null access, "a" is the field used for index lookup, 42 is the lookup value, and so any found record automatically satisfies "a IN (42)" - thus make_cond_for_table_from_pred(cond1) believes that cond1 needn't be checked (is guaranteed to be true by the use of ref_or_null) - thus the top-level make_cond_for_table_from_pred() believes that "cond1 OR cond2" == "TRUE or cond2" thus needn't be checked - what test_if_ref() forgot is that 42 is not the _only_ lookup value: at execution, ref_or_null will look up "42" _and_ also look up NULL. ref_or_null does not implement "a IN (42)" but "a IN (42) OR a IS NULL". So a record returned by index lookup could possibly have a being NULL and not satisfy "a IN (42)". Thus checking "a IN (42)" after ref_or_null is still needed! The fix: test_if_ref() should not say that x=y can be replaced by ref_or_null access. Note: this does cause plan changes, where many "Using where" now appear in EXPLAIN for "ref_or_null" access. This actually makes plans closer to what they are in next-mr. next-mr-opt-backporting had lost the "Using where" for many tests (and the bug was introduced) in that commit: sp1r-sergefp@mysql.com-20070309210824-13110 The intention of that commit looks correct: there was some code which made part_of_refkey() pretend that the ref/ref_or_null access referred to no item if the index was on a nullable column; that commit removed this code. Apparently it accidentally exposed a bug in the logic of test_if_ref(). Undoing this Sergey's commit, though it would restore all next-mr plans, would have stronger effects than the present patch: for example it would add "Using where" for a ref access using an index on a nullable column: innodb.innodb_mysql w8 [ fail ] # query sourced from mix1.inc: EXPLAIN SELECT COUNT(*) FROM t2 LEFT JOIN t1 ON t2.fkey = t1.id WHERE t1.name LIKE 'A%'; id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE t1 index PRIMARY,name name 23 NULL 3 Using where; Using index -1 SIMPLE t2 ref fkey fkey 5 test.t1.id 1 Using index +1 SIMPLE t2 ref fkey fkey 5 test.t1.id 1 Using where; Using index and that would be a superfluous "using where". Whereas the current patch affects only ref_or_null. Note2: the current patch restores corrects results, but there would be room for future improvements: when the condition is as simple as key_col=constant OR key_col IS NULL, and access method is "ref_or_null", the Optimizer, when attaching this condition to the table, asks itself two questions (one call to make_cond_for_table_from_pred() for each): - does the access method guarantee key_col=constant? - does the access method guarantee key_col IS NULL? The correct answer is "no" for both, thus the plan has "using where". But the access method does guarantee the OR-combined condition, and so in reality "using where" is superfluous and could be dropped. The Optimizer does not see it because it treats the above two questions independently.
[8 Jul 2010 20:12]
Guilhem Bichot
queued to next-mr-opt-team and next-mr-opt-backporting
[16 Aug 2010 6:31]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:03]
Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[23 Nov 2010 4:02]
Paul DuBois
Bug does not appear in any released 5.6.x version. No 5.6.1 changelog entry needed.