Bug #58626 Incorrect result for WHERE <column> IN (<subquery>) IS UNKNOWN
Submitted: 1 Dec 2010 9:47 Modified: 22 Jan 2011 22:50
Reporter: Ole John Aske Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.5.7, 5.1.51, 5.1.54 OS:Any
Assigned to: Ole John Aske CPU Architecture:Any

[1 Dec 2010 9:47] Ole John Aske
Description:
IN-subquery evaluation is incorrect if:

  - The IN-subquery result contain a boolean test 'IS UNKNOWN'
  - Explain shows that 'Full scan on NULL key' may be
    used to execute the subquery.
  - Outer part of the query has OUTER JOINed tables *and* column from these
    outer joined tables are used as outer references from within the subquery.

A more detailed analysis of the failing queries below shows that the incorrect result is related to that the predicate terms containing outer references within the subquery was completely eliminated by the optimizer. - In the testcase below the term 't3.pk2 = t2.i' was never evaluated when executing as 'Full scan on NULL key'. (When executing with REF-access it *is* included as part of the REF-key and is implicitly checked through the key access)

- Has reproduced bug in 5.5 and 5.1, not tested 5.0
- It seems to have been fixed in 5.6.99

How to repeat:
use test;
drop table t1,t2,t3;
create table t1 (i int not null) engine = innodb;
insert into t1 values (1),(2);
create table t2 (i int not null) engine = innodb;
insert into t2 values (3);
create table t3 (pk1 int, pk2 int, primary key(pk1,pk2)) engine = innodb;
insert into t3 values (1,1),(2,2),(3,3);

-- UNKNOWN below should not qualify any rows as :
--  - Outer part of query ('t1 LEFT JOIN t2 ON t2.i = t1.i') will
--    produce only NULL complemented outer rows for t2 as there are 
--    no matches 't2.i = t1.i'.
--  - The inner predicate 'WHERE t3.pk2 = t2.i' should therefore
--    always fail resulting in a an empty IN-subquery.
--  - This effectively reduces the IN predicate to 
--    'NULL IN <empty-list> IS UNKNOWN' which is 'false'
--    (NOTE: For any non-empty value list this is 'true') 

-- All queries below return a non-empty result set where an
-- empty result set is expected
select * from
 t1 LEFT JOIN t2 ON t2.i = t1.i
 WHERE t2.i IN 
 (
   SELECT STRAIGHT_JOIN t3.pk1 FROM t3 JOIN t3 as t4
    ON t4.pk1=t3.pk1
    WHERE t3.pk2 = t2.i
 )
 IS UNKNOWN; 

select * from
 t1 LEFT JOIN t2 ON t2.i = t1.i
 WHERE t2.i IN 
 (
   SELECT t3.pk1 FROM t3 LEFT JOIN t3 as t4
    ON t4.pk1=t3.pk1
    WHERE t3.pk2 = t2.i
 )
 IS UNKNOWN; 

select * from
 t1 LEFT JOIN t2 ON t2.i = t1.i
 WHERE t2.i IN 
 (
   SELECT t3.pk1 FROM t3 JOIN t3 as t4
    ON t4.pk1=t3.pk1
    WHERE t3.pk2 = t2.i
 )
 IS UNKNOWN; 

Suggested fix:
Prevent that make_cond_for_table(), through its usage of part_of_refkey()), removes predicate terms which might be required when we has to do a 'Full scan on NULL key'
[1 Dec 2010 10:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/125629

3477 Ole John Aske	2010-12-01
      Fix for bug#58626, Incorrect result for WHERE <column> IN (<subquery>) IS UNKNOWN.
      
      NOTE: This fix is backported from 5.6.99 which already seems to have fixed this problem.
      
      The fix ensures that if 'Full scan on NULL key' access method may be used
      for a table, we can't assume that any part of the predicate is covered by
      the REF-key. (The join_tab is known to have 'Full scan on NULL key' if
      any cond_guards[] has been defined for 'join_tab->ref')
      
      part_of_refkey() will therefore return '0' if a potential'Full scan on NULL key' 
      is detected - Which will force make_cond_for_table() to include all part of a 
      predicate covering the specified 'tables' and 'used_table' mask.
[1 Dec 2010 10:17] Ole John Aske
Sets Status back to 'open' as it has to be verified by someone.
[1 Dec 2010 10:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/125632

3380 Ole John Aske	2010-12-01
      SPJ-scan-scan: Cherry picked proposed fix for bug#58626 into SPJ branch
      
            NOTE: This fix is backported from 5.6.99 which already seems to have fixed this problem.
      
            The fix ensures that if 'Full scan on NULL key' access method may be used
            for a table, we can't assume that any part of the predicate is covered by
            the REF-key. (The join_tab is known to have 'Full scan on NULL key' if
            any cond_guards[] has been defined for 'join_tab->ref')
      
            part_of_refkey() will therefore return '0' if a potential'Full scan on NULL key'
            is detected - Which will force make_cond_for_table() to include all part of a
            predicate covering the specified 'tables' and 'used_table' mask.
[1 Dec 2010 12:22] Ole John Aske
Still needs to be verified.
[1 Dec 2010 14:42] Valeriy Kravchuk
Verified as described:

macbook-pro:5.1 openxs$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.1.54-debug Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> drop table t1,t2,t3;
ERROR 1051 (42S02): Unknown table 't2'
mysql> create table t1 (i int not null) engine = innodb;
Query OK, 0 rows affected (0.15 sec)

mysql> insert into t1 values (1),(2);
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> create table t2 (i int not null) engine = innodb;
Query OK, 0 rows affected (0.17 sec)

mysql> insert into t2 values (3);
Query OK, 1 row affected (0.00 sec)

mysql> create table t3 (pk1 int, pk2 int, primary key(pk1,pk2)) engine = innodb;Query OK, 0 rows affected (0.12 sec)

mysql> insert into t3 values (1,1),(2,2),(3,3);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from
    ->  t1 LEFT JOIN t2 ON t2.i = t1.i
    ->  WHERE t2.i IN 
    ->  (
    ->    SELECT STRAIGHT_JOIN t3.pk1 FROM t3 JOIN t3 as t4
    ->     ON t4.pk1=t3.pk1
    ->     WHERE t3.pk2 = t2.i
    ->  )
    -> ;
Empty set (0.01 sec)

mysql> select * from
    ->  t1 LEFT JOIN t2 ON t2.i = t1.i
    ->  WHERE t2.i IN 
    ->  (
    ->    SELECT STRAIGHT_JOIN t3.pk1 FROM t3 JOIN t3 as t4
    ->     ON t4.pk1=t3.pk1
    ->     WHERE t3.pk2 = t2.i
    ->  )
    ->  IS UNKNOWN; 
+---+------+
| i | i    |
+---+------+
| 1 | NULL |
| 2 | NULL |
+---+------+
2 rows in set (0.01 sec)
[14 Dec 2010 16:07] Roy Lyseng
Verified that backport from 5.6 fixes this problem.

The original code appears from code merged in by Evgeny on 2010-04-13 with message: Backported post merge fixes for ICP/MRR WL#2475.
In 6.0 code base, this fix was merged in 2007-06-28 by Sergey Petrunia as part of WL#2475.
[14 Jan 2011 14:27] Ole John Aske
This fix is only backported from mysql-trunk to mysql-5.1-telco-7.x branches.
[14 Jan 2011 14:30] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.21 (revid:ole.john.aske@oracle.com-20110114142614-cybdrhkoo5ckvxfi) (version source revid:ole.john.aske@oracle.com-20110114142614-cybdrhkoo5ckvxfi) (merge vers: 5.1.51-ndb-7.0.21) (pib:24)
[22 Jan 2011 22:50] Paul DuBois
Noted in 5.1.51-ndb-7.0.21 changelog.

WHERE conditions of the following form were evaluated incorrectly and
could return incorrect results:

WHERE column IN (subquery) IS UNKNOWN