Bug #51070 Query with a NOT IN subquery predicate returns a wrong result set
Submitted: 10 Feb 2010 20:34 Modified: 7 Oct 2010 23:12
Reporter: Igor Babaev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.1/5.0/6.0/5.5 OS:Any
Assigned to: Martin Hansson CPU Architecture:Any

[10 Feb 2010 20:34] Igor Babaev
Description:
In some cases when the left part of a NOT IN subquery predicate is a row and contains some NULL values the result returned by the query can be wrong:

mysql> select * from t1;
+------+------+
| a1   | a2   |
+------+------+
| zz   | NULL |
| yy   | NULL |
+------+------+
2 rows in set (0.00 sec)

mysql> select * from t2;
+------+------+
| b1   | b2   |
+------+------+
| NULL | xx   |
| NULL | tt   |
+------+------+
2 rows in set (0.00 sec)

mysql> select * from t1 where (a1, a2) NOT IN (select b1, b2 from t2);
+------+------+
| a1   | a2   |
+------+------+
| zz   | NULL |
| yy   | NULL |
+------+------+
2 rows in set (0.00 sec)

The expected result here is an empty set.

How to repeat:
Create and populate tables t1 and t2 with the following commands:

create table t1 (a1 char(2), a2 char(2));
create table t2 (b1 char(2), b2 char(2));
insert into t1 (a1, a2) values ('zz', NULL);
insert into t1 (a1, a2) values ('yy', NULL);
insert into t2 (b1, b2) values (NULL, 'xx');
insert into t2 (b1, b2) values (NULL, 'tt');

Run the query:

select * from t1 where (a1, a2) NOT IN (select b1, b2 from t2);
[10 Feb 2010 22:15] MySQL Verification Team
Thank you for the bug report. Verified as described on 5.1/5.0/6.0 however 5.5.XX gives different result :):

C:\DBS>c:\dbs\5.5\bin\mysql -uroot --port=3540 --prompt="mysql 5.5 >"
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.99-m3-Win X64 Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql 5.5 >use test
Database changed
mysql 5.5 >select * from t1 where (a1, a2) NOT IN (select b1, b2 from t2);
ERROR 1054 (42S22): Unknown column 'a1' in 'IN/ALL/ANY subquery'
[12 Apr 2010 15:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/105436

3451 Martin Hansson	2010-04-12
      Bug#51070: Query with a NOT IN subquery predicate returns a wrong result set
      
      The parser rule for subqueries would parse (being a recursive ascent parser)
      first the IN <subquery> rule, and later apply NOT for a NOT IN subquery. When
      applying the NOT, the subquery was wrapped inside a NOT node. By the time the
      parser reached the top level WHERE clase it informed the subquery that it was
      on the top level, but the NOT node did not forward this notification to the IN
      <subquery> node, causing it to behave wrongfully wrt to row comparions.
      
      Fixed by making a NOT node forward the notificitation to its child node.
      
      Some test results changed, the new results are the correct ones.
[29 Jun 2010 9:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/112431

3452 Martin Hansson	2010-06-29
      Bug#51070: Query with a NOT IN subquery predicate returns a wrong result set 
      
      The EXISTS transformation has additional swithces to catch most corner cases
      that appear when translating an IN predicate (into EXISTS). Guarded conditions
      are used which are turned of when a NULL value is seen in the outer
      expression's row. When the inner query supplies NULL values, however, they are
      filtered out because no distinction is made between the guarded conditions;
      guarded NOT x IS NULL conditions in the HAVING clause that filter out NULL
      values cannot be de-activated in isolation from those that match values or
      from the outer expression or NULL's.
      
      The fix separates the WHERE condition's guarded predicates - 'search condition
      guards' - from the HAVING condition's guarded predicated - 'filtering
      condition guards'. When a NULL value is seen in the row from the outer query's
      table, both guards for the corresponding column in the inner query's table are
      inactivated, just as before. But when there is a non-NULL value in the outer
      row, the filtering guard is deactivated for the corresponding inner
      column. The class Condition_guard is created to keep track of the two sets of
      predicates.
[29 Jun 2010 9:51] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/112432

3453 Martin Hansson	2010-06-29
      Bug#51070: Query with a NOT IN subquery predicate returns a wrong result set 
      
      The EXISTS transformation has additional swithces to catch most corner cases
      that appear when translating an IN predicate (into EXISTS). Guarded conditions
      are used which are turned of when a NULL value is seen in the outer
      expression's row. When the inner query supplies NULL values, however, they are
      filtered out because no distinction is made between the guarded conditions;
      guarded NOT x IS NULL conditions in the HAVING clause that filter out NULL
      values cannot be de-activated in isolation from those that match values or
      from the outer expression or NULL's.
      
      The fix separates the WHERE condition's guarded predicates - 'search condition
      guards' - from the HAVING condition's guarded predicated - 'filtering
      condition guards'. When a NULL value is seen in the row from the outer query's
      table, both guards for the corresponding column in the inner query's table are
      inactivated, just as before. But when there is a non-NULL value in the outer
      row, the filtering guard is deactivated for the corresponding inner
      column. The class Condition_guard is created to keep track of the two sets of
      predicates.
[29 Jun 2010 10:04] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/112434

3453 Martin Hansson	2010-06-29
      Bug#51070: Query with a NOT IN subquery predicate returns a wrong result set 
      
      The EXISTS transformation has additional swithces to catch most corner cases
      that appear when translating an IN predicate (into EXISTS). Guarded conditions
      are used which are turned of when a NULL value is seen in the outer
      expression's row. When the inner query supplies NULL values, however, they are
      filtered out because no distinction is made between the guarded conditions;
      guarded NOT x IS NULL conditions in the HAVING clause that filter out NULL
      values cannot be de-activated in isolation from those that match values or
      from the outer expression or NULL's.
      
      The fix separates the WHERE condition's guarded predicates - 'search condition
      guards' - from the HAVING condition's guarded predicated - 'filtering
      condition guards'. When a NULL value is seen in the row from the outer query's
      table, both guards for the corresponding column in the inner query's table are
      inactivated, just as before. But when there is a non-NULL value in the outer
      row, the filtering guard is deactivated for the corresponding inner
      column. The class Condition_guard is created to keep track of the two sets of
      predicates.
[6 Aug 2010 13:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/115205

3478 Martin Hansson	2010-08-06
      Bug#51070: Query with a NOT IN subquery predicate returns a wrong result set
      
      The EXISTS transformation has additional swithces to catch most corner cases
      that appear when translating an IN predicate (into EXISTS). Guarded conditions
      are used which are deactivated when a NULL value is seen in the outer
      expression's row. When the inner query block supplies NULL values, however,
      they are filtered out because no distinction is made between the guarded
      conditions; guarded NOT x IS NULL conditions in the HAVING clause that filter
      out NULL values cannot be de-activated in isolation from those that match
      values or from the outer expression or NULL's.
      
      The above problem is handled by making the guarded conditions remember whether
      they have rejected a NULL value or not, and index access methods are taking
      this into account as well. 
      
      The bug consisted of 
      
      1) Not resetting the property for every nested loop iteration on the inner
        query's result
      
      2) Not propagating the NULL result properly from inner query to IN optimizer.
      
      3) A hack that aimed to fix #2 by returning NULL when false was actually
         the result. This caused failures when #2 was properly fixed.
[8 Aug 2010 1:00] Igor Babaev
Martin,

Would not it be more natural just to replace
- null_value= !item_subs->engine->no_rows();
with 
+ null_value= !item_subs->engine->no_rows() || was null;

and to remove all other clatter from the patch as not belonging to the fix of the bug?
(Missing initializations of was_null probably should be left though).

Regards,
Igor.
[30 Aug 2010 9:56] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/117125

3496 Martin Hansson	2010-08-30
      Bug#51070: Query with a NOT IN subquery predicate returns a wrong result set
      
      The EXISTS transformation has additional swithces to catch most corner cases
      that appear when transforming an IN predicate into EXISTS. Guarded conditions
      are used which are deactivated when a NULL value is seen in the outer
      expression's row. When the inner query block supplies NULL values, however,
      they are filtered out because no distinction is made between the guarded
      conditions; guarded NOT x IS NULL conditions in the HAVING clause that filter
      out NULL values cannot be de-activated in isolation from those that match
      values or from the outer expression or NULL's.
      
      The above problem is handled by making the guarded conditions remember whether
      they have rejected a NULL value or not, and index access methods are taking
      this into account as well. 
      
      The bug consisted of 
      
      1) Not resetting the property for every nested loop iteration on the inner
         query's result
      
      2) Not propagating the NULL result properly from inner query to IN optimizer.
      
      3) A hack that may or may not have been needed at some point. According to a
         comment it was aimed to fix #2 by returning NULL when FALSE was actually
         the result. This caused failures when #2 was properly fixed. The hack is
         now removed.
[31 Aug 2010 9:37] Guilhem Bichot
approved with minor comments sent by mail
[2 Sep 2010 12:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/117415

3498 Martin Hansson	2010-09-02
      Bug#51070: Query with a NOT IN subquery predicate returns a wrong result set
            
      The EXISTS transformation has additional switches to catch the known corner
      cases that appear when transforming an IN predicate into EXISTS. Guarded
      conditions are used which are deactivated when a NULL value is seen in the
      outer expression's row. When the inner query block supplies NULL values,
      however, they are filtered out because no distinction is made between the
      guarded conditions; guarded NOT x IS NULL conditions in the HAVING clause that
      filter out NULL values cannot be de-activated in isolation from those that
      match values or from the outer expression or NULL's.
      
      The above problem is handled by making the guarded conditions remember whether
      they have rejected a NULL value or not, and index access methods are taking
      this into account as well. 
      
      The bug consisted of 
      
      1) Not resetting the property for every nested loop iteration on the inner
         query's result.
      
      2) Not propagating the NULL result properly from inner query to IN optimizer.
      
      3) A hack that may or may not have been needed at some point. According to a
         comment it was aimed to fix #2 by returning NULL when FALSE was actually
         the result. This caused failures when #2 was properly fixed. The hack is
         now removed.
      
      The fix resolves all three points.
[7 Sep 2010 9:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/117686

3502 Martin Hansson	2010-09-07
      Bug#51070: Query with a NOT IN subquery predicate returns a wrong result set
            
      The EXISTS transformation has additional switches to catch the known corner
      cases that appear when transforming an IN predicate into EXISTS. Guarded
      conditions are used which are deactivated when a NULL value is seen in the
      outer expression's row. When the inner query block supplies NULL values,
      however, they are filtered out because no distinction is made between the
      guarded conditions; guarded NOT x IS NULL conditions in the HAVING clause that
      filter out NULL values cannot be de-activated in isolation from those that
      match values or from the outer expression or NULL's.
      
      The above problem is handled by making the guarded conditions remember whether
      they have rejected a NULL value or not, and index access methods are taking
      this into account as well. 
      
      The bug consisted of 
      
      1) Not resetting the property for every nested loop iteration on the inner
         query's result.
      
      2) Not propagating the NULL result properly from inner query to IN optimizer.
      
      3) A hack that may or may not have been needed at some point. According to a
         comment it was aimed to fix #2 by returning NULL when FALSE was actually
         the result. This caused failures when #2 was properly fixed. The hack is
         now removed.
      
      The fix resolves all three points.
[28 Sep 2010 15:40] Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alik@sun.com-20100928153607-tdsxkdm5cmuym5sq) (version source revid:alik@sun.com-20100928153508-0saa6v93dinqx1u7) (merge vers: 5.6.1-m4) (pib:21)
[28 Sep 2010 15:43] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100928153646-pqp8o1a92mxtuj3h) (version source revid:alik@sun.com-20100928153532-lr3gtvnyp2en4y75) (pib:21)
[28 Sep 2010 15:45] Bugs System
Pushed into mysql-5.5 5.5.7-rc (revid:alik@sun.com-20100928153459-4nudf4zgzlou4s7q) (version source revid:alik@sun.com-20100928153459-4nudf4zgzlou4s7q) (merge vers: 5.5.7-rc) (pib:21)
[29 Sep 2010 19:40] Paul DuBois
Noted in 5.5.7, 5.6.1 changelogs.

In some cases, when the left part of a NOT IN subquery predicate was
a row and contained NULL values, the query result was incorrect.
[7 Oct 2010 23:12] Paul DuBois
Noted in 5.1.52 changelog.
[1 Nov 2010 19:02] Bugs System
Pushed into mysql-5.1 5.1.53 (revid:build@mysql.com-20101101184443-o2olipi8vkaxzsqk) (version source revid:build@mysql.com-20101101184443-o2olipi8vkaxzsqk) (merge vers: 5.1.53) (pib:21)
[23 Nov 2010 7:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/124704

3376 Ole John Aske	2010-11-23
      SPJ-scan-scan: Cherry picked fix for bug#51070 into branch 'mysql-5.1-telco-7.0-spj-scan-scan'
      
      Showstopper for extensive RQG testing of SPJ