Bug #24420 (a,b) IN (SELECT a,min(b) FROM t GROUP BY a) can produce wrong results
Submitted: 18 Nov 2006 22:49 Modified: 7 Feb 2007 19:11
Reporter: Igor Babaev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version: OS:
Assigned to: Igor Babaev CPU Architecture:Any

[18 Nov 2006 22:49] Igor Babaev
Description:
(a,b) IN (SELECT a,min(b) FROM t GROUP BY a) can produce wrong results when
column a of table t contains NULL values while column b does not.

This bug manifests itself only with grouping subqueries that return set of row values.

The bug was discovered when reviewing the patch for bug #24127, but it can be demonstrated without that patch and without the patch for bug #8804. 

How to repeat:
mysql> select version();
+--------------+
| version()    |
+--------------+
| 5.0.32-debug |
+--------------+
1 row in set (0.00 sec)

mysql> create table t1 (a int, b int);
Query OK, 0 rows affected (0.06 sec)

mysql> insert into t1 values (0,0), (2,2), (3,3);
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> create table t2 (a int, b int);
Query OK, 0 rows affected (0.07 sec)

mysql> insert into t2 values (1,1), (3,3);
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select a, b, (a,b) in (select a, min(b) from t2 group by a) Z from t1;
+------+------+------+
| a    | b    | Z    |
+------+------+------+
|    0 |    0 |    0 |
|    2 |    2 |    0 |
|    3 |    3 |    1 |
+------+------+------+
3 rows in set (0.01 sec)

mysql> insert into t2 values (NULL,4);
Query OK, 1 row affected (0.00 sec)

mysql> select a, b, (a,b) in (select a, min(b) from t2 group by a) Z from t1;
+------+------+------+
| a    | b    | Z    |
+------+------+------+
|    0 |    0 | NULL |
|    2 |    2 | NULL |
|    3 |    3 |    1 |
+------+------+------+
3 rows in set (0.01 sec)

We see that adding row (NULL,4) to table t2 makes the result of some IN expressions wrong.

If we use a semantically equivalent query without GROUP BY everything is correct:

mysql> select a, b, (a,b) in (select a, b from t2) Z from t1;
+------+------+------+
| a    | b    | Z    |
+------+------+------+
|    0 |    0 |    0 |
|    2 |    2 |    0 |
|    3 |    3 |    1 |
+------+------+------+
3 rows in set (0.00 sec)

If we don't use row value subquery the result is also correct:

mysql> select a, b, b in (select min(b) from t2 group by a) Z from t1;
+------+------+------+
| a    | b    | Z    |
+------+------+------+
|    0 |    0 |    0 |
|    2 |    2 |    0 |
|    3 |    3 |    1 |
+------+------+------+
3 rows in set (0.01 sec)

Suggested fix:
The fix of this bug should be applied after the fix for bug #24127.
[27 Jan 2007 1:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18891

ChangeSet@1.2391, 2007-01-26 17:10:45-08:00, igor@olga.mysql.com +3 -0
  Fixed bug #24420.
  Objects of the classes Item_func_is_not_null_test and Item_func_trig_cond
  must be transparent for the method Item::split_sum_func2 as these classes
  are pure helpers. It means that the method Item::split_sum_func2 should
  look at those objects as at pure wrappers.
[3 Feb 2007 6:06] Igor Babaev
The fix has been pushed into 5.0.36, 5.1.16-beta main trees.
[7 Feb 2007 19:11] Paul DuBois
Noted in 5.0.36, 5.1.16 changelogs.

Expressions of the form (a, b) IN (SELECT a, MIN(b) FROM t GROUP BY
a) could produce incorrect results when column a of table t contains
NULL values while column b does not.