Bug #39069 <row constructor> IN <table-subquery> seriously messed up
Submitted: 27 Aug 2008 11:32 Modified: 11 Feb 2009 3:07
Reporter: Roland Bouman Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S1 (Critical)
Version:5.1.26-rc; 5.0.66a; 6.0.6 OS:Linux (Ubuntu)
Assigned to: Gleb Shchepa CPU Architecture:Any
Tags: regression

[27 Aug 2008 11:32] Roland Bouman
Description:
Almost all expressions of the form

<row-constructor> IN (<query-expression>) return incorrect results. Regardless of the operands the result is always true.

The only exception is where the <row-constructor> has a NULL value in at least one column, in which case the result is NULL (as expected)

This is a serious bug, because the results are clearly wrong.

How to repeat:
SELECT (NULL, NULL) IN (SELECT NULL, NULL)

(Correct: NULL)

mysql> SELECT (1,2) IN (SELECT NULL, NULL);
+------------------------------+
| (1,2) IN (SELECT NULL, NULL) |
+------------------------------+
|                            1 |
+------------------------------+

(WRONG! Should have been NULL....)

mysql> SELECT (1,2) IN (SELECT 1, NULL);
+---------------------------+
| (1,2) IN (SELECT 1, NULL) |
+---------------------------+
|                         1 |
+---------------------------+
1 row in set (0.00 sec)

(WRONG! Should have been NULL....)

mysql> SELECT (1,2) IN (SELECT NULL, 2);
+---------------------------+
| (1,2) IN (SELECT NULL, 2) |
+---------------------------+
|                         1 |
+---------------------------+

(WRONG! Should have been NULL....)

mysql> SELECT (1,2) IN (SELECT NULL, 1);
+---------------------------+
| (1,2) IN (SELECT NULL, 1) |
+---------------------------+
|                         1 |
+---------------------------+

(WRONG! Should have been 0. Rationale is that 2 != 1 so we can be sure (1,2) is not in (null, 1) despite the occurrence of NULL)

mysql> SELECT (1,2) IN (SELECT 1, 1);
+------------------------+
| (1,2) IN (SELECT 1, 1) |
+------------------------+
|                      1 |
+------------------------+
(WRONG! Should have been 0.)

Suggested fix:
Please return correct results
[27 Aug 2008 11:36] Valeriy Kravchuk
Thank you for a bug report. Verified just as described:

mysql> select (1,2) in (select 2,1);
+-----------------------+
| (1,2) in (select 2,1) |
+-----------------------+
|                     1 |
+-----------------------+
1 row in set (0.38 sec)

mysql> select (1,2) in (select 3,1);
+-----------------------+
| (1,2) in (select 3,1) |
+-----------------------+
|                     1 |
+-----------------------+
1 row in set (0.00 sec)

mysql> select (1,2) in (select null, null);
+------------------------------+
| (1,2) in (select null, null) |
+------------------------------+
|                            1 |
+------------------------------+
1 row in set (0.00 sec)

For single value server works as expected:

mysql> select (1) in (select null);
+----------------------+
| (1) in (select null) |
+----------------------+
|                 NULL |
+----------------------+
1 row in set (0.00 sec)

mysql> select (1) in (select 1);
+-------------------+
| (1) in (select 1) |
+-------------------+
|                 1 |
+-------------------+
1 row in set (0.00 sec)

mysql> select (1) in (select 2);
+-------------------+
| (1) in (select 2) |
+-------------------+
|                 0 |
+-------------------+
1 row in set (0.00 sec)

mysql> select version();
+-----------------------+
| version()             |
+-----------------------+
| 6.0.6-alpha-community |
+-----------------------+
1 row in set (0.02 sec)
[27 Aug 2008 11:40] Valeriy Kravchuk
5.1 and 5.0 are also affected:

C:\Program Files\MySQL\MySQL Server 6.0\bin>mysql -uroot -proot -P3310 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.1.26-rc-community-log MySQL Community Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> select (2,2) in (select 2,1);
+-----------------------+
| (2,2) in (select 2,1) |
+-----------------------+
|                     1 |
+-----------------------+
1 row in set (0.05 sec)

mysql> select (2,2) in (select null,null);
+-----------------------------+
| (2,2) in (select null,null) |
+-----------------------------+
|                           1 |
+-----------------------------+
1 row in set (0.00 sec)

mysql> exit
Bye

C:\Program Files\MySQL\MySQL Server 6.0\bin>mysql -uroot -proot -P3308 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.0.66a-enterprise-gpl-nt MySQL Enterprise Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> select (2,2) in (select 2,1);
+-----------------------+
| (2,2) in (select 2,1) |
+-----------------------+
|                     1 |
+-----------------------+
1 row in set (0.08 sec)

mysql> select (2,2) in (select null,null);
+-----------------------------+
| (2,2) in (select null,null) |
+-----------------------------+
|                           1 |
+-----------------------------+
1 row in set (0.00 sec)

mysql> exit
Bye

4.1.22 worked more correctly:

C:\Program Files\MySQL\MySQL Server 6.0\bin>mysql -uroot -proot -P3306 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 4.1.22-community-nt-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> select (2,2) in (select null,null);
+-----------------------------+
| (2,2) in (select null,null) |
+-----------------------------+
|                           1 |
+-----------------------------+
1 row in set (0.00 sec)

Shoukd be NULL actually, but this is correct:

mysql> select (2,2) in (select 2,1);
+-----------------------+
| (2,2) in (select 2,1) |
+-----------------------+
|                     0 |
+-----------------------+
1 row in set (0.00 sec)

mysql> select (2,2) in (select 2,2);
+-----------------------+
| (2,2) in (select 2,2) |
+-----------------------+
|                     1 |
+-----------------------+
1 row in set (0.00 sec)

So, this is a regression of a kind in versions >= 5.0.
[27 Aug 2008 11:53] Roland Bouman
The problem seems to occur only in case the right hand operand includes an expression of the form 

SELECT <literal>, <literal>

Test data:

mysql> SELECT Id, Name From world.City WHERE ID IN (1,2);
+----+----------+
| Id | Name     |
+----+----------+
|  1 | Kabul    |
|  2 | Qandahar |
+----+----------+

Now, for example:

SELECT (SELECT ID, Name FROM world.City WHERE ID = 1) 
    IN (SELECT ID, Name FROM world.City WHERE ID = 2) 

(Correct: returns 0)

SELECT (SELECT 1, 'Kabul')
    IN (SELECT ID, Name FROM world.City WHERE ID = 2);

(Correct: returns 0)

SELECT (SELECT 1, 'Kabul')
    IN (SELECT 2, 'Quandahar');

(WRONG! returns 1, should have been 0)

Interesting tidbit: 

SELECT   (SELECT 1, 'Kabul')      
      IN (SELECT ID, Name FROM world.City WHERE ID = 2 
UNION ALL SELECT 2, 'Qandahar');

(WRONG! returns 1, should have been 0)

(same wrong result when the UNION ALL operands are reversed)
[25 Dec 2008 10:03] Gleb Shchepa
This bug and bug #35311 are duplicates.
[27 Jan 2009 21:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/64206

2725 Gleb Shchepa	2009-01-28
      Bug #39069: <row constructor> IN <table-subquery> seriously messed up
      
      "ROW(...) IN (SELECT ... FROM DUAL)" always returned TRUE.
      
      Item_in_subselect::row_value_transformer rewrites "ROW(...) IN SELECT"
      conditions into the "EXISTS (SELECT ... HAVING ...)" form.
      For a subquery from the DUAL pseudotable resulting HAVING condition
      is an expression on constant values, so further transformation with
      optimize_cond() eliminates this HAVING condition and resets
      JOIN::having to NULL.
      Then JOIN::exec treated that NULL as an always-true-HAVING and that
      caused a bug.
      
      To distinguish an optimized out "HAVING TRUE" clause from 
      "HAVING FALSE" we already have the JOIN::having_value flag. 
      However, JOIN::exec() ignored JOIN::having_value as described abouve 
      as if it always set to COND_TRUE.
      
      The JOIN::exec method has been modified to take into account the value
      of the JOIN::having_value field.
[28 Jan 2009 11:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/64256

2725 Gleb Shchepa	2009-01-28
      Bug #39069: <row constructor> IN <table-subquery> seriously 
                  messed up
      
      "ROW(...) IN (SELECT ... FROM DUAL)" always returned TRUE.
      
      Item_in_subselect::row_value_transformer rewrites "ROW(...) 
      IN SELECT" conditions into the "EXISTS (SELECT ... HAVING ...)" 
      form.
      For a subquery from the DUAL pseudotable resulting HAVING 
      condition is an expression on constant values, so further 
      transformation with optimize_cond() eliminates this HAVING 
      condition and resets JOIN::having to NULL.
      Then JOIN::exec treated that NULL as an always-true-HAVING 
      and that caused a bug.
      
      To distinguish an optimized out "HAVING TRUE" clause from
      "HAVING FALSE" we already have the JOIN::having_value flag.
      However, JOIN::exec() ignored JOIN::having_value as described 
      above as if it always set to COND_TRUE.
      
      The JOIN::exec method has been modified to take into account 
      the value of the JOIN::having_value field.
[28 Jan 2009 19:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/64383

2732 Gleb Shchepa	2009-01-28
      Bug #39069: <row constructor> IN <table-subquery> seriously
                  messed up
      
      "ROW(...) IN (SELECT ... FROM DUAL)" always returned TRUE.
      
      Item_in_subselect::row_value_transformer rewrites "ROW(...)
      IN SELECT" conditions into the "EXISTS (SELECT ... HAVING ...)"
      form.
      For a subquery from the DUAL pseudotable resulting HAVING
      condition is an expression on constant values, so further
      transformation with optimize_cond() eliminates this HAVING
      condition and resets JOIN::having to NULL.
      Then JOIN::exec treated that NULL as an always-true-HAVING
      and that caused a bug.
      
      To distinguish an optimized out "HAVING TRUE" clause from
      "HAVING FALSE" we already have the JOIN::having_value flag.
      However, JOIN::exec() ignored JOIN::having_value as described
      above as if it always set to COND_TRUE.
      
      The JOIN::exec method has been modified to take into account
      the value of the JOIN::having_value field.
[3 Feb 2009 9:11] Bugs System
Pushed into 5.0.78 (revid:joro@sun.com-20090203090422-v91rh3gmx2ulhbu9) (version source revid:gshchepa@mysql.com-20090128184645-ghabfh1x0fz04beg) (merge vers: 5.0.77) (pib:6)
[3 Feb 2009 9:40] Bugs System
Pushed into 5.1.32 (revid:joro@sun.com-20090203090549-gos3v4320vimrzg6) (version source revid:sergefp@mysql.com-20090128192114-jm5r3gezr59oki5x) (merge vers: 5.1.32) (pib:6)
[4 Feb 2009 11:15] Bugs System
Pushed into 6.0.10-alpha (revid:kostja@sun.com-20090204104420-mw1i2u9lum4bxjo6) (version source revid:davi.arnaut@sun.com-20090129184519-mgp2vr1rna9nk8li) (merge vers: 6.0.10-alpha) (pib:6)
[11 Feb 2009 3:07] Paul DuBois
Noted in 5.0.78, 5.1.32, 6.0.10 changelogs.

The expression ROW(...) IN (SELECT ... FROM DUAL) always returned
TRUE.
[17 Feb 2009 14:53] Bugs System
Pushed into 5.1.32-ndb-6.3.23 (revid:tomas.ulin@sun.com-20090217131017-6u8qz1edkjfiobef) (version source revid:tomas.ulin@sun.com-20090203133556-9rclp06ol19bmzs4) (merge vers: 5.1.32-ndb-6.3.22) (pib:6)
[17 Feb 2009 16:41] Bugs System
Pushed into 5.1.32-ndb-6.4.3 (revid:tomas.ulin@sun.com-20090217134419-5ha6xg4dpedrbmau) (version source revid:tomas.ulin@sun.com-20090203133556-9rclp06ol19bmzs4) (merge vers: 5.1.32-ndb-6.3.22) (pib:6)
[17 Feb 2009 18:17] Bugs System
Pushed into 5.1.32-ndb-6.2.17 (revid:tomas.ulin@sun.com-20090217134216-5699eq74ws4oxa0j) (version source revid:tomas.ulin@sun.com-20090202111723-1zzwax187rtls913) (merge vers: 5.1.32-ndb-6.2.17) (pib:6)
[10 Sep 2010 5:18] Timour Katchaounov
This patch for this bug contains wrong result:
Instead of the following result:

+# 2nd and 3rd columns should be same for t1.a == 11 only
+SELECT a AS x, ROW(11, 12) = (SELECT MAX(x), 12), ROW(11, 12) IN (SELECT MAX(x), 12) FROM t1;
+x	ROW(11, 12) = (SELECT MAX(x), 12)	ROW(11, 12) IN (SELECT MAX(x), 12)
+1	0	1
+2	0	1
+11	1	1

The result should be:
1  0 0
2  0 0
11 1 1

Therefore the patch is either wrong or incomplete.
[10 Sep 2010 9:44] Gleb Shchepa
Timour,

Thank you for report.

Yes, the patch doesn't cover aggregate functions (and you cited "regression tests" part of the test suite for aggregates).

Created a new bug #56711 entry for this issue.
[10 Sep 2010 13:16] Gleb Shchepa
Timour,

The problem you pointed is not related to this bug.
The root of the problem is:

mysql> SELECT a AS x, (SELECT MAX(x)) FROM t1;
+------+-----------------+
| x    | (SELECT MAX(x)) |
+------+-----------------+
|    1 |               1 |
|    2 |               2 |
|   11 |              11 |
+------+-----------------+
3 rows in set (0.00 sec)

mysql> SELECT a, (SELECT MAX(a)) FROM t1;
+------+-----------------+
| a    | (SELECT MAX(a)) |
+------+-----------------+
|    1 |              11 |
+------+-----------------+
1 row in set (0.00 sec)

Obviously, only second result is correct.
[10 Sep 2010 13:25] Timour Katchaounov
Well, I mainly pointed out that the previous patch recorded an incorrect result.
It could be a different bug, or an incomplete fix, hard to say.

I don't agree with your conclusion, because
SELECT a AS x, (SELECT MAX(x)) FROM t1;
is like
SELECT a AS x, (SELECT MAX(x) from DUAL) FROM t1;
so 'x' should be bound to each result row of the outer query,
in this case each row of 't1'.

I suggest that you ask Peter Gulutzan about both queries, and the
expected result.