Bug #45191 Nested subqueries, wrong result - duplicate rows
Submitted: 29 May 2009 12:00 Modified: 20 Nov 2010 23:29
Reporter: Matthias Leich Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:6.0 OS:Any
Assigned to: Evgeny Potemkin CPU Architecture:Any
Tags: optimizer_switch, semijoin, subquery

[29 May 2009 12:00] Matthias Leich
Description:
The NIST tests converted to MySQL fail in test dml050
with wrong result sets.
SELECT EMPNUM, EMPNAME
FROM STAFF
WHERE EMPNUM IN
(SELECT EMPNUM  FROM WORKS
WHERE PNUM IN
(SELECT PNUM  FROM PROJ
WHERE PTYPE IN
(SELECT PTYPE  FROM PROJ
WHERE PNUM IN
(SELECT PNUM  FROM WORKS
WHERE EMPNUM IN
(SELECT EMPNUM  FROM WORKS
WHERE PNUM IN
(SELECT PNUM   FROM PROJ
WHERE PTYPE IN
(SELECT PTYPE  FROM PROJ
WHERE CITY IN
(SELECT CITY  FROM STAFF
WHERE EMPNUM IN
(SELECT EMPNUM  FROM WORKS
WHERE HOURS = 20
AND PNUM = 'P2' )))))))));
I get several hundred rows.
NIST expects 4 rows.
The top level table STAFF contains 5 rows.

1. MySQL 5.1 does not show these wrong results.
2. If I
   - drop all involved unique indexes
   - use MyISAM instead of InnoDB
   - reduce the number of nesting levels a bit
     SELECT EMPNUM, EMPNAME
FROM STAFF
WHERE EMPNUM IN
  (SELECT EMPNUM  FROM WORKS
   WHERE PNUM IN
     (SELECT PNUM  FROM PROJ
      WHERE PTYPE IN
        (SELECT PTYPE  FROM PROJ
         WHERE PNUM IN
           (SELECT PNUM  FROM WORKS
            WHERE EMPNUM IN
              (SELECT EMPNUM  FROM WORKS
               WHERE PNUM IN
                 (SELECT PNUM   FROM PROJ))))));
   - SET GLOBAL OPTIMIZER_SWITCH = 'materialization=off,semijoin=off';
     SET GLOBAL optimizer_use_mrr = 'disable';
   than the result sets vary but they stay wrong.

My environment:
- mysql-azalea last modification 2009-05-28
  ./BUILD/compile-pentium64-debug-max
- Linux OpenSuSE 11.0 (64 Bit)

How to repeat:
I will upload test soon.
[4 Jun 2009 13:17] Matthias Leich
test script

Attachment: bug45191.test (application/octet-stream, text), 8.11 KiB.

[4 Jun 2009 13:18] Matthias Leich
expected results

Attachment: bug45191.result (application/octet-stream, text), 3.42 KiB.

[14 Jul 2009 22:15] Patrick Crews
NOTE:  optimizer_switch, engine_condition_pushdown, and optimizer_use_mrr settings do not affect this bug.
[14 Jul 2009 22:20] Patrick Crews
In azalea, the queries provided produce *many* duplicate rows in the result sets - 50+ copies of the valid values returned in 5.1.

Example:
Good result:
Alice
Bill
Cathy

Failing result:
Alice
Alice
Alice
...
Bill
Bill...
[1 Sep 2009 10:23] Evgeny Potemkin
Disabling semijoin, materialization and firstmatch in optimizer_switch hides the bug, so it's a semi-join bug.
[1 Sep 2009 14:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/82150

2863 Evgeny Potemkin	2009-09-01
      Bug#45191: Incorrectly initialized semi-join led to a wrong result.
      
      Current implementation of join cache allows combination with the FirstMatch
      semi-join strategy only when FirstMatch applied only to one table.
      But because of a typo in the FirstMatch initialization code, all tables in
      the semi-join except the first one was incorrectly marked as not belonging
      to the semi-join. This wrongly allowed join cached usage and led to a wrong
      result.
      
      The setup_semijoin_dups_elimination function now correctly marks all tables
      of a semi-join when the FirstMatch strategy is used.
     @ mysql-test/r/subselect_sj.result
        A test case added for the bug#45191.
     @ mysql-test/r/subselect_sj_jcl6.result
        A test case added for the bug#45191.
     @ mysql-test/t/subselect_sj.test
        A test case added for the bug#45191.
     @ sql/sql_select.cc
        Bug#45191: Incorrectly initialized semi-join led to a wrong result.
        
        The setup_semijoin_dups_elimination function now correctly marks all tables
        of a semi-join when the FirstMatch strategy is used.
[25 Sep 2009 7:09] Valeriy Kravchuk
Bug #46072 was marked as a duplicate of this one.
[9 Nov 2009 13:19] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/89804

3700 Evgeny Potemkin	2009-11-09 [merge]
      Auto-merged fix for the bug#45191.
[12 Nov 2009 13:32] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/90241

3702 Evgeny Potemkin	2009-11-12
      Test cases results corrected after fixing bug#45191.
[12 Nov 2009 14:32] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/90243

3707 Evgeny Potemkin	2009-11-12
      Test cases results corrected after fixing bug#45191.
[20 Nov 2009 12:56] Bugs System
Pushed into 6.0.14-alpha (revid:kostja@sun.com-20091120124947-yi6h2jbgw0kbciwm) (version source revid:epotemkin@mysql.com-20091112143112-zku0ct0gf5pwly54) (merge vers: 6.0.14-alpha) (pib:13)
[23 Nov 2009 20:20] Paul DuBois
Noted in 6.0.14 changelog.

With semijoin optimization enabled, incorrect FirstMatch
initialization allowed join cache usage and led to incorrect results.
[12 May 2010 12:36] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/108127

3166 Evgeny Potemkin	2010-05-12
      Bug#45191: Incorrectly initialized semi-join led to a wrong result.
      
      Current implementation of join cache allows combination with the FirstMatch
      semi-join strategy only when FirstMatch applied only to one table.
      But because of a typo in the FirstMatch initialization code, all tables in
      the semi-join except the first one was incorrectly marked as not belonging
      to the semi-join. This wrongly allowed join cached usage and led to a wrong
      result.
      
      The setup_semijoin_dups_elimination function now correctly marks all tables
      of a semi-join when the FirstMatch strategy is used.
      Origianl revid: epotemkin@mysql.com-20090901144435-7n2doi4s8yn5hz3l
     @ mysql-test/r/subselect_sj.result
        A test case added for the bug#45191.
     @ mysql-test/r/subselect_sj_jcl6.result
        A test case added for the bug#45191.
     @ mysql-test/t/subselect_sj.test
        A test case added for the bug#45191.
     @ sql/sql_select.cc
        Bug#45191: Incorrectly initialized semi-join led to a wrong result.
        
        The setup_semijoin_dups_elimination function now correctly marks all tables
        of a semi-join when the FirstMatch strategy is used.
[12 May 2010 15:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/108175

3169 Evgeny Potemkin	2010-05-12
      Test cases results corrected after fixing bug#45191.
[16 Aug 2010 6:34] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:23] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[20 Nov 2010 23:29] Paul DuBois
Noted in 5.6.1 changelog.
[23 Nov 2010 2:13] Paul DuBois
Correction: No 5.6.1 changelog entry. Bug does not appear in any released 5.6.x version.