Bug #48623 Multiple subqueries are optimized incorrectly
Submitted: 9 Nov 2009 7:59 Modified: 22 Nov 2010 0:38
Reporter: Roy Lyseng Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:6.0 OS:Any
Assigned to: Roy Lyseng CPU Architecture:Any
Tags: optimizer_switch, semijoin, subquery

[9 Nov 2009 7:59] Roy Lyseng
Description:
When reviewing the fix for bug#45191, I noticed some suspicious-looking code in semijoin optimization, in routine setup_semijoin_dups_elimination().
After processing a semijoin strategy, the routine looks for a new semijoin. But the increment to the next table in the JOIN object seems to be wrong, so the next subquery will not be treated properly.

The following is the explain output from the repeat case:

+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
|  1 | PRIMARY     | w1    | ALL  | NULL          | NULL | NULL    | NULL |    5 |                                |
|  1 | PRIMARY     | w2    | ALL  | NULL          | NULL | NULL    | NULL |    5 | Using where; FirstMatch(w1)    |
|  1 | PRIMARY     | w3    | ALL  | NULL          | NULL | NULL    | NULL |    5 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
3 rows in set (1.82 sec)

The following is explain output when the "Suggested fix" modification is applied:

+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                       |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------+
|  1 | PRIMARY     | w1    | ALL  | NULL          | NULL | NULL    | NULL |    5 |                             |
|  1 | PRIMARY     | w2    | ALL  | NULL          | NULL | NULL    | NULL |    5 | Using where; FirstMatch(w1) |
|  1 | PRIMARY     | w3    | ALL  | NULL          | NULL | NULL    | NULL |    5 | Using where; FirstMatch(w2) |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------+
3 rows in set (36.94 sec)

How to repeat:
Use the following test case from subselect.test to notice the problem (explain has been added to the select statement):

# Item_cond fix field
#
create table t1(val varchar(10));
insert into t1 values ('aaa'), ('bbb'),('eee'),('mmm'),('ppp');
explain select count(*) from t1 as w1 where w1.val in (select w2.val from t1 as w2 where w2.val like 'm%') and w1.val in (select w3.val from t1 as w3 where w3.val like 'e%');
drop table t1;

Suggested fix:
Outline of fix:

=== modified file 'sql/sql_select.cc'
--- sql/sql_select.cc	2009-10-25 13:41:27 +0000
+++ sql/sql_select.cc	2009-10-28 16:49:09 +0000
@@ -1211,7 +1211,7 @@ int setup_semijoin_dups_elimination(JOIN
       case SJ_OPT_MATERIALIZE:
       case SJ_OPT_MATERIALIZE_SCAN:
         /* Do nothing */
-        i += pos->n_sj_tables;
+        i+= (pos->n_sj_tables-1);
         break;
       case SJ_OPT_LOOSE_SCAN:
       {
@@ -1227,7 +1227,7 @@ int setup_semijoin_dups_elimination(JOIN
         tab->loosescan_key_len= keylen;
         if (pos->n_sj_tables > 1) 
           tab[pos->n_sj_tables - 1].do_firstmatch= tab;
-        i += pos->n_sj_tables;
+        i+= (pos->n_sj_tables-1);
         break;
       }
       case SJ_OPT_DUPS_WEEDOUT:
@@ -1310,7 +1310,7 @@ int setup_semijoin_dups_elimination(JOIN
         join->join_tab[first_table].flush_weedout_table= sjtbl;
         join->join_tab[i + pos->n_sj_tables - 1].check_weed_out_table= sjtbl;
 
-        i += pos->n_sj_tables;
+        i+= (pos->n_sj_tables-1);
         break;
       }
       case SJ_OPT_FIRST_MATCH:
@@ -1327,7 +1327,7 @@ int setup_semijoin_dups_elimination(JOIN
           }
         }
         j[-1].do_firstmatch= jump_to;
-        i += pos->n_sj_tables;
+        i+= (pos->n_sj_tables-1);
         break;
       }
       case SJ_OPT_NONE:
[14 Dec 2009 15:20] MySQL Verification Team
Thank you for the bug report. Verified as described.
[22 Feb 2010 10:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/101036

2943 Roy Lyseng	2010-02-22
      Bug#48623: Multiple subqueries are optimized incorrectly
      
      The function setup_semijoin_dups_elimination() has a major loop that
      goes through every table in the JOIN object. Usually, there is a normal
      "plus one" increment in the for loop that implements this, but each semijoin
      nest is treated as one entity and there is another increment that skips past
      the semijoin nest to the next table in the JOIN object. However, when
      combining these two increments, the next joined table is skipped, and if that
      happens to be the start of another semijoin nest, the correct processing
      for that nest will not be carried out.
      
        mysql-test/r/subselect_sj.result
          Added test results for bug#48623 
        mysql-test/r/subselect_sj_jcl6.result
          Added test results for bug#48623
        mysql-test/t/subselect_sj.test
          Added test case for bug#48623
        sql/sql_select.cc
          Omitted the "plus one" increment in the for loop, added "plus one"
          in the remaining switch case, fixed coding style issue in remaining
          increment operations.
[22 Feb 2010 14:05] Guilhem Bichot
approved with comments sent by mail.
[2 Mar 2010 15:06] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/102065

2988 Roy Lyseng	2010-03-02
      Bug#48623: Multiple subqueries are optimized incorrectly
      
      The function setup_semijoin_dups_elimination() has a major loop that
      goes through every table in the JOIN object. Usually, there is a normal
      "plus one" increment in the for loop that implements this, but each semijoin
      nest is treated as one entity and there is another increment that skips past
      the semijoin nest to the next table in the JOIN object. However, when
      combining these two increments, the next joined table is skipped, and if that
      happens to be the start of another semijoin nest, the correct processing
      for that nest will not be carried out.
            
        mysql-test/r/subselect_sj.result
          Added test results for bug#48623 
        mysql-test/r/subselect_sj_jcl6.result
          Added test results for bug#48623
        mysql-test/t/subselect_sj.test
          Added test case for bug#48623
        sql/sql_select.cc
          Omitted the "plus one" increment in the for loop, added "plus one"
          in the remaining switch case, fixed coding style issue in remaining
          increment operations.
[8 Mar 2010 9:19] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/102543

3798 Roy Lyseng	2010-03-08
      Bug#48623: Multiple subqueries are optimized incorrectly
            
      The function setup_semijoin_dups_elimination() has a major loop that
      goes through every table in the JOIN object. Usually, there is a normal
      "plus one" increment in the for loop that implements this, but each semijoin
      nest is treated as one entity and there is another increment that skips past
      the semijoin nest to the next table in the JOIN object. However, when
      combining these two increments, the next joined table is skipped, and if that
      happens to be the start of another semijoin nest, the correct processing
      for that nest will not be carried out.
                  
      mysql-test/r/subselect_sj.result
        Added test results for bug#48623 
      mysql-test/r/subselect_sj_jcl6.result
        Added test results for bug#48623
      mysql-test/t/subselect_sj.test
        Added test case for bug#48623
      sql/sql_select.cc
        Omitted the "plus one" increment in the for loop, added "plus one"
        in the remaining switch case, fixed coding style issue in remaining
        increment operations.
[10 Mar 2010 13:37] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100310133305-0jdlngbtrkoqzckh) (version source revid:alik@sun.com-20100310132404-uqarl0o0vlra2kjx) (merge vers: 6.0.14-alpha) (pib:16)
[13 Mar 2010 23:38] Paul DuBois
Noted in 6.0.14 changelog.

With semijoin optimization enabled, incorrect processing of semijoin
nests resulted in incorrect query results.
[7 May 2010 11:32] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107732

3146 Roy Lyseng	2010-05-07
      Bug#48623: Multiple subqueries are optimized incorrectly
                  
      The function setup_semijoin_dups_elimination() has a major loop that
      goes through every table in the JOIN object. Usually, there is a normal
      "plus one" increment in the for loop that implements this, but each semijoin
      nest is treated as one entity and there is another increment that skips past
      the semijoin nest to the next table in the JOIN object. However, when
      combining these two increments, the next joined table is skipped, and if that
      happens to be the start of another semijoin nest, the correct processing
      for that nest will not be carried out.
                        
      mysql-test/r/subselect_sj.result
        Added test results for bug#48623 
      mysql-test/r/subselect_sj_jcl6.result
        Added test results for bug#48623
      mysql-test/t/subselect_sj.test
        Added test case for bug#48623
      sql/sql_select.cc
        Omitted the "plus one" increment in the for loop, added "plus one"
        in the remaining switch case, fixed coding style issue in remaining
        increment operations.
      
      revid:wlad@sun.com-20100308002642-oy9pm53isc8btdrf..roy.lyseng@sun.com-20100308091901-vyfz5hap2ic743py
[16 Aug 2010 6:39] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:19] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[22 Nov 2010 0:38] Paul DuBois
Noted in 5.6.1 changelog.
[23 Nov 2010 2:17] Paul DuBois
Correction: No 5.6.1 changelog entry. Bug does not appear in any released 5.6.x version.