Bug #35032 Wrong execution plan for uncorrelated subquery when semijoin is applicable
Submitted: 4 Mar 2008 11:44 Modified: 3 May 2008 3:58
Reporter: Alexey Stroganov Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:6.0.4 OS:Any
Assigned to: Assigned Account CPU Architecture:Any

[4 Mar 2008 11:44] Alexey Stroganov
Description:
Execution of following subquery should produce the similar plan as equivalent join query but it looks like something wrong happens during processing of semi-joins and we got different table order for this subquery.

explain extended  select count(id) from ot where ot.expr_key IN (select it.expr_key from it where it.id < 5000) and ot.id<300000;
+----+-------------+-------+-------+------------------+----------+---------+-------------------+--------+----------+----------------------------------+
| id | select_type | table | type  | possible_keys    | key      | key_len | ref               | rows   | filtered | Extra                            |
+----+-------------+-------+-------+------------------+----------+---------+-------------------+--------+----------+----------------------------------+
|  1 | PRIMARY     | ot    | range | PRIMARY,expr_key | PRIMARY  | 4       | NULL              | 302167 |   100.00 | Using index condition; Using MRR |
|  1 | PRIMARY     | it    | ref   | PRIMARY,expr_key | expr_key | 4       | subq2.ot.expr_key |      5 |   100.00 | Using where; FirstMatch(ot)      |
+----+-------------+-------+-------+------------------+----------+---------+-------------------+--------+----------+----------------------------------+
2 rows in set, 1 warning (0.00 sec)

mysql>  explain extended select count(ot.id) from ot,it where ot.expr_key = it.expr_key  and it.id < 5000 and ot.id<300000;
+----+-------------+-------+-------+------------------+----------+---------+-------------------+------+----------+----------------------------------+
| id | select_type | table | type  | possible_keys    | key      | key_len | ref               | rows | filtered | Extra                            |
+----+-------------+-------+-------+------------------+----------+---------+-------------------+------+----------+----------------------------------+
|  1 | SIMPLE      | it    | range | PRIMARY,expr_key | PRIMARY  | 4       | NULL              | 5036 |   100.00 | Using index condition; Using MRR |
|  1 | SIMPLE      | ot    | ref   | PRIMARY,expr_key | expr_key | 4       | subq2.it.expr_key |    1 |   100.00 | Using where                      |
+----+-------------+-------+-------+------------------+----------+---------+-------------------+------+----------+----------------------------------+
2 rows in set, 1 warning (0.00 sec)

How to repeat:
 - run attached generator/loader script that will generate and load necessary data

perl subquery-datagen.pl --generate --ot-rows=10000000 --load-data --database=test --user=root --socket=/tmp/mysql.sock

 - execute subquery:  explain extended  select count(id) from ot where ot.expr_key IN (select it.expr_key from it where it.id < 5000) and ot.id<300000;

 - execute join: explain extended select count(ot.id) from ot,it where ot.expr_key = it.expr_key  and it.id < 5000 and ot.id<300000;
[4 Mar 2008 11:51] Alexey Stroganov
Datagen/loader tool for test case

Attachment: subquery-datagen.pl (application/x-perl, text), 7.97 KiB.

[4 Mar 2008 12:12] Alexey Stroganov
just found out that for smaller dataset order of tables looks correct:

perl subquery-datagen.pl --generate --ot-rows=100000 --load-data --database=test
--user=test --socket=/tmp/mysql.sock

mysql> explain select count(id) from ot where ot.expr_key IN (select it.expr_key from it where it.id < 5000) and ot.id<300000;
+----+-------------+-------+------+------------------+----------+---------+-------------------+--------+------------------------------+
| id | select_type | table | type | possible_keys    | key      | key_len | ref               | rows   | Extra                        |
+----+-------------+-------+------+------------------+----------+---------+-------------------+--------+------------------------------+
|  1 | PRIMARY     | it    | ALL  | PRIMARY,expr_key | NULL     | NULL    | NULL              | 500000 | Using where; Start temporary |
|  1 | PRIMARY     | ot    | ref  | PRIMARY,expr_key | expr_key | 4       | subq3.it.expr_key |      1 | Using where; End temporary   |
+----+-------------+-------+------+------------------+----------+---------+-------------------+--------+------------------------------+
2 rows in set (0.00 sec)

mysql> explain select count(ot.id) from ot,it where ot.expr_key = it.expr_key  and it.id < 5000 and ot.id<300000;
+----+-------------+-------+-------+------------------+----------+---------+-------------------+------+----------------------------------+
| id | select_type | table | type  | possible_keys    | key      | key_len | ref               | rows | Extra                            |
+----+-------------+-------+-------+------------------+----------+---------+-------------------+------+----------------------------------+
|  1 | SIMPLE      | it    | range | PRIMARY,expr_key | PRIMARY  | 4       | NULL              | 5123 | Using index condition; Using MRR |
|  1 | SIMPLE      | ot    | ref   | PRIMARY,expr_key | expr_key | 4       | subq3.it.expr_key |    1 | Using where                      |
+----+-------------+-------+-------+------------------+----------+---------+-------------------+------+----------------------------------+
2 rows in set (0.00 sec)
[18 Mar 2008 10:34] Alexey Stroganov
Updated version of datagen script

Attachment: subquery-datagen.pl (application/x-perl, text), 7.99 KiB.

[27 Mar 2008 18:10] Susanne Ebrecht
Verified as described with:
6.0.5-alpha-debug

I only used 10^6 rows at the perl test because with 10^7 rows I just got the message: "Not enough disk space". But it occurs here also with 10^6 rows.
[2 May 2008 4:51] Sergey Petrunya
With the latest mysql-6.0-opt, tip cset 

ChangeSet@1.2633, 2008-05-02 02:41:35+04:00, sergefp@mysql.com +3 -0

(where we had a lot of subquery bugs fixed), I get 

mysql> explain  select count(ot.id) from ot,it where ot.expr_key = it.expr_key  and it.id < 5000 and ot.id<300000\G

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: it
         type: range
possible_keys: PRIMARY,expr_key
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 5036
        Extra: Using index condition; Using MRR
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: ot
         type: ref
possible_keys: PRIMARY,expr_key
          key: expr_key
      key_len: 4
          ref: bug35032.it.expr_key
         rows: 1
        Extra: Using where
2 rows in set (0.00 sec)

It seems to be fixed by the recent subquery fixes. Feel free to reopen if it shows up again on a recent tree.
[2 May 2008 4:57] Sergey Petrunya
Sorry, the last comment didn't make much sense, try again:

With the latest mysql-6.0-opt, tip cset 

ChangeSet@1.2633, 2008-05-02 02:41:35+04:00, sergefp@mysql.com +3 -0

(where we had a lot of subquery bugs fixed), I get 

mysql> explain extended  select count(id) from ot where ot.expr_key IN (select it.expr_key from it where it.id < 5000) and ot.id<300000\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: it
         type: range
possible_keys: PRIMARY,expr_key
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 5036
     filtered: 75.00
        Extra: Using index condition; Using MRR; Start temporary
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: ot
         type: ref
possible_keys: PRIMARY,expr_key
          key: expr_key
      key_len: 4
          ref: bug35032.it.expr_key
         rows: 1
     filtered: 100.00
        Extra: Using where; End temporary
2 rows in set, 1 warning (0.00 sec)
[3 May 2008 3:58] Sergey Petrunya
Fixed by fix for BUG#35674. Changing status to Duplicate.