Bug #19895 No optimization of where clause subqueries rewritable as joins
Submitted: 17 May 2006 22:41 Modified: 10 Jan 2013 13:14
Reporter: Michael Del Monte Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.0.20, 6.0.7, 6.0.14 OS:Windows (Windows XP)
Assigned to: CPU Architecture:Any

[17 May 2006 22:41] Michael Del Monte
Description:
MySQL fails to optimize WHERE clause subqueries that could be rewritten as inner joins to temp tables of primary keys.  This results in extremely poor performance unless the subquery is explicitly rewritten as a join.

How to repeat:
On any table t having indexed column x, queries of the following type will have very poor performance, and EXPLAIN will show that both primary query and subquery use table scans, even though optimization should (at best) remove the subquery entirely and (at worst) reduce it to a temp table with a ref or eq_ref join:

select * from t where x in (select x from t group by x)
select * from t where x in (select distinct x from t)
select * from t where x in (select x from t)
select * from t where exists (select x from t as t2 where t.x=t2.x)

All of these should reduce to "select * from t"

Rewriting as an explicit join provides nearly the level of desired performance.   Although MySQL still fails to remove the subquery entirely, it does perform the join correctly:

select * from t inner join (select x from t group by x) t2 on t.x=t2.x

Suggested fix:
Improve the optimizer to detect reducible subqueries.
[18 May 2006 5:22] Valeriy Kravchuk
Thank you for a problem report. It is a well-known problem (no subqueries rewrite, bad determination if subqueries are dependent or not etc.). Work in progress. Will be fixed some day.
[20 Oct 2006 9:12] Sveta Smirnova
There is similar bug #23370
[4 Nov 2008 18:48] Valeriy Kravchuk
We still have this problem in 6.0.7 (to some extent):

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -P3311 -uroot -proot test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 17
Server version: 6.0.7-alpha-community MySQL Community Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> drop table t;
Query OK, 0 rows affected (0.06 sec)

mysql> create table t (id int primary key, x int, key(x)) engine=MyISAM;
Query OK, 0 rows affected (0.05 sec)

mysql> insert into t values (1,1), (2,2), (3,3);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> explain select * from t where x in (select x from t group by x)\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: t
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3
        Extra: Using where
*************************** 2. row ***************************
           id: 2
  select_type: SUBQUERY
        table: t
         type: index
possible_keys: NULL
          key: x
      key_len: 5
          ref: NULL
         rows: 3
        Extra: Using index
2 rows in set (0.05 sec)

mysql> explain select * from t where x in (select x from t)\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: t
         type: index
possible_keys: x
          key: x
      key_len: 5
          ref: NULL
         rows: 3
        Extra: Using index; LooseScan
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: t
         type: ALL
possible_keys: x
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3
        Extra: Using where; Using join buffer
2 rows in set (0.01 sec)

mysql> explain select * from t where exists (select x from t as t2 where t.x=t2.
x)\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: t
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3
        Extra: Using where
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: t2
         type: ref
possible_keys: x
          key: x
      key_len: 5
          ref: test.t.x
         rows: 2
        Extra: Using index
2 rows in set (0.00 sec)
[10 Dec 2009 15:32] Valeriy Kravchuk
Only last case is still NOT optimized any better with 6.0.14 from bzr:

77-52-7-73:6.0-codebase openxs$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 6.0.14-alpha-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> drop table if exists t;
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> create table t (id int primary key, x int, key(x)) engine=MyISAM;
Query OK, 0 rows affected (0.08 sec)

mysql> insert into t values (1,1), (2,2), (3,3);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> explain select * from t where x in (select x from t group by x);
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
|  1 | PRIMARY     | t     | ALL   | NULL          | NULL | NULL    | NULL |    3 | Using where |
|  2 | SUBQUERY    | t     | index | NULL          | x    | 5       | NULL |    3 | Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
2 rows in set (0.03 sec)

mysql> explain select * from t where x in (select x from t);
+----+-------------+-------+-------+---------------+------+---------+----------+------+------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref      | rows | Extra                  |
+----+-------------+-------+-------+---------------+------+---------+----------+------+------------------------+
|  1 | PRIMARY     | t     | index | x             | x    | 5       | NULL     |    3 | Using index; LooseScan |
|  1 | PRIMARY     | t     | ref   | x             | x    | 5       | test.t.x |    2 |                        |
+----+-------------+-------+-------+---------------+------+---------+----------+------+------------------------+
2 rows in set (0.00 sec)

mysql> explain select * from t where exists (select x from t as t2 where t.x=t2.x);
+----+--------------------+-------+------+---------------+------+---------+----------+------+-------------+
| id | select_type        | table | type | possible_keys | key  | key_len | ref      | rows | Extra       |
+----+--------------------+-------+------+---------------+------+---------+----------+------+-------------+
|  1 | PRIMARY            | t     | ALL  | NULL          | NULL | NULL    | NULL     |    3 | Using where |
|  2 | DEPENDENT SUBQUERY | t2    | ref  | x             | x    | 5       | test.t.x |    2 | Using index |
+----+--------------------+-------+------+---------------+------+---------+----------+------+-------------+
2 rows in set (0.00 sec)
[10 Jan 2013 13:14] Erlend Dahl
Implemented in 5.6.8.