Bug #76685 mysql does not use index for outer query when using subquery
Submitted: 14 Apr 2015 7:53 Modified: 15 Apr 2015 22:52
Reporter: Malte Schirmacher Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:5.6.22 OS:Linux
Assigned to: MySQL Verification Team CPU Architecture:Any

[14 Apr 2015 7:53] Malte Schirmacher
Description:
We have the following table:

CREATE TABLE `monitoring_check_results` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `timestamp` int(11) NOT NULL,
  `entity_table` varchar(128) NOT NULL,
  `entity_id` int(11) NOT NULL,
  `check_type_id` int(11) NOT NULL,
  `state` tinyint(4) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `timestamp` (`timestamp`),
  KEY `entity_type_timestamp_idx` (`entity_table`,`entity_id`,`check_type_id`,`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

And following query:

SELECT * 
FROM monitoring_check_results
WHERE id
IN (
SELECT MAX( id ) 
FROM monitoring_check_results
WHERE entity_table =  'clients'
AND entity_id = 4711
AND state != -1
GROUP BY check_type_id
)

Unfortunately the outer query does not use the PRIMARY-index as the EXPLAIN query shows:
+----+-------------+--------------------------+------+---------------------------+---------------------------+---------+-------------+---------+------------------------------------+
| id | select_type | table                    | type | possible_keys             | key                       | key_len | ref         | rows    | Extra                              |
+----+-------------+--------------------------+------+---------------------------+---------------------------+---------+-------------+---------+------------------------------------+
|  1 | PRIMARY     | monitoring_check_results | ALL  | NULL                      | NULL                      | NULL    | NULL        | 2698822 | Using where                        |
|  2 | SUBQUERY    | monitoring_check_results | ref  | entity_type_timestamp_idx | entity_type_timestamp_idx | 134     | const,const |     379 | Using index condition; Using where |
+----+-------------+--------------------------+------+---------------------------+---------------------------+---------+-------------+---------+------------------------------------+

Even forcing the use of the index doesn't make Mysql use the index.

If i query some ids manually it will use the index again:

explain select * from monitoring_check_results FORCE INDEX(PRIMARY) where id in (2867414, 3046089, 3046090, 2897199);
+----+-------------+--------------------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table                    | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+--------------------------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | monitoring_check_results | range | PRIMARY       | PRIMARY | 4       | NULL |    4 | Using where |
+----+-------------+--------------------------+-------+---------------+---------+---------+------+------+-------------+

How to repeat:
Create the above mentioned table, fill in some random data and execute the query :-)
[15 Apr 2015 19:32] MySQL Verification Team
9021 is fixed in 5.6 and 5.7 (5.1 and 5.5 affected):

./use_all "
> USE test;
> DROP TABLE IF EXISTS test.table1;
> DROP TABLE IF EXISTS test.table2;
> CREATE TABLE test.table1 (
>   ID int(11) NOT NULL,
>   NAME varchar(10) default NULL,
>   PRIMARY KEY  (ID)
> ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
>
> CREATE TABLE test.table2 (
>   ID int(11) NOT NULL,
>   NAME varchar(10) default NULL,
>   PRIMARY KEY  (ID)
> ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
>
> INSERT INTO test.table1 VALUES(1,'A');
> INSERT INTO test.table1 VALUES(2,'B');
> INSERT INTO test.table1 VALUES(3,'C');
>
> INSERT INTO test.table2 VALUES(1,'A');
> INSERT INTO test.table2 VALUES(2,'B');
> "
[arhimed@gedora sandboxes]$ ./use_all "
> USE test;
> SELECT @@version;
> EXPLAIN SELECT * FROM table1 WHERE ID IN (SELECT ID FROM table2 WHERE ID=2);
> "
@@version
5.1.73
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY table1  ALL     NULL    NULL    NULL    NULL    3       Using where
2       DEPENDENT SUBQUERY      table2  const   PRIMARY PRIMARY 4       const   1       Using index
@@version
5.5.43
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY table1  ALL     NULL    NULL    NULL    NULL    3       Using where
2       DEPENDENT SUBQUERY      table2  const   PRIMARY PRIMARY 4       const   1       Using index
@@version
5.6.24
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       SIMPLE  table1  const   PRIMARY PRIMARY 4       const   1       NULL
1       SIMPLE  table2  const   PRIMARY PRIMARY 4       const   1       Using index
@@version
5.7.7-rc
id      select_type     table   partitions      type    possible_keys   key     key_len ref     rows    filtered        Extra
1       SIMPLE  table1  NULL    const   PRIMARY PRIMARY 4       const   1       100.00  NULL
1       SIMPLE  table2  NULL    const   PRIMARY PRIMARY 4       const   1       100.00  Using index
[15 Apr 2015 22:07] MySQL Verification Team
on the other hand if the subquery uses aggregate functions then outer query is not using the index

$ ./use_all "
USE test;
SELECT @@version;
EXPLAIN SELECT * FROM table1 WHERE ID IN (SELECT ID FROM table2);
"
@@version
5.1.73
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY table1  ALL     NULL    NULL    NULL    NULL    3       Using where
2       DEPENDENT SUBQUERY      table2  unique_subquery PRIMARY PRIMARY 4       func    1       Using index
@@version
5.5.43
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY table1  ALL     NULL    NULL    NULL    NULL    3       Using where
2       DEPENDENT SUBQUERY      table2  unique_subquery PRIMARY PRIMARY 4       func    1       Using index
@@version
5.6.24
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       SIMPLE  table2  index   PRIMARY PRIMARY 4       NULL    2       Using index
1       SIMPLE  table1  eq_ref  PRIMARY PRIMARY 4       test.table2.ID  1       NULL
@@version
5.7.7-rc
id      select_type     table   partitions      type    possible_keys   key     key_len ref     rows    filtered        Extra
1       SIMPLE  table2  NULL    index   PRIMARY PRIMARY 4       NULL    2       100.00  Using index
1       SIMPLE  table1  NULL    eq_ref  PRIMARY PRIMARY 4       test.table2.ID  1       100.00  NULL

$ ./use_all "
USE test;
SELECT @@version;
EXPLAIN SELECT * FROM table1 WHERE ID IN (SELECT MAX(ID) FROM table2);
"
@@version
5.1.73
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY table1  ALL     NULL    NULL    NULL    NULL    3       Using where
2       DEPENDENT SUBQUERY      NULL    NULL    NULL    NULL    NULL    NULL    NULL    Select tables optimized away
@@version
5.5.43
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY table1  ALL     NULL    NULL    NULL    NULL    3       Using where
2       DEPENDENT SUBQUERY      NULL    NULL    NULL    NULL    NULL    NULL    NULL    Select tables optimized away
@@version
5.6.24
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY table1  ALL     NULL    NULL    NULL    NULL    3       Using where
2       DEPENDENT SUBQUERY      NULL    NULL    NULL    NULL    NULL    NULL    NULL    Select tables optimized away
@@version
5.7.7-rc
id      select_type     table   partitions      type    possible_keys   key     key_len ref     rows    filtered        Extra
1       PRIMARY table1  NULL    ALL     NULL    NULL    NULL    NULL    3       100.00  Using where
2       DEPENDENT SUBQUERY      NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    Select tables optimized away
[15 Apr 2015 22:50] MySQL Verification Team
Hi Malte,

If you look at the manual:
https://dev.mysql.com/doc/refman/5.6/en/subquery-optimization.html

The subquery optimization in order to perform semi-join transformations requires subquery to :

[quote]
...
* It must be a single SELECT without UNION constructs.

* It must not contain a GROUP BY or HAVING clause or aggregate functions.
...
[/quote]

Now for Subquery Materialization you need to have materialize optimization on and you need to exec (after EXPLAIN EXTENDED) the SHOW WARNINGS in order to see if subquery materialization optimization is going to be used. For your test case you can see it is on.

> select @@version, @@optimizer_switch\G
*************************** 1. row ***************************
         @@version: 5.6.24
@@optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,subquery_materialization_cost_based=on,use_index_extensions=on
1 row in set (0.00 sec)

> EXPLAIN EXTENDED
    -> SELECT * FROM monitoring_check_results WHERE id IN (
    ->   SELECT MAX( id ) FROM monitoring_check_results WHERE entity_table = 'clients' AND entity_id = 4711 AND state != -1 GROUP BY check_type_id
    -> )\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: monitoring_check_results
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 45
     filtered: 100.00
        Extra: Using where
*************************** 2. row ***************************
           id: 2
  select_type: SUBQUERY
        table: monitoring_check_results
         type: ref
possible_keys: entity_type_timestamp_idx
          key: entity_type_timestamp_idx
      key_len: 134
          ref: const,const
         rows: 19
     filtered: 100.00
        Extra: Using index condition; Using where
2 rows in set, 1 warning (0.00 sec)

> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select `test`.`monitoring_check_results`.`id` AS `id`,`test`.`monitoring_check_results`.`timestamp` AS `timestamp`,`test`.`monitoring_check_results`.`entity_table` AS `entity_table`,`test`.`monitoring_check_results`.`entity_id` AS `entity_id`,`test`.`monitoring_check_results`.`check_type_id` AS `check_type_id`,`test`.`monitoring_check_results`.`state` AS `state` from `test`.`monitoring_check_results` where <in_optimizer>(`test`.`monitoring_check_results`.`id`,`test`.`monitoring_check_results`.`id` in ( <materialize> (/* select#2 */ select max(`test`.`monitoring_check_results`.`id`) from `test`.`monitoring_check_results` where ((`test`.`monitoring_check_results`.`entity_id` = 4711) and (`test`.`monitoring_check_results`.`entity_table` = 'clients') and (`test`.`monitoring_check_results`.`state` <> <cache>(-(1)))) group by `test`.`monitoring_check_results`.`check_type_id` having 1 ), <primary_index_lookup>(`test`.`monitoring_check_results`.`id` in <temporary table> on <auto_key> where ((`test`.`monitoring_check_results`.`id` = `materialized-subquery`.`MAX( id )`)))))
1 row in set (0.00 sec)

You see that both subquery type moved from dependant subquery to subquery and that you see that materialized optimization is going to be used.

So I don't see this as a bug, your query is performing as described in the manual

all best
Bogdan