Bug #58660 Index not used with a subquery
Submitted: 2 Dec 2010 14:46 Modified: 26 Mar 2012 19:32
Reporter: Michael Skulsky Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.0.91, 5.1.39, 5.0, 5.1, 5.6.1 OS:Linux
Assigned to: CPU Architecture:Any
Triage: Triaged: D3 (Medium)

[2 Dec 2010 14:46] Michael Skulsky
Description:
In the example below fk_index on tbl2is used, but fk_index on tbl1 is not.

How to repeat:
drop table if exists tbl1;
drop table if exists tbl2;

CREATE TABLE `tbl1` (
  `id` bigint(20) NOT NULL DEFAULT '0',
  `dummy` int,
  `fk` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `fk_index` (`fk`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ;

CREATE TABLE `tbl2` (
  `id` bigint(20) NOT NULL DEFAULT '0',
  `fk` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `fk_index` (`fk`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ;

CREATE TABLE `tbl3` (
  `id` bigint(20) NOT NULL DEFAULT '0',
  `fk` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `fk_index` (`fk`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ;

mysql> explain select * from tbl1 where tbl1.fk in (select id from tbl3 where tbl3.fk = 104027);
+----+--------------------+-------+-----------------+------------------+---------+---------+------+------+-------------+
| id | select_type        | table | type            | possible_keys    | key     | key_len | ref  | rows | Extra       |
+----+--------------------+-------+-----------------+------------------+---------+---------+------+------+-------------+
|  1 | PRIMARY            | tbl1  | ALL             | NULL             | NULL    | NULL    | NULL |    1 | Using where |
|  2 | DEPENDENT SUBQUERY | tbl3  | unique_subquery | PRIMARY,fk_index | PRIMARY | 8       | func |    1 | Using where |
+----+--------------------+-------+-----------------+------------------+---------+---------+------+------+-------------+
2 rows in set (0.00 sec)

mysql> explain select * from tbl2 where tbl2.fk in (select id from tbl3 where tbl3.fk = 104027);
+----+--------------------+-------+-----------------+------------------+----------+---------+------+------+--------------------------+
| id | select_type        | table | type            | possible_keys    | key      | key_len | ref  | rows | Extra                    |
+----+--------------------+-------+-----------------+------------------+----------+---------+------+------+--------------------------+
|  1 | PRIMARY            | tbl2  | index           | NULL             | fk_index | 9       | NULL |    1 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | tbl3  | unique_subquery | PRIMARY,fk_index | PRIMARY  | 8       | func |    1 | Using where              |
+----+--------------------+-------+-----------------+------------------+----------+---------+------+------+--------------------------+
2 rows in set (0.00 sec)
[2 Dec 2010 14:56] Valeriy Kravchuk
Please, provide dump of (minimal set of) data for every table used to demonstrate the problem.
[2 Dec 2010 14:59] Michael Skulsky
I can reproduce it with empty tables. The example I am giving is of course artificial.
With large tables in production where the problem was found behaviour is the same.
[2 Dec 2010 15:18] Valeriy Kravchuk
With empty tables all execution plans are more or less the same. Scanning index or entire table makes no real difference in this case.

Inconsistency is one problem, but plan that is less efficient than some other possible plan (by wallclock time of execution) is another problem, more serious. So, please, provide a complete test case that demonstrates the problem.
[2 Dec 2010 20:24] Michael Skulsky
These execution plans are NOT the same. The fk_index in tbl1 is not even listed in possible keys, so it will never be selected, no matter what data is in these tables. 
In production I have about 1 million of rows in tbl1, the subquery returns a matter of ~100 rows, the overall query result is no more than several rows, often empty. If the query is rewritten using inner join then the index is used and the query executes momentarily; in the form that I give execution time is a matter of seconds.
[2 Dec 2010 22:03] Sveta Smirnova
Thank you for the feedback.

Verified as described using sample data. Test case will be attached soon. MyISAM does not use index in both queries.
[2 Dec 2010 22:04] Sveta Smirnova
test case for the testsuite

Attachment: bug58660.test (application/octet-stream, text), 2.41 KiB.

[26 Mar 2012 19:32] Paul Dubois
Noted in 5.6.5 changelog.

Several subquery performance issues were resolved through the
implementation of semi-join subquery optimization strategies.