Bug #18826 Using subquery in "in (...)" clause forces full table scan
Submitted: 6 Apr 2006 0:19 Modified: 23 Jan 2014 9:46
Reporter: time e.less Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.0.16 OS:Linux (Debian Linux AMD 64)
Assigned to: CPU Architecture:Any

[6 Apr 2006 0:19] time e.less
Description:
mysql> explain select count(*) as n from digs where itemid in (select id from
       items where userid = 249778 and deleted = 0);
+----+--------------------+-------+---------+------+----------+--------------------------+
| id | select_type        | table | key     | ref  | rows     | Extra                    |
+----+--------------------+-------+---------+------+----------+--------------------------+
|  1 | PRIMARY            | digs  | item    | NULL | 14932299 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | items | PRIMARY | func |        1 | Using index; Using where |
+----+--------------------+-------+---------+------+----------+--------------------------+

mysql> explain select count(*) as n from digs d,items i where d.itemid = i.id
       and i.userid = 249778 and i.deleted = 0;
+----+-------------+-------+-----------+-----------------+------+--------------------------+
| id | select_type | table | key       | ref             | rows | Extra                    |
+----+-------------+-------+-----------+-----------------+------+--------------------------+
|  1 | SIMPLE      | i     | userindex | const           |    1 | Using where              |
|  1 | SIMPLE      | d     | item      | production.i.id |   30 | Using where; Using index |
+----+-------------+-------+-----------+-----------------+------+--------------------------+

For some reason the first version, using an in(...) clause forces a full table scan, even though the explain claims it'll use the index "item" on the "itemid" column in the "digs" table.

How to repeat:
Use a subquery in the in() clause in 5.0.16.

Suggested fix:
I don't know if it's too much to ask for subqueries inside in() clauses to use indexes for the part of the query on the outside.

I actually bet you'd have fixed this in 5.0.17-5.0.19, but couldn't find anything in the bug database about this type of problem.

There is another crashing bug introduced in 5.0.18 (or perhaps 5.0.17) with InnoDB that prevents us from upgrading to 5.0.18 (unsure if 5.0.19 will work for us as yet).
[6 Apr 2006 10:47] Valeriy Kravchuk
Thank you for a problem report. This is a well-known problem, that will be fixed eventually (in MySQL 5.2, I hope).  It is not fixed in 5.0.19, 5.0.21-BK etc. Most subqueries are optimized that way now, as correlated. See bug #12106, for example.
[22 Jan 2008 23:53] Sergey Petrunya
This subquery should be handled by new subquery optimizations in MySQL 6.0.

time e.less, if you still have the sample data, could you please check that 6.0 is faster now? Alternatively, you could upload the test dataset and we will make sure it will run fast in 6.0.
[4 Nov 2008 18:50] Valeriy Kravchuk
Please, check with MySQL 6.0.7 and inform about the results.
[5 Dec 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[23 Jan 2014 9:46] Roy Lyseng
This problem should be fixed in 5.6 with semi-join conversions.

If there is still problems, please re-open the bug report.