Bug #43052 not optimalised (sub)query, = versus IN
Submitted: 20 Feb 2009 9:49 Modified: 20 Feb 2009 11:21
Reporter: Hans Ginzel Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.30-community, 4.1, 5.0, 5.1, 6.0 bzr OS:Any (MS Windows XP, Linux)
Assigned to: CPU Architecture:Any
Tags: equal, IN, INDEX, optimise

[20 Feb 2009 9:49] Hans Ginzel
Description:
See the differences in EXPLAIN. In the second case subquery is not analysed and primary index is not used.

--------------
EXPLAIN SELECT Id, Time_Stamp FROM t WHERE Id  = (SELECT MAX(Id) FROM t)
--------------

+----+-------------+-------+-------+---------------+---------+---------+-------+------+------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | Extra                        |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+------------------------------+
|  1 | PRIMARY     | t     | const | PRIMARY       | PRIMARY | 4       | const |    1 |                              | 
|  2 | SUBQUERY    | NULL  | NULL  | NULL          | NULL    | NULL    | NULL  | NULL | Select tables optimized away | 
+----+-------------+-------+-------+---------------+---------+---------+-------+------+------------------------------+

EXPLAIN SELECT Id, Time_Stamp FROM t WHERE Id IN (SELECT MAX(Id) FROM t)

+----+--------------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type        | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+--------------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | PRIMARY            | t     | ALL  | NULL          | NULL | NULL    | NULL |    2 | Using where                  | 
|  2 | DEPENDENT SUBQUERY | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away | 
+----+--------------------+-------+------+---------------+------+---------+------+------+------------------------------+

How to repeat:
DROP TABLE IF EXISTS t;
CREATE TABLE t (
	Id		int UNSIGNED	PRIMARY KEY,
	Time_Stamp	timestamp
);

INSERT INTO t (Id) VALUES (1), (2);
SELECT * FROM t;

EXPLAIN SELECT Id, Time_Stamp FROM t WHERE Id  = (SELECT MAX(Id) FROM t);
EXPLAIN SELECT Id, Time_Stamp FROM t WHERE Id IN (SELECT MAX(Id) FROM t);

Suggested fix:
In queries with the IN operator:

1) Simple subqueries would be analysed (estimation of records count).

2) Index should be used.
[20 Feb 2009 11:21] Sveta Smirnova
Thank you for the report.

Verified as described.