| Bug #24972 | Join and sub-query execution differences | ||
|---|---|---|---|
| Submitted: | 11 Dec 2006 21:55 | Modified: | 5 Sep 2012 16:35 |
| Reporter: | Jim Bullington | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
| Version: | 5.0.27, 6.0.14 | OS: | Any (Windows) |
| Assigned to: | CPU Architecture: | Any | |
[11 Dec 2006 22:45]
MySQL Verification Team
Thank you for the bug report. This is a well known problem with sub-queries. Will be fixed in MySQL 5.2 (work in progress already).
[13 Dec 2006 21:11]
Jim Bullington
My goodness!! 5.2 is a long ways off! Does the commercial MySQL version suffer from the same problem?
[30 Nov 2009 19:28]
Valeriy Kravchuk
The problem is, actually, even worse with 6.0.14:
mysql> select b
-> from t1
-> where a in (select c from t2 where d = 1000)
-> order by b;
+------+
| b |
+------+
| 1517 |
| 2187 |
| 2417 |
| 2556 |
| 2644 |
| 2796 |
| 3106 |
| 3220 |
| 3785 |
| 4782 |
| 5975 |
| 8733 |
+------+
12 rows in set (8 min 11.35 sec)
mysql> explain select b
-> from t1
-> where a in (select c from t2 where d = 1000)
-> order by b;
+----+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
| 1 | PRIMARY | t1 | ALL | PRIMARY | NULL | NULL | NULL | 1001 | Using temporary; Using filesort |
| 1 | PRIMARY | t2 | ALL | NULL | NULL | NULL | NULL | 100001 | Using where; FirstMatch(t1) |
+----+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
2 rows in set (0.00 sec)
mysql> select version();
+--------------------+
| version() |
+--------------------+
| 6.0.14-alpha-debug |
+--------------------+
1 row in set (0.00 sec)
mysql> explain select b
-> from t1
-> join t2 on t2.c = t1.a
-> where d = 1000
-> order by b;
+----+-------------+-------+--------+---------------+---------+---------+-----------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+-----------+--------+----------------------------------------------+
| 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 100001 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | t1 | eq_ref | PRIMARY | PRIMARY | 4 | test.t2.c | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+-----------+--------+----------------------------------------------+
2 rows in set (0.01 sec)
mysql> select b from t1 join t2 on t2.c = t1.a where d = 1000 order by b;
+------+
| b |
+------+
| 1517 |
| 2187 |
| 2417 |
| 2556 |
| 2644 |
| 2796 |
| 3106 |
| 3220 |
| 3785 |
| 4782 |
| 5975 |
| 8733 |
+------+
12 rows in set (0.07 sec)
[5 Sep 2012 16:35]
Paul DuBois
Fixed with subquery optimizations in 5.6.5.

Description: There appears to be a large differential between a join query and a sub-query that produce equivalent results: t1 has 1000 records, t2 has 100000 records select b from t1 join t2 on t2.c = t1.a where d = 1000 order by b; Completes in 0.1 seconds However, select b from t1 where a in (select c from t2 where d = 1000) order by b; Completes in 44 seconds Should there be such a large performance difference between these 2 equivalent queries? How to repeat: create table t1 ( a int, b int, primary key(a) ); delimiter $ create procedure build_t1(p1 int) begin set @x = 0; repeat set @x = @x + 1; insert into t1 values(@x, floor(1+(rand()*10000))); until @x > p1 end repeat; end $ delimiter ; call build_t1(1000); create table t2 ( a int, c int, d int, primary key(a) ); delimiter $ create procedure build_t2(p1 int) begin set @x = 0; repeat set @x = @x + 1; insert into t2 values(@x, floor(1+(rand()*1000)), floor(1+(rand()*10000))); until @x > p1 end repeat; end $ delimiter ; call build_t2(100000); select b from t1 join t2 on t2.c = t1.a where d = 1000 order by b; select b from t1 where a in (select c from t2 where d = 1000) order by b;