Bug #20199 | Join slower than exists | ||
---|---|---|---|
Submitted: | 1 Jun 2006 13:11 | Modified: | 8 Dec 2009 16:04 |
Reporter: | Andre Timmer | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
Version: | 5.0.18, 6.0.14 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[1 Jun 2006 13:11]
Andre Timmer
[1 Jun 2006 17:41]
Jorge del Conde
Thanks for your bug report. I tested this under FC5: mysql> select count(*) from tab1; +----------+ | count(*) | +----------+ | 529911 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from tab2; +----------+ | count(*) | +----------+ | 529910 | +----------+ 1 row in set (0.00 sec) mysql> mysql> explain -> select count(*) -> from tab1 aa -> , tab2 bb -> where aa.id = bb.id; +----+-------------+-------+--------+---------------+---------+---------+------------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+---------------+---------+---------+------------+--------+-------------+ | 1 | SIMPLE | bb | index | PRIMARY | PRIMARY | 4 | NULL | 529910 | Using index | | 1 | SIMPLE | aa | eq_ref | PRIMARY | PRIMARY | 4 | test.bb.id | 1 | Using index | +----+-------------+-------+--------+---------------+---------+---------+------------+--------+-------------+ 2 rows in set (0.00 sec) mysql> explain -> select count(*) -> from tab1 aa -> where exists ( -> select '' -> from tab2 bb -> where aa.id = bb.id -> ); +----+--------------------+-------+--------+---------------+---------+---------+------------+--------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-------+--------+---------------+---------+---------+------------+--------+--------------------------+ | 1 | PRIMARY | aa | index | NULL | PRIMARY | 4 | NULL | 613689 | Using where; Using index | | 2 | DEPENDENT SUBQUERY | bb | eq_ref | PRIMARY | PRIMARY | 4 | test.aa.id | 1 | Using index | +----+--------------------+-------+--------+---------------+---------+---------+------------+--------+--------------------------+ 2 rows in set (0.01 sec) mysql> mysql> select count(*) -> from tab1 aa -> , tab2 bb -> where aa.id = bb.id; +----------+ | count(*) | +----------+ | 621794 | +----------+ 1 row in set (7.06 sec) mysql> select count(*) -> from tab1 aa -> where exists ( -> select '' -> from tab2 bb -> where aa.id = bb.id -> ); +----------+ | count(*) | +----------+ | 718545 | +----------+ 1 row in set (8.51 sec) mysql>
[1 Jun 2006 17:42]
Jorge del Conde
I reproduced this using 5.0.23bk
[1 Jun 2006 17:43]
Jorge del Conde
my results have a ~1.5 sec difference, but the difference is still considerable between both queries.
[31 Aug 2006 16:44]
Igor Babaev
This performance problem will be fixed in the frame of WL #2980 "Subquery optimizations".
[1 Dec 2009 4:26]
Valeriy Kravchuk
With 6.0.14 I've got the same results as Jorge long time ago: mysql> select count(*) from tab1; +----------+ | count(*) | +----------+ | 1000000 | +----------+ 1 row in set (1.00 sec) mysql> select count(*) from tab2; +----------+ | count(*) | +----------+ | 1000000 | +----------+ 1 row in set (0.19 sec) mysql> mysql> explain -> select count(*) -> from tab1 aa -> , tab2 bb -> where aa.id = bb.id; +----+-------------+-------+--------+---------------+---------+---------+------------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+---------------+---------+---------+------------+---------+-------------+ | 1 | SIMPLE | aa | index | PRIMARY | PRIMARY | 4 | NULL | 1000000 | Using index | | 1 | SIMPLE | bb | eq_ref | PRIMARY | PRIMARY | 4 | test.aa.id | 1 | Using index | +----+-------------+-------+--------+---------------+---------+---------+------------+---------+-------------+ 2 rows in set (0.00 sec) mysql> explain -> select count(*) -> from tab1 aa -> where exists ( -> select '' -> from tab2 bb -> where aa.id = bb.id -> ); +----+--------------------+-------+--------+---------------+---------+---------+------------+---------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-------+--------+---------------+---------+---------+------------+---------+--------------------------+ | 1 | PRIMARY | aa | index | NULL | PRIMARY | 4 | NULL | 1000000 | Using where; Using index | | 2 | DEPENDENT SUBQUERY | bb | eq_ref | PRIMARY | PRIMARY | 4 | test.aa.id | 1 | Using index | +----+--------------------+-------+--------+---------------+---------+---------+------------+---------+--------------------------+ 2 rows in set (0.00 sec) mysql> select count(*) -> from tab1 aa -> , tab2 bb -> where aa.id = bb.id; +----------+ | count(*) | +----------+ | 1000000 | +----------+ 1 row in set (22.04 sec) mysql> select count(*) -> from tab1 aa -> where exists ( -> select '' -> from tab2 bb -> where aa.id = bb.id -> ); +----------+ | count(*) | +----------+ | 1000000 | +----------+ 1 row in set (27.63 sec) mysql> select version(); +--------------------+ | version() | +--------------------+ | 6.0.14-alpha-debug | +--------------------+ 1 row in set (0.00 sec) So, join is faster(!). What exactly is a problem here, other than a request for further, more exotic optimization maybe?
[1 Dec 2009 13:49]
Andre Timmer
Thank you for testing. Let's say it's indeed a wish for an exotic tuning optimization. Maybe good idea to close this call.
[1 Dec 2009 14:13]
Valeriy Kravchuk
As join is faster than subquery with exists now, I'd say that optimizer should be made clever enough, eventually, to rewrite the query using join (or, at least, consider this way of optimization among others).
[8 Dec 2009 16:05]
Omer Barnir
Duplicate of bug#24972 (based on Valeriy's comment above