| Bug #50843 | Optimizer ignores clustered index for GROUP BY even when it improves performance | ||
|---|---|---|---|
| Submitted: | 2 Feb 2010 18:12 | Modified: | 20 Jun 2010 22:33 |
| Reporter: | Chris Calender | Email Updates: | |
| Status: | Closed | ||
| Category: | Server: Optimizer | Severity: | S2 (Serious) |
| Version: | 5.1.42 | OS: | Any |
| Assigned to: | Evgeny Potemkin | Target Version: | 5.1+ |
| Tags: | Optimizer, regression, clustered index, GROUP BY | ||
| Triage: | Triaged: D2 (Serious) / R2 (Low) / E1 (None/Negligible) | ||
[2 Feb 2010 18:27]
Chris Calender
This also fails in 5.1.35.
[2 Feb 2010 19:15]
Sinisa Milivojevic
This bug is not a regression of the fix for the bug #45828. The reasons are explained in the comments of the bug #46011.
[2 Feb 2010 19:51]
Chris Calender
This also fails in 5.1.36.
[3 Feb 2010 19:47]
Sinisa Milivojevic
A diagnosis of the bug is posted in the bug #46011.
[12 Feb 2010 14:26]
Sinisa Milivojevic
Bug #46011 contains info on my final testing of the patch.
[23 Feb 2010 19:04]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/101247 3354 Evgeny Potemkin 2010-02-23 Bug#50843: Filesort used instead of clustered index led to performance degradation. Filesort + join cache combination is preferred to full index scan because it is usually faster. But it's not the case when the index is clustered one. Now test_if_skip_sort_order function prefers filesort only if index isn't clustered. @ mysql-test/r/innodb_mysql.result Added a test case for the bug#50843. @ mysql-test/t/innodb_mysql.test Added a test case for the bug#50843. @ sql/sql_select.cc Bug#50843: Filesort used instead of clustered index led to performance degradation. Now test_if_skip_sort_order function prefers filesort only if index isn't clustered.
[25 Feb 2010 7:29]
Tor Didriksen
I would prefer to apply DeMorgan to the last part of the predicate, I find this slightly more readable: + if ((select_limit >= table_records) && + (tab->type == JT_ALL && + tab->join->tables > tab->join->const_tables + 1) && + ((unsigned) best_key != table->s->primary_key || + !table->file->primary_key_is_clustered()))
[25 Feb 2010 15:38]
Sinisa Milivojevic
I fully concur with Tor ... improves readability ...
[26 Feb 2010 11:17]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/101581 3354 Evgeny Potemkin 2010-02-26 Bug#50843: Filesort used instead of clustered index led to performance degradation. Filesort + join cache combination is preferred to full index scan because it is usually faster. But it's not the case when the index is clustered one. Now test_if_skip_sort_order function prefers filesort only if index isn't clustered. @ mysql-test/r/innodb_mysql.result Added a test case for the bug#50843. @ mysql-test/t/innodb_mysql.test Added a test case for the bug#50843. @ sql/sql_select.cc Bug#50843: Filesort used instead of clustered index led to performance degradation. Now test_if_skip_sort_order function prefers filesort only if index isn't clustered.
[26 Feb 2010 11:20]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/101585 3358 Evgeny Potemkin 2010-02-26 [merge] Auto-merged fox for the bug#50843.
[1 Mar 2010 8:47]
Bugs System
Pushed into 5.1.45 (revid:joro@sun.com-20100301083827-xnimmrjg6bh33o1o) (version source revid:sergey.glukhov@sun.com-20100226121905-cadcd8ayvcka9c6y) (merge vers: 5.1.45) (pib:16)
[2 Mar 2010 14:34]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100302142746-u1gxdf5yk2bjrq3e) (version source revid:alik@sun.com-20100301095421-4cz64ibem1h2quve) (merge vers: 6.0.14-alpha) (pib:16)
[2 Mar 2010 14:39]
Bugs System
Pushed into 5.5.3-m2 (revid:alik@sun.com-20100302072233-t3uqgjzdukt1pyhe) (version source revid:alexey.kopytov@sun.com-20100226130631-8czhisohzf6jyo2x) (merge vers: 5.5.3-m2) (pib:16)
[2 Mar 2010 14:44]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100302072432-k8xvfkgcggkwgi94) (version source revid:alik@sun.com-20100301093944-a4rvrmqqco6c0qao) (pib:16)
[8 Mar 2010 1:24]
Paul DuBois
Noted in 5.1.45, 5.5.3, 6.0.14 changelogs. Use of filesort plus the join cache normally is preferred to a full index scan. But it was used even if the index is clustered, in which case, the clustered index scan can be faster.
[26 Mar 2010 8:51]
Gleb Shchepa
Also see bug #52361.
[17 Jun 2010 12:08]
Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:55]
Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:martin.skold@mysql.com-20100609140708-52rvuyq4q500sxkq) (merge vers: 5.1.45-ndb-6.2.19) (pib:16)
[17 Jun 2010 13:36]
Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Description: The optimizer in MySQL 5.1 ignores the clustered index (of a PRIMARY KEY) for GROUP BY even when it would improve performance of the query. For an example, execute the following query in 5.1 and 5.0 to compare (note I am using the `world` database, with the tables converted to InnoDB): EXPLAIN SELECT country.* FROM country LEFT JOIN countrylanguage ON country.code=countrylanguage.countrycode GROUP BY country.code; In 5.0, you will see that the PRIMARY KEY for table `country` is used to resolve this. However, it is not used in 5.1, which leads to "Using temporary; Using filesort" in order to resolve the query. How to repeat: 1. Load the `world` database. 2. ALTER `country` and `countrylanguage` to be InnoDB tables ALTER TABLE `country` ENGINE=InnoDB; ALTER TABLE `countrylanguage` ENGINE=InnoDB; 3. Run the following query: EXPLAIN SELECT country.* FROM country LEFT JOIN countrylanguage ON country.code=countrylanguage.countrycode GROUP BY country.code; Output from 5.1.42: mysql> EXPLAIN SELECT country.* FROM country LEFT JOIN countrylanguage ON country.code=countrylanguage.countrycode GROUP BY country.code; +----+-------------+-----------------+------+---------------+---------+---------+--------------------+------+---------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-----------------+------+---------------+---------+---------+--------------------+------+---------------------------------+ | 1 | SIMPLE | country | ALL | NULL | NULL | NULL | NULL | 218 | Using temporary; Using filesort | | 1 | SIMPLE | countrylanguage | ref | PRIMARY | PRIMARY | 3 | world.country.Code | 2 | Using index | +----+-------------+-----------------+------+---------------+---------+---------+--------------------+------+---------------------------------+ 2 rows in set (0.00 sec) Output from 5.0.89: mysql> EXPLAIN SELECT country.* FROM country LEFT JOIN countrylanguage ON country.code=countrylanguage.countrycode GROUP BY country.code; +----+-------------+-----------------+-------+---------------+---------+---------+--------------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-----------------+-------+---------------+---------+---------+--------------------+------+-------------+ | 1 | SIMPLE | country | index | NULL | PRIMARY | 3 | NULL | 237 | | | 1 | SIMPLE | countrylanguage | ref | PRIMARY | PRIMARY | 3 | world.country.Code | 2 | Using index | +----+-------------+-----------------+-------+---------------+---------+---------+--------------------+------+-------------+ 2 rows in set (0.02 sec) Suggested fix: Allow the optimizer to use the clustered index when the GROUP BY is on the Primary Key (i.e., the clustered index).