| Bug #42094 | Optimizer chooses a wrong index when group by, order by and limit is present | ||
|---|---|---|---|
| Submitted: | 13 Jan 2009 23:01 | Modified: | 11 Apr 2020 2:07 |
| Reporter: | Alexander Rubin | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
| Version: | 5.1 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | GROUP BY, INDEX, limit, Optimizer, order by, regression | ||
[13 Jan 2009 23:12]
Alexander Rubin
Uploaded ftp.mysql.com/pub/mysql/upload/t1.sql.gz
[14 Jan 2009 5:16]
Valeriy Kravchuk
Thank you for a problem report. Why do you think this is a regression? What exact previous version that worked properly you had identified?
[14 Jan 2009 14:51]
Alexander Rubin
Valeriy, here is the same query in MySQL 5.0.67:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.0.67 MySQL Community Server (GPL)
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> explain select groupid, count(wordid) as cnt from t1 where wordid=4703 group by
-> groupid order by cnt desc limit 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t1
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 301926
Extra: Using where; Using index; Using temporary; Using filesort
1 row in set (0.00 sec)
So it chooses the right index in MySQL 5.0 and wrong in MySQL 5.1 for the same query.
[3 Jun 2009 22:40]
Josh Orr
I just wanted to mention that I ma having this same exact problem. In my case, I order by the Primary key and in mysql 5.0 it uses the right index to restrict the result set before ordering the result. But in 5.1, it now does a full table scan (due to it using the primary key as the index to use) and filtering the result set. I think it does this so it can read the rows in the order that it needs to sort it by, but the result is it going though 7 million rows instead of 45 rows. As you can imagine, it is order's of magnitude slower.
In the examples below, packageID is the primary key, and batchID has an index on it. On mysql 5.0, the execution plain are the same for both (they both use the batcHID index).
Some basic examples executed with mysql 5.1:
This one takes 13.54 seconds to execute:
EXPLAIN SELECT packageID, Zip FROM packages WHERE batchID = 0 ORDER BY packageID;
id: 1
select_type: SIMPLE
table: packages
type: index
possible_keys: batchID
key: PRIMARY
key_len: 4
ref: NULL
rows: 7763865
Extra: Using where
This one takes .04 seconds to execute:
EXPLAIN SELECT packageID, Zip FROM packages WHERE batchID = 0 ORDER BY packageID;
id: 1
select_type: SIMPLE
table: packages
type: ref
possible_keys: batchID
key: batchID
key_len: 8
ref: const
rows: 14046
Extra:
[4 Jun 2009 23:25]
Josh Orr
Sorry, I pasted in the wrong query. The second query should read like this: EXPLAIN SELECT packageID, Zip FROM packages WHERE batchID = 0; It's the same query, except without the order by.
[12 Jul 2010 11:01]
Tomer Ratz
I ran across the same issue, on support Issue #49091. The wrong index is chosen, and a full table scan is being performed.
[9 Sep 2010 12:38]
Andrii Nikitin
bug #50394 may explain reasons for this bug, or at least related
[8 Nov 2011 13:12]
Jochen Riehm
We are frequently experiencing this problem as well. Is there any more global workaround besides adding USE INDEX to every problematic query?

Description: the following query: select groupid, count(wordid) as cnt from t1 where wordid=4703 group by groupid order by cnt desc limit 10 uses an index on groupid and performs a full table scan + filesort, instead of using index on wordid (primary) (here wordid=4703 is a very frequent ID, which returns 40% of rows) mysql> show create table t1\G *************************** 1. row *************************** Table: t1 Create Table: CREATE TABLE `t1` ( `type` smallint(5) unsigned NOT NULL, `groupid` int(10) unsigned NOT NULL, `wordid` int(10) unsigned NOT NULL, `source` smallint(5) unsigned NOT NULL, PRIMARY KEY (`wordid`,`source`,`groupid`,`type`), KEY `groupid` (`groupid`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql> explain select groupid, count(wordid) as cnt from t1 where wordid=4703 group by groupid order by cnt desc limit 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: t1 type: index possible_keys: PRIMARY key: groupid key_len: 4 ref: NULL rows: 50 Extra: Using where; Using index; Using temporary; Using filesort 1 row in set (0.00 sec) Here MySQL chooses an index on groupid (incorrect). However, we have "order by cnt" clause in the query - in this situation mysql will have to perform a full table scan and filesort (see Extra field). mysql> explain select groupid, count(wordid) as cnt from t1 where wordid=4703 group by groupid order by cnt desc\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: t1 type: ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: const rows: 558642 Extra: Using where; Using index; Using temporary; Using filesort 1 row in set (0.00 sec) If we do not have a LIMIT mysql chooses index on PRIMARY (correct) mysql> explain select groupid, count(wordid) as cnt from t1 where wordid=4703 group by groupid limit 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: t1 type: index possible_keys: PRIMARY key: groupid key_len: 4 ref: NULL rows: 50 Extra: Using where; Using index 1 row in set (0.00 sec) if we have no "order by", mysql will use groupid with no filesort and temporary table (correct). Speed comparison: mysql> select groupid, count(wordid) as cnt from t1 where wordid=4703 group by groupid order by cnt desc limit 10; +---------+-----+ | groupid | cnt | +---------+-----+ | 350 | 62 | | 530 | 62 | | 522 | 60 | | 361 | 59 | | 791 | 57 | | 676 | 57 | | 242 | 56 | | 508 | 55 | | 446 | 55 | | 289 | 55 | +---------+-----+ 10 rows in set (0.40 sec) mysql> select groupid, count(wordid) as cnt from t1 use index (PRIMARY) where wordid=4703 group by groupid order by cnt desc limit 10; +---------+-----+ | groupid | cnt | +---------+-----+ | 530 | 62 | | 350 | 62 | | 522 | 60 | | 361 | 59 | | 676 | 57 | | 791 | 57 | | 242 | 56 | | 446 | 55 | | 289 | 55 | | 677 | 55 | +---------+-----+ 10 rows in set (0.28 sec) How to repeat: Table data attached Please see explains and sql queries above. Suggested fix: Fix the optimizer, so it will choose the correct index when doing "order by <func>"