Bug #78612 Optimizer chooses wrong index for ORDER BY
Submitted: 28 Sep 2015 22:46 Modified: 30 Sep 2015 17:46
Reporter: Sveta Smirnova (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.5.45, 5.6.26, 5.7.8,5.5.47, 5.6.28,5.7.10 OS:Any
Assigned to: CPU Architecture:Any

[28 Sep 2015 22:46] Sveta Smirnova
Description:
When querying a table on two fields, one of which has much more rows which satisfy condition than another which also has partial index on it, index used for the former (with more rows).

E.g. for general case WHERE f1=X AND f2=Y ORDER BY f2 DESC LIMIT 1;
f1 has 64 rows, equal to X and f2 has 320 rows, equal to Y, optimizer chooses index on f2 and not partial index on f1 which gives better results.

This can lead to high query execution times, especially in cases when table has thousands of rows which satisfy condition f2=Y and no rows, which satisfy condition f1=X

How to repeat:
File with test case will be attached shortly.
[28 Sep 2015 22:46] Sveta Smirnova
test case for MTR

Attachment: bug1500639.test (application/octet-stream, text), 10.92 KiB.

[29 Sep 2015 9:07] Umesh Shastry
Hello Sveta,

Thank you for the report and test case.
Verified as described with 5.5.47, 5.6.28 and 5.7.10 builds.

Thanks,
Umesh
[30 Sep 2015 12:12] Øystein Grøvlen
Hi Sveta,

I am not sure I understand why you think this is related to "ORDER BY DESC".  I get the same plans if I remove "DESC" from the queries. 

AFAICT, the issue is related to the use of a prefix index.  If I index the entire varchar column or comnpare with a literal that is shorter than the prefix, the index will be used.  

As this example shows, this is a more general issue:

create table t2(i int, c varchar(10), key cp (c(10));

<insert some data>

mysql> explain select * from t2 where c='abcefgh';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t2    | NULL       | ALL  | cp            | NULL | NULL    | NULL |   33 |   100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0,00 sec)

mysql> explain select * from t2 where c='ab';
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | t2    | NULL       | ref  | cp            | cp   | 6       | const |    1 |   100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0,00 sec)

I have not checked whether this limitation is documented.
[30 Sep 2015 12:15] Øystein Grøvlen
Sorry, the given table definition above was wrong.  What I used was:

create table t2(i int, c varchar(10), key cp (c(3))); 

The point is that the string literal of the first query is longer than the prefix, and because of this the index will not be used.
[30 Sep 2015 12:29] Sveta Smirnova
Øystein,

because in this case value of Handler_read_prev is increasing and our manual says about this option: 'The number of requests to read the previous row in key order. This read method is mainly used to optimize ORDER BY ... DESC.' (http://dev.mysql.com/doc/refman/5.6/en/server-status-variables.html#statvar_Handler_read_p...)

Also, issue here is the fact what my table contains 64 rows where f1 like 'testcancel%' and 320 rows which satisfy another condition. In real customer case it was 7777 vs 169999 So regardless if index can be used to resolve condition 'testcancelMANY OTHER LETTERS' it would be still more effective to retrieve all rows which satisfy condition 'testcancel%', then perform filesort on them.
[30 Sep 2015 12:41] Øystein Grøvlen
If you replace DESC with ASC, Handler_read_next will increase instead of Handler_read_prev.  I do not see how that makes it a different problem.

I see that it would be useful to be able to use a prefix index also for longer literals.  However, this does not seem to be supported by MySQL in even the simplest case.  Hence, this report should probably be reclassified from bug report to feature request.
[30 Sep 2015 13:15] Sveta Smirnova
Make sense regarding to DESC.

Regarding if this should be feature request or not I will leave it for Oracle engineers do decide.
[30 Sep 2015 17:46] Sveta Smirnova
One more comment: EXPLAIN for select * from t1 where f1='testcancel+@foobar.com' and (f2!=20 and f2!=30 and f2<10000) order by f2 desc limit 1; says only 5 rows would be examined, while statistics show what all rows, which satisfy condition (f2!=20 and f2!=30 and f2<10000)  were examined.

explain select * from t1 where f1='testcancel+@foobar.com' and (f2!=20 and f2!=30 and f2<10000) order by f2 desc limit 1;
id      select_type     table   partitions      type    possible_keys   key     key_len ref     rows    filtered        Extra
1       SIMPLE  t1      NULL    index   email,f2        f2      5       NULL    5       20.00   Using where
[21 Oct 2018 9:21] Jean-François Gagné
This also affects me.  Below is the bug I reported for Percona Server 5.7.22 and 5.7.23, MySQL also affected according to Percona:
https://jira.percona.com/browse/PS-4935

However, in my case, there is a single bigint in the WHERE clause and the ORDER BY is on the Primary Key.