Bug #73837 incorrect index used in 5.6, 5.7
Submitted: 8 Sep 2014 13:13 Modified: 23 Feb 2015 15:27
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.6.15 OS:Any
Assigned to: CPU Architecture:Any

[8 Sep 2014 13:13] Shane Bester
Description:
Given this output,  it is better to use index on column "a" instead of "b":

mysql> select count(*) from t1 where a LIKE '0.5555%' ;
+----------+
| count(*) |
+----------+
|      201 |
+----------+
1 row in set (0.37 sec)

mysql> select count(*) from t1 where b >= now()-interval 100 day;
+----------+
| count(*) |
+----------+
|   418121 |
+----------+
1 row in set (0.17 sec)

mysql> select count(*) from t1 where a LIKE '0.5555%'  AND b >= now()-interval 100 day;
+----------+
| count(*) |
+----------+
|       46 |
+----------+
1 row in set (0.00 sec)

However, we see index on "b" is used in reality:

mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t1 where a LIKE '0.5555%'  AND b >= now()-interval 100 day ORDER BY b DESC LIMIT 100;
<cut>
46 rows in set (1.20 sec)

mysql> show status like 'handler%';
<cut>
| Handler_read_prev          | 418121 |
<cut>
18 rows in set (0.05 sec)

mysql>
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t1 force index(b) where a LIKE '0.5555%'  AND b >= now()-interval 100 day ORDER BY b DESC LIMIT 100;
<cut>
46 rows in set (0.65 sec)

mysql> show status like 'handler%';
<cut>
| Handler_read_prev          | 418121 |
<cut>
18 rows in set (0.00 sec)

mysql>
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t1 force index(a) where a LIKE '0.5555%'  AND b >= now()-interval 100 day ORDER BY b DESC LIMIT 100;
<cut>
46 rows in set (0.00 sec)

mysql> show status like 'handler%';
<cut>
| Handler_read_next          | 201   |
<cut>
18 rows in set (0.00 sec)

How to repeat:
drop table if exists t1;
create table t1(id bigint unsigned not null auto_increment primary key,a varchar(25),b timestamp,key(a),key(b)) engine=innodb;

insert into t1(a,b) values (rand(),now()),(rand(),now()),(rand(),now()),(rand(),now()),(rand(),now()),(rand(),now()),(rand(),now()),(rand(),now());
insert into t1(a,b) select rand(),now() - interval (rand()*500) day from t1,t1 t2,t1 t3,t1 t4,t1 t5,t1 t6,t1 t7;

set global innodb_stats_persistent_sample_pages=10000;
set global innodb_stats_transient_sample_pages=10000;
analyze table t1;

select count(*) from t1 where a LIKE '0.5555%' ;
select count(*) from t1 where b >= now()-interval 100 day;
select count(*) from t1 where a LIKE '0.5555%'  AND b >= now()-interval 100 day;
explain select * from t1 where a LIKE '0.5555%'  AND b >= now()-interval 100 day ORDER BY b DESC LIMIT 100;

flush status;
select * from t1 where a LIKE '0.5555%'  AND b >= now()-interval 100 day ORDER BY b DESC LIMIT 100;
show status like 'handler%';

flush status;
select * from t1 force index(b) where a LIKE '0.5555%'  AND b >= now()-interval 100 day ORDER BY b DESC LIMIT 100;
show status like 'handler%';

flush status;
select * from t1 force index(a) where a LIKE '0.5555%'  AND b >= now()-interval 100 day ORDER BY b DESC LIMIT 100;
show status like 'handler%';
[23 Feb 2015 15:27] Paul DuBois
Noted in 5.7.6 changelog.

For queries that combine ORDER BY with LIMIT, the optimizer may
switch to an index that applies to the ORDER BY. In some cases, the 
decision to switch was based on a heuristic rather than on cost. The 
optimizer now uniformly makes the decision whether to switch on a
cost basis. This should result in better performanance when switching
would cause a query to read an entire index or a large part of it to
find qualifying rows.