Description:
When optimizing an ORDER BY on a clustered key combined with a LIMIT, while a range constraint on another key is present, test_if_skip_sort_order() will prefer an index scan (JT_NEXT) if its heuristic estimate of the number of rows that get filtered out during the scan is high enough. What counts as "high enough" is calculated on the assumption that the filtering rate throughout the entire table is no worse than the number of records matched by the range divided by the total number of records. This assumption does not always hold, and when it breaks down the results can be disastrous for a query that would run fast if the range scan were used instead.
How to repeat:
Test case (in mysqltest format):
--source include/have_innodb.inc
use test;
create table t1(id int not null primary key,
other_id int not null,
s tinyint not null,
n int,
m int,
key(other_id,s,n)
) engine=innodb;
show create table t1;
let $i=10000;
let $id=1;
--disable_query_log
begin;
while ($i)
{
eval insert into t1 values($id,($id * $id * $id)/1000000000,$id>7000,$id+5,$id+10);
dec $i;
inc $id;
}
eval insert into t1 values($id*35,$id-10,1,3,5);
eval insert into t1 values($id*40,$id-100,1,3,5);
eval insert into t1 values($id*50,$id-3,1,3,5);
commit;
--enable_query_log
--vertical_results
let cutoff=955;
let limit=3;
eval select count(*) from t1 where other_id > $cutoff;
eval explain extended select * from t1 where other_id > $cutoff order by id desc limit $limit;
eval explain extended select * from t1 force key(other_id) where other_id > $cutoff order by id desc limit $limit;
Results:
use test;
create table t1(id int not null primary key,
other_id int not null,
s tinyint not null,
n int,
m int,
key(other_id,s,n)
) engine=innodb;
show create table t1;
Table Create Table
t1 CREATE TABLE `t1` (
`id` int(11) NOT NULL,
`other_id` int(11) NOT NULL,
`s` tinyint(4) NOT NULL,
`n` int(11) DEFAULT NULL,
`m` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `other_id` (`other_id`,`s`,`n`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
select count(*) from t1 where other_id > 955;
count(*) 154
explain extended select * from t1 where other_id > 955 order by id desc limit 3;
id 1
select_type SIMPLE
table t1
type index
possible_keys other_id
key PRIMARY
key_len 4
ref NULL
rows 181
filtered 85.08
Extra Using where
Warnings:
Level Note
Code 1003
Message select `test`.`t1`.`id` AS `id`,`test`.`t1`.`other_id` AS `other_id`,`test`.`t1`.`s` AS `s`,`test`.`t1`.`n` AS `n`,`test`.`t1`.`m` AS `m` from `test`.`t1` where (`test`.`t1`.`other_id` > 955) order by `test`.`t1`.`id` desc limit 3
explain extended select * from t1 force key(other_id) where other_id > 955 order by id desc limit 3;
id 1
select_type SIMPLE
table t1
type range
possible_keys other_id
key other_id
key_len 4
ref NULL
rows 154
filtered 100.00
Extra Using where; Using filesort
Warnings:
Level Note
Code 1003
Message select `test`.`t1`.`id` AS `id`,`test`.`t1`.`other_id` AS `other_id`,`test`.`t1`.`s` AS `s`,`test`.`t1`.`n` AS `n`,`test`.`t1`.`m` AS `m` from `test`.`t1` FORCE INDEX (`other_id`) where (`test`.`t1`.`other_id` > 955) order by `test`.`t1`.`id` desc limit 3
drop table t1;
The optimizer thinks it only needs to examine 181 rows before the LIMIT stops the scan, but good luck with that! It will chug through almost every record in the table, because other_id is perfectly correlated with id, contrary to the assumption the optimizer makes per the following comment in sql_select.cc and the code that follows it:
  /*
    We assume that each of the tested indexes is not correlated
    with ref_key. Thus, to select first N records we have to scan
    N/selectivity(ref_key) index entries.
    selectivity(ref_key) = #scanned_records/#table_records =
    table->quick_condition_rows/table_records.
    In any case we can't select more than #table_records.
    N/(table->quick_condition_rows/table_records) > table_records
    <=> N > table->quick_condition_rows.
  */
  if (select_limit > table->quick_condition_rows)
    select_limit= table_records;
  else
    select_limit= (ha_rows) (select_limit *
                             (double) table_records /
                             table->quick_condition_rows);
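To see the arithmetic concretely, here is a minimal Python sketch (mine, not MySQL source) mirroring the snippet above, fed the row counts from the test case: LIMIT 3, 154 rows matching the range, roughly 10003 rows in the table. (The exact EXPLAIN figure of 181 differs slightly because InnoDB's table_records statistic is itself an estimate.)

```python
def adjusted_select_limit(select_limit, quick_condition_rows, table_records):
    """Mirrors the sql_select.cc logic: the estimated number of index
    entries to scan before LIMIT rows pass the range filter, assuming
    the range condition is uncorrelated with the ordering key."""
    if select_limit > quick_condition_rows:
        return table_records
    return int(select_limit * float(table_records) / quick_condition_rows)

# Approximate numbers from the test case above.
est = adjusted_select_limit(3, 154, 10003)
print(est)  # ~194: the optimizer expects to stop after a couple hundred entries
```

The estimate of ~194 entries is what makes the clustered-index scan look cheap; when the range key is correlated with the ordering key, the real scan can cover nearly the whole table instead.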
Suggested fix:
Since this is an unsafe optimization, there should be a way to disable it via a dynamic server variable settable from my.cnf.
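For illustration only, such a switch might be used as below; the variable name is purely hypothetical (nothing like it exists at the time of this report):

```sql
-- Hypothetical variable, invented for illustration: when OFF, the optimizer
-- would keep the range plan (type: range, key: other_id) rather than
-- switching to the clustered-index scan to skip the sort.
SET GLOBAL optimizer_skip_sort_order_for_limit = OFF;
SELECT * FROM t1 WHERE other_id > 955 ORDER BY id DESC LIMIT 3;
```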