Description:
When there are two indexes i1(a, b, c) and i2(a, c), and fetching rows by "WHERE a=? ORDER BY c LIMIT N", it is expected that i2 is used because it can skip filesort. But there are cases that i1 is used even though using i1 is much more costly than using i2.
How to repeat:
create table a (id int auto_increment primary key,
k1 int, k2 int, k3 int, value int,
index i1 (k2, k1, k3), index i2 (k2, k3)) engine=innodb;
Then insert lots of rows with low enough cardinality on k2 (I tested inserting 2.6 mil rows, cardinality on k2 was 1000):
insert into a (k1, k2, k3, value)values (1,1,1,1), (2,2,2,2),(3,3,3,3),(4,4,4,4),(5,5,5,5),(6,6,6,6),(7,7,7,7),(8,8,8,8),(9,9,9,9),(10,10,10,10);
insert into a (k1, k2, k3, value) select rand()*1000000, rand()*1000, rand()*1000000, value from a;
insert into a (k1, k2, k3, value) select rand()*1000000, rand()*1000, rand()*1000000, value from a;
insert into a (k1, k2, k3, value) select rand()*1000000, rand()*1000, rand()*1000000, value from a;
....
insert into a (k1, k2, k3, value) select rand()*1000000, rand()*1000, rand()*1000000, value from a;
Query OK, 1310720 rows affected (51.06 sec)
Records: 1310720 Duplicates: 0 Warnings: 0
mysql> select count(*) from a;
+----------+
| count(*) |
+----------+
| 2621440 |
+----------+
1 row in set (0.67 sec)
mysql> explain select * from a where k2=5 order by k3 limit 2;
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | a | range | i1,i2 | i2 | 5 | NULL | 2647 | Using where |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
mysql> explain select * from a where k2=5 order by k3 limit 20;
+----+-------------+-------+------+---------------+------+---------+-------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+-------+------+-----------------------------+
| 1 | SIMPLE | a | ref | i1,i2 | i1 | 5 | const | 2647 | Using where; Using filesort |
+----+-------------+-------+------+---------------+------+---------+-------+------+-----------------------------+
1 row in set (0.00 sec)
mysql> select * from a where k2=4 order by k3 limit 20;
+---------+--------+------+------+-------+
| id | k1 | k2 | k3 | value |
+---------+--------+------+------+-------+
| 4 | 4 | 4 | 4 | 4 |
| 635834 | 673170 | 4 | 472 | 6 |
| 541921 | 7037 | 4 | 748 | 3 |
| 1041991 | 7076 | 4 | 956 | 3 |
| 1193818 | 5918 | 4 | 2591 | 10 |
| 742992 | 338342 | 4 | 2826 | 4 |
| 930253 | 338876 | 4 | 2854 | 5 |
| 2237774 | 5917 | 4 | 2891 | 1 |
| 2382521 | 672861 | 4 | 3035 | 8 |
| 2446520 | 338570 | 4 | 3503 | 7 |
| 1035281 | 671827 | 4 | 3715 | 3 |
| 2822829 | 338818 | 4 | 4423 | 6 |
| 1879635 | 337517 | 4 | 5373 | 2 |
| 1450486 | 337258 | 4 | 6417 | 8 |
| 742782 | 671587 | 4 | 6640 | 4 |
| 1195895 | 337235 | 4 | 6836 | 7 |
| 2502170 | 670918 | 4 | 7116 | 7 |
| 1452121 | 337298 | 4 | 7159 | 3 |
| 1206651 | 3813 | 4 | 7454 | 3 |
| 1391196 | 337931 | 4 | 7516 | 8 |
+---------+--------+------+------+-------+
20 rows in set (0.02 sec)
mysql> select * from a force index(i2) where k2=10 order by k3 limit 20;
+---------+--------+------+------+-------+
| id | k1 | k2 | k3 | value |
+---------+--------+------+------+-------+
| 10 | 10 | 10 | 10 | 10 |
| 1082738 | 349718 | 10 | 1020 | 10 |
| 2228755 | 16736 | 10 | 1631 | 2 |
| 2703161 | 349598 | 10 | 1735 | 8 |
| 2734181 | 682017 | 10 | 1819 | 8 |
| 1141563 | 349173 | 10 | 2634 | 5 |
| 191457 | 14954 | 10 | 3034 | 4 |
| 1148421 | 14874 | 10 | 3586 | 3 |
| 1198166 | 348124 | 10 | 3665 | 8 |
| 1765285 | 349295 | 10 | 4151 | 2 |
| 606304 | 348354 | 10 | 4565 | 6 |
| 1305867 | 348316 | 10 | 5005 | 9 |
| 1156394 | 15238 | 10 | 5433 | 6 |
| 2836014 | 348320 | 10 | 5458 | 1 |
| 1916490 | 681752 | 10 | 5701 | 7 |
| 626443 | 348718 | 10 | 6102 | 5 |
| 567895 | 681629 | 10 | 6158 | 7 |
| 1782889 | 348007 | 10 | 6774 | 6 |
| 2748233 | 681046 | 10 | 6987 | 10 |
| 348086 | 14948 | 10 | 7575 | 3 |
+---------+--------+------+------+-------+
20 rows in set (0.00 sec)
Suggested fix:
When I checked with a debugger, the following lines where executed to check whether i2 should be used or not.
sql/sql_select.cc test_if_cheaper_ordering(): (test_if_skip_sort_order() in 5.1)
if (keys.is_set(nr) &&
(direction= test_if_order_by_key(order, table, nr, &used_key_parts)))
...
/*
We assume that each of the tested indexes is not correlated
with ref_key. Thus, to select first N records we have to scan
N/selectivity(ref_key) index entries.
selectivity(ref_key) = #scanned_records/#table_records =
table->quick_condition_rows/table_records.
In any case we can't select more than #table_records.
N/(table->quick_condition_rows/table_records) > table_records
<=> N > table->quick_condition_rows.
*/
if (select_limit > table->quick_condition_rows)
select_limit= table_records;
else
select_limit= (ha_rows) (select_limit *
(double) table_records /
table->quick_condition_rows);
....
index_scan_time= select_limit/rec_per_key *
min(rec_per_key, table->file->scan_time());
if ((ref_key < 0 && is_covering) ||
(ref_key < 0 && (group || table->force_index)) ||
index_scan_time < read_time)
----
table_records: close to the number of rows (2.6M)
table->quick_condition_rows: estimated matched rows (2.6M / 1000 = 2.6K)
select_limit: originally the same as LIMIT N (20), finally select_limit * table_records / table->quick_condition_rows (20 * 2.6M / 2.6K = 20K)
index_scan_time: the same as the final select_limit value
read_time: (probably) the same as table->quick_condition_rows (2.6K)
read_time was the same as the estimated cost when using index i1. The MySQL optimizer decided that the cost of using i2 (index_scan_time: 20k) was much more expensive than the cost of using i1 (2.6k) so it chose i1.
But apparently this is wrong decision. I am not sure why select_limit(index_scan_time) was computed by multiplying table_records. Since it was index scan and sorting could be skipped, select_limit should be at least less than table->quick_condition_rows, shouldn't it?