Bug #104282 order by index is not used with low selectivity fields and similar idx
Submitted: 12 Jul 2021 8:14 Modified: 13 Jul 2021 8:33
Reporter: Nikolai Ikhalainen Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.25 OS:Any
Assigned to: CPU Architecture:Any

[12 Jul 2021 8:14] Nikolai Ikhalainen
Description:

Query with order by 
select * from tst WHERE c=1 and d1=1 and d2=2 and d3=3 order by c1 desc;

is not always using index for sorting, if there is a similar index with same fields + one filed preventing sort and d1,d2,d3 fields are not selective.

How to repeat:
drop table if exists tst;
set session information_schema_stats_expiry = 0;
create table tst(id int auto_increment primary key, c int not null default 0, c1 int not null default 0, c2 int not null default 0,d1 int not null default '0', d2 int not null default '0', d3 int not null default '0', d4 int not null default '0');
insert into tst(id,c,c1,c2,d1,d2,d3,d4) select NULL, 0,0,0,1,2,3,4 from mysql.time_zone_transition LIMIT 100000;
update tst set c=id%10, c1=id, c2=id, d1=1,d2=2,d3=3,d4=4;
alter table tst add key c_c1_index(c,d1,d2,d3,c1), add key c_c2_index(c,d1,d2,d3,c2,c1);
analyze table tst;
explain select * from tst WHERE c=1 and d1=1 and d2=2 and d3=3 order by c1 desc;
update mysql.innodb_index_stats SET stat_value=30 WHERE table_name='tst' and index_name='c_c2_index' and stat_name IN ('n_diff_pfx01','n_diff_pfx02','n_diff_pfx03','n_diff_pfx04');
flush table tst;
explain select * from tst WHERE c=1 and d1=1 and d2=2 and d3=3 order by c1 desc;

First explain shows:
+----+-------------+-------+------------+------+-----------------------+------------+---------+-------------------------+------+----------+---------------------+
| id | select_type | table | partitions | type | possible_keys         | key        | key_len | ref                     | rows | filtered | Extra               |
+----+-------------+-------+------------+------+-----------------------+------------+---------+-------------------------+------+----------+---------------------+
|  1 | SIMPLE      | tst   | NULL       | ref  | c_c1_index,c_c2_index | c_c1_index | 16      | const,const,const,const | 9081 |   100.00 | Backward index scan |
+----+-------------+-------+------------+------+-----------------------+------------+---------+-------------------------+------+----------+---------------------+
Second explain:
+----+-------------+-------+------------+------+-----------------------+------------+---------+-------------------------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys         | key        | key_len | ref                     | rows | filtered | Extra          |
+----+-------------+-------+------------+------+-----------------------+------------+---------+-------------------------+------+----------+----------------+
|  1 | SIMPLE      | tst   | NULL       | ref  | c_c1_index,c_c2_index | c_c2_index | 16      | const,const,const,const | 3330 |   100.00 | Using filesort |
+----+-------------+-------+------------+------+-----------------------+------------+---------+-------------------------+------+----------+----------------+

Suggested fix:
Check if the index for order by exists even if there are low selectivity columns.

The index scan is always selected for high selectivity columns. Without d1,d2,d3 columns, c1 index is prefered regardless to n_diff_pfx01 difference for c_c1_index,c_c2_index
[13 Jul 2021 8:33] MySQL Verification Team
Hello Nikolai,

Thank you for the report and feedback.

Thanks,
Umesh