Bug #114812 in list values exceed eq_range_index_dive_limit cause index not used
Submitted: 29 Apr 2024 9:55 Modified: 29 Apr 2024 10:18
Reporter: jia liu Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:8.0.36,5.7.44 OS:Any
Assigned to: CPU Architecture:Any
Tags: eq_range_index_dive_limit

[29 Apr 2024 9:55] jia liu
Description:
when more values in an in list than eq_range_index_dive_limit, the query execution plan seems ignore index unconditionally.
which will cause performance problem.

How to repeat:
CREATE TABLE `large` (
  `id` bigint unsigned NOT NULL,
  `data` varchar(100) COLLATE utf8mb4_general_ci DEFAULT NULL,
  KEY `id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;

insert into large values (1,rand()),(2,rand()),(3,rand()),(4,rand()),(5,rand()),(6,rand()),(7,rand()),(8,rand()),(9,rand()),(10,rand());

insert into large select 100,rand() from large;   -- repeat 10 times to create some data in table.

analyze table large;

set session eq_range_index_dive_limit=200;

explain select sum(data) from large where id in (1,2,3,4,5,6,7,8,9,10) and id between 1 and 10;

the result is :
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | large | NULL       | range | id            | id   | 8       | NULL |   10 |   100.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
which is fine.

set session eq_range_index_dive_limit=5;   -- make it less than the numbers in the in list

explain select sum(data) from large where id in (1,2,3,4,5,6,7,8,9,10) and id between 1 and 10;

the result is :
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | large | NULL       | ALL  | id            | NULL | NULL    | NULL | 9853 |    83.32 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
which shows a full table scan.

test it:
set session long_query_time=0;
select sum(data) from large where id in (1,2,3,4,5,6,7,8,9,10) and id between 1 and 10;

slowlog confirms that sql scans a lot rows:
# Time: 2024-04-29T17:43:21.446726+08:00
# User@Host: root[root] @ localhost []  Id:    11
# Query_time: 0.004150  Lock_time: 0.000002 Rows_sent: 1  Rows_examined: 10242
SET timestamp=1714383801;
select sum(data) from large where id in (1,2,3,4,5,6,7,8,9,10) and id between 1 and 10;

without in list, the sql will use the "between and" as an index condition to filter data.
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | large | NULL       | range | id            | id   | 8       | NULL |   10 |   100.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+

# Time: 2024-04-29T17:47:10.633367+08:00
# User@Host: root[root] @ localhost []  Id:    11
# Query_time: 0.000265  Lock_time: 0.000003 Rows_sent: 1  Rows_examined: 10
SET timestamp=1714384030;
select sum(data) from large where id between 1 and 10;

Suggested fix:
when eq_range_index_dive_limit exceeded, there is an chance that other conditions can take effect, index should not ignored unconditionally.
[29 Apr 2024 10:14] jia liu
also affects 5.7.44
[29 Apr 2024 10:18] MySQL Verification Team
Hello!

Thank you for the report.

regards,
Umesh