Bug #119770 MySQL 8.4 optimizer uses full table scan for IN (...) on prefix index, but works in 8.0
Submitted: 24 Jan 19:09 Modified: 27 Jan 10:03
Reporter: Aftab Khan Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:8.4 OS:Any
Assigned to: CPU Architecture:Any
Tags: execution-plan, in-predicate, index-usage, innodb, mysql-8.4, Optimizer, performance, prefix-index, query-planner, regression, upgrade-regression

[24 Jan 19:09] Aftab Khan
Description:
In MySQL 8.4, the optimizer does not use a prefix (partial) index when the query predicate uses IN (...), resulting in a full table scan.
The same schema, data, and query uses the prefix index correctly in MySQL 8.0.

This appears to be an optimizer regression in MySQL 8.4 affecting queries with IN (...) predicates on prefix indexes.

Observed Behaviour (MySQL 8.4.5)

Given a table with a prefix index:

CREATE TABLE t_prefix_in (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  uniq_name VARCHAR(64) NOT NULL,
  payload VARCHAR(100),
  KEY idx_uniq_name (uniq_name(10))
) ENGINE=InnoDB;

For a query using IN (...):

The optimize chooses a full table scan, even when explicitly hinting the index:

mysql> explain select uniq_name, payload from t_prefix_in where uniq_name in ('cc5b7c9ff91011f0903442010ab84135','cc5b75b4f91011f0903442010ab84135');
+----+-------------+-------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table       | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+-------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | t_prefix_in | NULL       | ALL  | idx_uniq_name | NULL | NULL    | NULL | 124396 |    20.00 | Using where |
+----+-------------+-------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

Expected Behavior

The optimizer should use the prefix index (idx_uniq_name) to evaluate the IN (...) predicate, similar to MySQL 8.0 behavior, resulting in an indexed lookup or range scan.

Comparison with MySQL 8.0.41

Using the same table definition, data volume, and query in MySQL 8.0:

mysql> explain select uniq_name, payload from t_prefix_in where uniq_name in ('cc5b7c9ff91011f0903442010ab84135','cc5b75b4f91011f0903442010ab84135');
+----+-------------+-------------+------------+-------+---------------+---------------+---------+------+------+----------+-------------+
| id | select_type | table       | partitions | type  | possible_keys | key           | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------------+------------+-------+---------------+---------------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t_prefix_in | NULL       | range | idx_uniq_name | idx_uniq_name | 32      | NULL |    2 |   100.00 | Using where |
+----+-------------+-------------+------------+-------+---------------+---------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.01 sec)

Additional Observation. A single-value predicate in MySQL 8.4 does use the index correctly:

mysql> explain select uniq_name, payload from t_prefix_in where uniq_name in ('cc5b7c9ff91011f0903442010ab84135');
+----+-------------+-------------+------------+------+---------------+---------------+---------+-------+------+----------+-------------+
| id | select_type | table       | partitions | type | possible_keys | key           | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------------+------------+------+---------------+---------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | t_prefix_in | NULL       | ref  | idx_uniq_name | idx_uniq_name | 32      | const |   10 |   100.00 | Using where |
+----+-------------+-------------+------------+------+---------------+---------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.01 sec)

How to repeat:
Server: Percona Server for MySQL 8.4.5-5

Step 1: Create test database and table

DROP DATABASE IF EXISTS test_prefix_idx;
CREATE DATABASE test_prefix_idx;
USE test_prefix_idx;

CREATE TABLE t_prefix_in (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  uniq_name VARCHAR(64) NOT NULL,
  payload VARCHAR(100),
  KEY idx_uniq_name (uniq_name(10))
) ENGINE=InnoDB;

Step 2: Populate table with non-unique values

INSERT INTO t_prefix_in (uniq_name, payload)
SELECT
  REPLACE(UUID(), '-', ''),
  'dummy'
FROM
  information_schema.tables t1,
  information_schema.tables t2
LIMIT 300000;

Step 3: Verify table statistics

ANALYZE TABLE t_prefix_in;
SHOW INDEX FROM t_prefix_in WHERE Key_name = 'idx_uniq_name';

Step 4: Run query with IN (...) predicate e.g.

SELECT uniq_name INTO @value1
FROM t_prefix_in
LIMIT 1;

SELECT uniq_name INTO @value2
FROM t_prefix_in
LIMIT 1 OFFSET 1;

EXPLAIN
SELECT uniq_name, payload
FROM t_prefix_in
WHERE uniq_name IN (@value1, @value2);

Observe result:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t_prefix_in
   partitions: NULL
         type: ALL
possible_keys: idx_uniq_name
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 119629
     filtered: 20.00
        Extra: Using where
1 row in set, 1 warning (0.01 sec)

Step 5: Confirm index hint does not help

mysql> EXPLAIN SELECT uniq_name, payload FROM t_prefix_in FORCE INDEX (idx_uniq_name)  WHERE uniq_name IN (@value1,@value2) \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t_prefix_in
   partitions: NULL
         type: ALL
possible_keys: idx_uniq_name
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 119629
     filtered: 20.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

Step 6: Control test – single value predicate

EXPLAIN SELECT uniq_name, payload FROM t_prefix_in WHERE uniq_name = @value1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t_prefix_in
   partitions: NULL
         type: ref
possible_keys: idx_uniq_name
          key: idx_uniq_name
      key_len: 32
          ref: const
         rows: 10
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

Result:

type = ref
key = idx_uniq_name
Index is used correctly

Step 7: Comparison with MySQL 8.0 (expected behaviour)

Repeat the same steps on MySQL 8.0.41 using identical schema and data.

Expected result:

mysql> EXPLAIN SELECT uniq_name, payload FROM t_prefix_in WHERE uniq_name IN (@value1, @value2)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t_prefix_in
   partitions: NULL
         type: range
possible_keys: idx_uniq_name
          key: idx_uniq_name
      key_len: 32
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

type = range
key = idx_uniq_name

Efficient indexed lookup for IN (...)

Suggested fix:
The optimizer should consider prefix indexes for IN (...) predicates when:

the indexed prefix length fully covers the compared constant values, and

the same index is already chosen for equivalent = predicates.

This behavior works correctly in MySQL 8.0.x but appears to regress in MySQL 8.4.x, where the optimizer consistently chooses a full table scan for IN (...) even when FORCE INDEX is specified.

As a temporary workaround, using full-length indexes (instead of prefix indexes) restores correct index usage in MySQL 8.4.

mysql> alter table t_prefix_in drop index idx_uniq_name, add index idx_uniq_name(uniq_name);
Query OK, 0 rows affected (0.76 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> EXPLAIN
SELECT uniq_name, payload
FROM t_prefix_in
WHERE uniq_name IN (@value1, @value2);
+----+-------------+-------------+------------+-------+---------------+---------------+---------+------+------+----------+-----------------------+
| id | select_type | table       | partitions | type  | possible_keys | key           | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------------+------------+-------+---------------+---------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | t_prefix_in | NULL       | range | idx_uniq_name | idx_uniq_name | 194     | NULL |    2 |   100.00 | Using index condition |
+----+-------------+-------------+------------+-------+---------------+---------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.03 sec)

mysql> EXPLAIN SELECT uniq_name, payload FROM t_prefix_in WHERE uniq_name IN (@value1, @value2)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t_prefix_in
   partitions: NULL
         type: range
possible_keys: idx_uniq_name
          key: idx_uniq_name
      key_len: 194
          ref: NULL
         rows: 2
     filtered: 100.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)
[24 Jan 19:15] Aftab Khan
Additional findings (Optimizer Trace):

Optimizer trace shows that although the prefix index idx_uniq_name is considered usable, no range conditions are constructed for IN (...) predicates.

setup_range_conditions is empty

Both range access and skip scan are rejected (cause: disjuntive_predicate_present)

As a result, the optimizer only considers a full table scan, even when FORCE INDEX is specified

This behavior differs from MySQL 8.0, where the same schema and query generate a range access on the prefix index.

Commands used:

SET optimizer_trace="enabled=on";
EXPLAIN SELECT uniq_name, payload
FROM t_prefix_in
WHERE uniq_name IN (@value1, @value2)\G

SELECT * FROM information_schema.OPTIMIZER_TRACE\G
[25 Jan 12:24] Aftab Khan
I can reproduce the same behaviour on MySQL 8.4.8 (Homebrew) using a clean test case.

With a prefix index KEY idx_uniq_name (uniq_name(10)), queries using an IN (...) predicate do not use the index.

CREATE TABLE t_prefix_in (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  uniq_name VARCHAR(64) NOT NULL,
  payload VARCHAR(100),
  KEY idx_uniq_name (uniq_name(10))
) ENGINE=InnoDB;

INSERT INTO t_prefix_in (uniq_name, payload)
SELECT REPLACE(UUID(), '-', ''), 'dummy'
FROM information_schema.tables t1,
     information_schema.tables t2
LIMIT 100000;

ANALYZE TABLE t_prefix_in;

SELECT uniq_name INTO @value1 FROM t_prefix_in LIMIT 1;
SELECT uniq_name INTO @value2 FROM t_prefix_in LIMIT 1 OFFSET 1;

EXPLAIN
SELECT uniq_name, payload
FROM t_prefix_in
WHERE uniq_name IN (@value1, @value2);

Actual behavior:

- type = ALL
- Full table scan
- Prefix index is not used, even though it is listed as a possible key

Expected behaviour:

- type = range
- Prefix index should be used, as observed with the same schema and query in MySQL 8.0
[27 Jan 10:03] Knut Anders Hatlen
Thanks for the bug report.

This seems to be a variant of bug#118009. Bug#118009 uses a full index, not a prefix index, so they are not identical, but they have in common that the string literals in the IN list are longer than the indexed part of the column. Both bugs were introduced by the same bug fix in MySQL 8.4.0. Closing as duplicate.