| Bug #96958 | MySQL optimizer wrongly chooses to use a bigger and inefficient index for query | ||
|---|---|---|---|
| Submitted: | 20 Sep 2019 21:42 | Modified: | 22 Sep 2019 14:23 |
| Reporter: | Leonardo Fernandes | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
| Version: | 8.0 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[20 Sep 2019 21:44]
Leonardo Fernandes
optimizer_trace
Attachment: optimizer_trace (application/octet-stream, text), 15.58 KiB.
[21 Sep 2019 17:18]
MySQL Verification Team
Attaching the optimizer trace output and a dump of the table. Trace Ok, dump ?.
[22 Sep 2019 1:26]
MySQL Verification Team
d:\dbs>d:\dbs\8.0\bin\mysql -uroot --port=3580 -p --prompt="mysql 8.0 > " --default-character-set=utf8mb4 Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 36 Server version: 8.0.19 Source distribution BUILT: 2019-SEP-20 Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql 8.0 > USE test Database changed mysql 8.0 > select count(*) from test; +----------+ | count(*) | +----------+ | 2000000 | +----------+ 1 row in set (1.21 sec) mysql 8.0 > explain SELECT col1 FROM test WHERE col1 = 50 AND id > 666666 ORDER BY id ASC LIMIT 1000; +----+-------------+-------+------------+------+------------------------------------------------------------------------------------------------+-------------------------+---------+-------+-------+----------+------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+------------------------------------------------------------------------------------------------+-------------------------+---------+-------+-------+----------+------------------------------------------+ | 1 | SIMPLE | test | NULL | ref | PRIMARY,index_test_on_col2_col1,test_extended_index,index_test_on_col1,index_test_on_col1_col2 | index_test_on_col1_col2 | 4 | const | 68608 | 50.00 | Using where; Using index; Using filesort | +----+-------------+-------+------------+------+------------------------------------------------------------------------------------------------+-------------------------+---------+-------+-------+----------+------------------------------------------+ 1 row in set, 1 warning (0.02 sec) mysql 8.0 > explain SELECT col1 FROM test USE INDEX (index_test_on_col1) WHERE col1 = 50 AND id > 666666 ORDER BY id ASC LIMIT 1000; +----+-------------+-------+------------+-------+--------------------+--------------------+---------+------+-------+----------+--------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+--------------------+--------------------+---------+------+-------+----------+--------------------------+ | 1 | SIMPLE | test | NULL | range | index_test_on_col1 | index_test_on_col1 | 8 | NULL | 35028 | 100.00 | Using where; Using index | +----+-------------+-------+------------+-------+--------------------+--------------------+---------+------+-------+----------+--------------------------+ 1 row in set, 1 warning (0.50 sec)
[22 Sep 2019 3:21]
MySQL Verification Team
Which 5.7 version you tested?; d:\dbs>d:\dbs\5.7\bin\mysql -uroot --port=3570 -p --prompt="mysql 5.7 > " Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 6 Server version: 5.7.29-log Source distribution BUILT: 2019-SEP-20 Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql 5.7 > use test Database changed mysql 5.7 > explain SELECT col1 FROM test WHERE col1 = 50 AND id > 666666 ORDER BY id ASC LIMIT 1000; +----+-------------+-------+------------+-------+------------------------------------------------------------------------+--------------------+---------+------+-------+----------+--------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+------------------------------------------------------------------------+--------------------+---------+------+-------+----------+--------------------------+ | 1 | SIMPLE | test | NULL | range | PRIMARY,test_extended_index,index_test_on_col1,index_test_on_col1_col2 | index_test_on_col1 | 8 | NULL | 35028 | 100.00 | Using where; Using index | +----+-------------+-------+------------+-------+------------------------------------------------------------------------+--------------------+---------+------+-------+----------+--------------------------+ 1 row in set, 1 warning (0.01 sec) mysql 5.7 >
[22 Sep 2019 14:23]
MySQL Verification Team
Thank you for the bug report. With most recent source only repeatable with version 8.0, so removing 5.7 from version reported.
[30 Apr 2024 9:09]
WANG GUANGYOU
Do we must insert 2 million rows to reproduce this?
Even one rows, it should use the index_test_on_col1. However, I populate 22000 rows into the table. It can not use the best index even in 5.7.26-29-log.
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 22011
Server version: 5.7.26-29-log Source distribution
Copyright (c) 2009-2023 Percona LLC and/or its affiliates
Copyright (c) 2000, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> explain SELECT col1 FROM test WHERE col1 = 50 AND id > 10 ORDER BY id ASC LIMIT 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: test
partitions: NULL
type: ref
possible_keys: PRIMARY,test_extended_index,index_test_on_col1,index_test_on_col1_col2
key: index_test_on_col1_col2
key_len: 4
ref: const
rows: 20
filtered: 50.00
Extra: Using where; Using index; Using filesort
1 row in set, 1 warning (0.00 sec)
mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
| 22000 |
+----------+
1 row in set (0.00 sec)

Description: MySQL optimizer wrongly chooses to use a larger index for query even when a smaller and more efficient is available. Table: master [localhost:26022] {msandbox} (test) > show create table test \G *************************** 1. row *************************** Table: test Create Table: CREATE TABLE `test` ( `id` int(11) NOT NULL AUTO_INCREMENT, `col1` int(11) NOT NULL, `col2` bigint(20) NOT NULL, `created_at` timestamp NOT NULL, PRIMARY KEY (`id`), KEY `index_test_on_col2` (`col2`), KEY `index_test_on_created_at` (`created_at`), KEY `index_test_on_col2_col1` (`col2`,`col1`), KEY `test_extended_index` (`col1`,`id`), KEY `index_test_on_col1` (`col1`), KEY `index_test_on_col1_col2` (`col1`,`col2`) ) ENGINE=InnoDB AUTO_INCREMENT=2000001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci STATS_SAMPLE_PAGES=65535 1 row in set (0.00 sec) Inserted 2 million random rows: master [localhost:26022] {msandbox} (test) > select count(*) from test; +----------+ | count(*) | +----------+ | 2000000 | +----------+ 1 row in set (0.03 sec) Query: SELECT col1 FROM test WHERE col1 = 50 AND id > 666666 ORDER BY id ASC LIMIT 1000; Explain: master [localhost:26022] {msandbox} (test) > explain SELECT col1 FROM test WHERE col1 = 50 AND id > 666666 ORDER BY id ASC LIMIT 1000; +----+-------------+-------+------------+------+------------------------------------------------------------------------------------------------+-------------------------+---------+-------+-------+----------+------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+------------------------------------------------------------------------------------------------+-------------------------+---------+-------+-------+----------+------------------------------------------+ | 1 | SIMPLE | test | NULL | ref | PRIMARY,index_test_on_col2_col1,test_extended_index,index_test_on_col1,index_test_on_col1_col2 | index_test_on_col1_col2 | 4 | const | 77572 | 50.00 | Using where; Using index; Using filesort | +----+-------------+-------+------------+------+------------------------------------------------------------------------------------------------+-------------------------+---------+-------+-------+----------+------------------------------------------+ 1 row in set, 1 warning (0.00 sec) Notice that the optimizer chooses index_test_on_col1_col2 instead of index_test_on_col1, even if col2 is not used by the query and only the col1 part of the index is used (key_len=4). When forcing the index index_test_on_col1, the plan changes from type=ref to type=range, it doesn't use the filesort anymore, the query is around 10x faster, and both the field col1 and id from the index are used (key_len=8 vs key_len=4 when using index_test_on_col1_col2). master [localhost:26022] {msandbox} (test) > explain SELECT col1 FROM test USE INDEX (index_test_on_col1) WHERE col1 = 50 AND id > 666666 ORDER BY id ASC LIMIT 1000; +----+-------------+-------+------------+-------+--------------------+--------------------+---------+------+-------+----------+--------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+--------------------+--------------------+---------+------+-------+----------+--------------------------+ | 1 | SIMPLE | test | NULL | range | index_test_on_col1 | index_test_on_col1 | 8 | NULL | 51126 | 100.00 | Using where; Using index | +----+-------------+-------+------------+-------+--------------------+--------------------+---------+------+-------+----------+--------------------------+ 1 row in set, 1 warning (0.00 sec) When looking at the optimizer for the query without the hint, we can see: Evaluating the index index_test_on_col1 for range: "analyzing_range_alternatives": { "range_scan_alternatives": [ [...] { "index": "index_test_on_col1", "ranges": [ "50 <= col1 <= 50 AND 666666 < id" ], "index_dives_for_eq_ranges": true, "rowid_ordered": true, "using_mrr": false, "index_only": true, "rows": 51126, "cost": 5163.5, "chosen": true } We can see that index_test_on_col1 is chosen and it costs 5163.5. Now, when evaluating the indexes for access_type=ref, the optimizer discards index_test_on_col1 because it considers the range access better, as it uses more keyparts ("range_uses_more_keyparts"), but it evaluates index_test_on_col1_col2 and chooses it, even though the cost is bigger (7909.4). { "access_type": "ref", "index": "index_test_on_col1", "chosen": false, "cause": "range_uses_more_keyparts" }, { "access_type": "ref", "index": "index_test_on_col1_col2", "rows": 77572, "cost": 7909.4, "chosen": true } Attaching the optimizer trace output and a dump of the table. How to repeat: Create the table: CREATE TABLE `test` ( `id` int(11) NOT NULL AUTO_INCREMENT, `col1` int(11) NOT NULL, `col2` bigint(20) NOT NULL, `created_at` timestamp NOT NULL, PRIMARY KEY (`id`), KEY `index_test_on_col2` (`col2`), KEY `index_test_on_created_at` (`created_at`), KEY `index_test_on_col2_col1` (`col2`,`col1`), KEY `test_extended_index` (`col1`,`id`), KEY `index_test_on_col1` (`col1`), KEY `index_test_on_col1_col2` (`col1`,`col2`) ) ENGINE=InnoDB Import the dump attached. Run the query: explain SELECT col1 FROM test WHERE col1 = 50 AND id > 666666 ORDER BY id ASC LIMIT 1000; compare with: explain SELECT col1 FROM test USE INDEX (index_test_on_col1) WHERE col1 = 50 AND id > 666666 ORDER BY id ASC LIMIT 1000;