Bug #104083 MySQL optimizer chooses the wrong index due to 'range_uses_more_keyparts'
Submitted: 22 Jun 8:09 Modified: 22 Jun 13:08
Reporter: Jaime Sicam Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:8.0.25 OS:Any
Assigned to: CPU Architecture:Any

[22 Jun 8:09] Jaime Sicam
Description:
Given this table:

CREATE TABLE test_table(a int not null auto_increment PRIMARY KEY, b binary(32), c datetime, d int, KEY i_bc(b,c), KEY i_bd(b,d));

And this query: SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00'

The optimizer chooses index i_bd instead of i_bc even if i_bc is faster:

mysql> EXPLAIN SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
+----+-------------+------------+------------+------+---------------+------+---------+-------+-------+----------+------------------------------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref   | rows  | filtered | Extra                              |
+----+-------------+------------+------------+------+---------------+------+---------+-------+-------+----------+------------------------------------+
|  1 | SIMPLE      | test_table | NULL       | ref  | i_bc,i_bd     | i_bd | 33      | const | 15198 |    33.33 | Using index condition; Using where |
+----+-------------+------------+------------+------+---------------+------+---------+-------+-------+----------+------------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> SET PROFILING=1;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> pager md5sum
PAGER set to 'md5sum'
mysql> SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
769350e28abd440843dc3784a9903ac6  -
1 row in set (0.04 sec)

mysql> SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
769350e28abd440843dc3784a9903ac6  -
1 row in set (0.03 sec)

mysql> SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
769350e28abd440843dc3784a9903ac6  -
1 row in set (0.05 sec)

mysql> SELECT COUNT(*) FROM test_table FORCE INDEX(i_bc) WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
769350e28abd440843dc3784a9903ac6  -
1 row in set (0.01 sec)

mysql> SELECT COUNT(*) FROM test_table FORCE INDEX(i_bc) WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
769350e28abd440843dc3784a9903ac6  -
1 row in set (0.01 sec)

mysql> SELECT COUNT(*) FROM test_table FORCE INDEX(i_bc) WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
769350e28abd440843dc3784a9903ac6  -
1 row in set (0.00 sec)

mysql> pager
Default pager wasn't set, using stdout.
mysql> SHOW PROFILES;
+----------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                                                                                                       |
+----------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
|        1 | 0.03816100 | SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00'                  |
|        2 | 0.02243650 | SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00'                  |
|        3 | 0.04616250 | SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00'                  |
|        4 | 0.01307825 | SELECT COUNT(*) FROM test_table FORCE INDEX(i_bc) WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00' |
|        5 | 0.01710225 | SELECT COUNT(*) FROM test_table FORCE INDEX(i_bc) WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00' |
|        6 | 0.00338125 | SELECT COUNT(*) FROM test_table FORCE INDEX(i_bc) WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00' |
+----------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
6 rows in set, 1 warning (0.00 sec)

How to repeat:
Populate Table:
INSERT INTO test_table(b,c) VALUES(HEX(UUID_TO_BIN(UUID())),FROM_UNIXTIME(2000000000*RAND()));
INSERT INTO test_table(b,c) SELECT HEX(UUID_TO_BIN(UUID())), FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT HEX(UUID_TO_BIN(UUID())), FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT HEX(UUID_TO_BIN(UUID())), FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT HEX(UUID_TO_BIN(UUID())), FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;

Get second record:
mysql> SELECT b FROM test_table LIMIT 1,1;
+--------------------------------------------------------------------+
| b                                                                  |
+--------------------------------------------------------------------+
| 0x3637443634424146443332443131454238364241343832414533303134333531 |
+--------------------------------------------------------------------+
1 row in set (0.00 sec)

Generate the query from result:
EXPLAIN SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';

A snippet of the optimizer trace looks like this:
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ],
                "table": "`test_table`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "access_type": "ref",
                      "index": "i_bc",
                      "chosen": false,
                      "cause": "range_uses_more_keyparts"
                    },
                    {
                      "access_type": "ref",
                      "index": "i_bd",
                      "rows": 15198,
                      "cost": 1928.55,
                      "chosen": true
                    },
                    {
                      "rows_to_scan": 10384,
                      "access_type": "range",
                      "range_details": {
                        "used_index": "i_bc"
                      },
                      "resulting_rows": 10384,
                      "cost": 2090.65,
                      "chosen": false
                    }
                  ]
                },
                "condition_filtering_pct": 100,
                "rows_for_plan": 15198,
                "cost_for_plan": 1928.55,
                "chosen": true
              }
            ]

With production data it takes minutes to execute compared to seconds when it chooses the wrong index. This is also not repeatable in 5.7.
[22 Jun 13:08] MySQL Verification Team
Hello Jaime,

Thank you for the report and test case.
Verified as described.

regards,
Umesh
[22 Jun 13:09] MySQL Verification Team
MySQL Server 8.0.25 test results

Attachment: 104083_8.0.25.results (application/octet-stream, text), 16.57 KiB.