Bug #104083 MySQL optimizer chooses the wrong index due to 'range_uses_more_keyparts'
Submitted: 22 Jun 2021 8:09 Modified: 22 Jun 2021 13:08
Reporter: Jaime Sicam Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:8.0.25 OS:Any
Assigned to: CPU Architecture:Any

[22 Jun 2021 8:09] Jaime Sicam
Description:
Given this table:

CREATE TABLE test_table(a int not null auto_increment PRIMARY KEY, b binary(32), c datetime, d int, KEY i_bc(b,c), KEY i_bd(b,d));

And this query: SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00'

The optimizer chooses index i_bd instead of i_bc even if i_bc is faster:

mysql> EXPLAIN SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
+----+-------------+------------+------------+------+---------------+------+---------+-------+-------+----------+------------------------------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref   | rows  | filtered | Extra                              |
+----+-------------+------------+------------+------+---------------+------+---------+-------+-------+----------+------------------------------------+
|  1 | SIMPLE      | test_table | NULL       | ref  | i_bc,i_bd     | i_bd | 33      | const | 15198 |    33.33 | Using index condition; Using where |
+----+-------------+------------+------------+------+---------------+------+---------+-------+-------+----------+------------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> SET PROFILING=1;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> pager md5sum
PAGER set to 'md5sum'
mysql> SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
769350e28abd440843dc3784a9903ac6  -
1 row in set (0.04 sec)

mysql> SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
769350e28abd440843dc3784a9903ac6  -
1 row in set (0.03 sec)

mysql> SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
769350e28abd440843dc3784a9903ac6  -
1 row in set (0.05 sec)

mysql> SELECT COUNT(*) FROM test_table FORCE INDEX(i_bc) WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
769350e28abd440843dc3784a9903ac6  -
1 row in set (0.01 sec)

mysql> SELECT COUNT(*) FROM test_table FORCE INDEX(i_bc) WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
769350e28abd440843dc3784a9903ac6  -
1 row in set (0.01 sec)

mysql> SELECT COUNT(*) FROM test_table FORCE INDEX(i_bc) WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';
769350e28abd440843dc3784a9903ac6  -
1 row in set (0.00 sec)

mysql> pager
Default pager wasn't set, using stdout.
mysql> SHOW PROFILES;
+----------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                                                                                                       |
+----------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
|        1 | 0.03816100 | SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00'                  |
|        2 | 0.02243650 | SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00'                  |
|        3 | 0.04616250 | SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00'                  |
|        4 | 0.01307825 | SELECT COUNT(*) FROM test_table FORCE INDEX(i_bc) WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00' |
|        5 | 0.01710225 | SELECT COUNT(*) FROM test_table FORCE INDEX(i_bc) WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00' |
|        6 | 0.00338125 | SELECT COUNT(*) FROM test_table FORCE INDEX(i_bc) WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00' |
+----------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
6 rows in set, 1 warning (0.00 sec)

How to repeat:
Populate Table:
INSERT INTO test_table(b,c) VALUES(HEX(UUID_TO_BIN(UUID())),FROM_UNIXTIME(2000000000*RAND()));
INSERT INTO test_table(b,c) SELECT HEX(UUID_TO_BIN(UUID())), FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT HEX(UUID_TO_BIN(UUID())), FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT HEX(UUID_TO_BIN(UUID())), FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT HEX(UUID_TO_BIN(UUID())), FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;
INSERT INTO test_table(b,c) SELECT b, FROM_UNIXTIME(2000000000*RAND()) FROM test_table;

Get second record:
mysql> SELECT b FROM test_table LIMIT 1,1;
+--------------------------------------------------------------------+
| b                                                                  |
+--------------------------------------------------------------------+
| 0x3637443634424146443332443131454238364241343832414533303134333531 |
+--------------------------------------------------------------------+
1 row in set (0.00 sec)

Generate the query from result:
EXPLAIN SELECT COUNT(*) FROM test_table  WHERE b=x'3637443844434636443332443131454238364241343832414533303134333531' AND c < '2019-01-01 00:00:00';

A snippet of the optimizer trace looks like this:
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ],
                "table": "`test_table`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "access_type": "ref",
                      "index": "i_bc",
                      "chosen": false,
                      "cause": "range_uses_more_keyparts"
                    },
                    {
                      "access_type": "ref",
                      "index": "i_bd",
                      "rows": 15198,
                      "cost": 1928.55,
                      "chosen": true
                    },
                    {
                      "rows_to_scan": 10384,
                      "access_type": "range",
                      "range_details": {
                        "used_index": "i_bc"
                      },
                      "resulting_rows": 10384,
                      "cost": 2090.65,
                      "chosen": false
                    }
                  ]
                },
                "condition_filtering_pct": 100,
                "rows_for_plan": 15198,
                "cost_for_plan": 1928.55,
                "chosen": true
              }
            ]

With production data it takes minutes to execute compared to seconds when it chooses the wrong index. This is also not repeatable in 5.7.
[22 Jun 2021 13:08] MySQL Verification Team
Hello Jaime,

Thank you for the report and test case.
Verified as described.

regards,
Umesh
[22 Jun 2021 13:09] MySQL Verification Team
MySQL Server 8.0.25 test results

Attachment: 104083_8.0.25.results (application/octet-stream, text), 16.57 KiB.

[26 Jul 12:21] Boy Zhang
I think the reason for the wrong execution plan here is not 'range_uses_more_keyparts', because even if ref is not choosen, the cost of range scan for index i_bc will be better than ref scan of i_bd, and i_bc should be choosen in the end. The fundamental reason here is that when considering the cost of the filter effect, since the condition of the range scan fully covers the where condition, the final_filter_effect is 1, which causes the result of scan_total_cost = scan_read_cost + row_evaluate_cost(prefix_rowcount * rows_after_filtering) to be overestimated, resulting in the final cost becomes higher.
[22 Sep 3:56] Yoshinori Matsunobu
There are two problems here.

1. If "range_uses_more_keyparts" kicks in, "ref" plan for the same index is excluded from plan candidates.

MySQL optimizer tends to overestimate "range" plan cost, as discussed at #2, so there are certain chances that "range" cost is estimated higher than "ref" from other indexes. This was what happened in the bug report -- "range" from "i_bc" was estimated higher than "ref" from "i_bd". The biggest problem was "ref" from "i_bc" was excluded as a plan candidate because of the following logic.

https://github.com/mysql/mysql-server/blob/8.0/sql/sql_planner.cc#L585-L586

            if (!table_deps && table->quick_keys.is_set(key) &&  
                table->quick_key_parts[key] > cur_used_keyparts) 
            {
              trace_access_idx.add("chosen", false)
                  .add_alnum("cause", "range_uses_more_keyparts");
              continue;
            }

I believe the plan has to be re-evaluated, because like this case, "ref" for the same index can be more efficient. I think at least adding an optimizer_switch flag to change behavior (not to disable the plan here) would be helpful.

2. Range scan cost is estimated way higher than ref plan

There are multiple places that cause differences, but my understanding is notably:

- Ref plan caps io cost -- "worst seeks": It is either the number of total (estimated) rows of the table / 10, or 3x of the scan_time(), whichever is lower. This is defined in find_worst_seeks() and both numbers are hard coded. Range plan does not cap io cost like that.

- Range plan adds additional cost estimates at Optimize_table_order::best_access_path().
    const double scan_total_cost =
        scan_read_cost +
        cost_model->row_evaluate_cost(prefix_rowcount * rows_after_filtering);
 and scan_read_cost is
        prefix_rowcount * (tab->range_scan()->cost +
                           cost_model->row_evaluate_cost(
                               tab->found_records - *rows_after_filtering));

 So it adds about 10% of the tab->found_records on top off the range scan cost estimates.
 Note that range_scan()->cost already includes 10% of the total estimated rows, at handler::multi_range_read_info_const(). So it's double counting, which looks over estimating to me.

So #2 causes optimizer to pick range scan less likely, and due to #1, if range scan is not chosen, the index is not chosen as ref plan either.