Description:
Apparently contrary to the documentation:
https://dev.mysql.com/doc/refman/8.0/en/multiple-column-indexes.html
In some conditions, the ICP optimization is able to take advantage of compound index parts that are not the left-most prefix. Yet this advantage is not well reflected in the EXPLAIN estimates, nor the optimizer trace.
Also, ICP documentation doesn't seem to explain the ability to utilize such index part:
https://dev.mysql.com/doc/refman/8.0/en/index-condition-pushdown-optimization.html
I cannot find any place to actually explain the huge difference in execution stats for the two index use cases. The only difference in the EXPLAIN plan is "Using where" in extra info the index without the second column, and "index_condition" info in JSON format.
How to repeat:
CREATE TABLE `v1` (
`id` bigint NOT NULL,
`id_col1` int NOT NULL,
`col2` varchar(255) CHARACTER SET utf8mb3 COLLATE utf8mb3_bin NOT NULL,
`col3` varchar(500) CHARACTER SET utf8mb3 COLLATE utf8mb3_bin DEFAULT NULL,
`col4` datetime NOT NULL,
`col5` tinyint(1) NOT NULL DEFAULT '0',
`col6` tinyint(1) NOT NULL DEFAULT '1',
`col7` varchar(20) CHARACTER SET utf8mb3 COLLATE utf8mb3_bin NOT NULL,
`col8` varchar(20) CHARACTER SET utf8mb3 COLLATE utf8mb3_bin DEFAULT NULL,
`col9` tinyint(1) NOT NULL DEFAULT '0',
`col10` datetime DEFAULT NULL,
`col11` datetime DEFAULT NULL,
`col12` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`,`id_col1`),
UNIQUE KEY `uk` (`id_col1`,`col2`,`col7`) USING BTREE,
KEY `id_col1` (`id_col1`),
KEY `composite1` (`id_col1`,`col7`,`col5`,`col6`,`col9`,`col4`),
KEY `col10` (`col10`),
KEY `col11` (`col11`),
KEY `composite2` (`id_col1`,`col12`),
KEY `composite3` (`id_col1`,`col7`,`col8`,`col5`,`col9`,`col4`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_bin;
-- Load data dump from #110111.
mysql > explain select id, col2, col4 from v1 FORCE KEY(composite1) where id_col1 in (1178451,1001264,1045769,580,1038583,196) and col4 > now() - interval 48 HOUR\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: v1
partitions: NULL
type: range
possible_keys: composite1
key: composite1
key_len: 4
ref: NULL
rows: 50720
filtered: 33.33
Extra: Using index condition
1 row in set, 1 warning (0.00 sec)
mysql > explain select id, col2, col4 from v1 FORCE KEY(composite2) where id_col1 in (1178451,1001264,1045769,580,1038583,196) and col4 > now() - interval 48 HOUR\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: v1
partitions: NULL
type: range
possible_keys: composite2
key: composite2
key_len: 4
ref: NULL
rows: 52952
filtered: 33.33
Extra: Using index condition; Using where
1 row in set, 1 warning (0.00 sec)
mysql > explain format=json select id, col2, col4 from v1 FORCE KEY(composite1) where id_col1 in (1178451,1001264,1045769,580,1038583,196) and col4 > now() - interval 48 HOUR\G
*************************** 1. row ***************************
EXPLAIN: {
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "60870.01"
},
"table": {
"table_name": "v1",
"access_type": "range",
"possible_keys": [
"composite1"
],
"key": "composite1",
"used_key_parts": [
"id_col1"
],
"key_length": "4",
"rows_examined_per_scan": 50720,
"rows_produced_per_join": 16904,
"filtered": "33.33",
"index_condition": "((`db1`.`v1`.`id_col1` in (1178451,1001264,1045769,580,1038583,196)) and (`db1`.`v1`.`col4` > <cache>((now() - interval 48 hour))))",
"cost_info": {
"read_cost": "55798.01",
"eval_cost": "1690.50",
"prefix_cost": "60870.01",
"data_read_per_join": "39M"
},
"used_columns": [
"id",
"id_col1",
"col2",
"col4"
]
}
}
}
1 row in set, 1 warning (0.00 sec)
mysql > explain format=json select id, col2, col4 from v1 FORCE KEY(composite2) where id_col1 in (1178451,1001264,1045769,580,1038583,196) and col4 > now() - interval 48 HOUR\G
*************************** 1. row ***************************
EXPLAIN: {
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "63548.41"
},
"table": {
"table_name": "v1",
"access_type": "range",
"possible_keys": [
"composite2"
],
"key": "composite2",
"used_key_parts": [
"id_col1"
],
"key_length": "4",
"rows_examined_per_scan": 52952,
"rows_produced_per_join": 17648,
"filtered": "33.33",
"index_condition": "(`db1`.`v1`.`id_col1` in (1178451,1001264,1045769,580,1038583,196))",
"cost_info": {
"read_cost": "58253.21",
"eval_cost": "1764.89",
"prefix_cost": "63548.41",
"data_read_per_join": "40M"
},
"used_columns": [
"id",
"id_col1",
"col2",
"col4"
],
"attached_condition": "(`db1`.`v1`.`col4` > <cache>((now() - interval 48 hour)))"
}
}
}
1 row in set, 1 warning (0.00 sec)
As seen above, the estimated rows and costs are very similar, and used_key_parts show only the first column used, while clearly in the composite1 index case, col4 is used as well.
Despite the above, execution stats are very different:
composite1:
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 2 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 6 |
| Handler_read_last | 0 |
| Handler_read_next | 50 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+-------+
composite2:
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 2 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 6 |
| Handler_read_last | 0 |
| Handler_read_next | 24937 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+-------+
Suggested fix:
Fix the visibility in the EXPLAIN, and perhaps improve the documentation as well.