Description:
rechecking_index_usage causes the materialized table to be generated using table scan in Materialize semijoin strategy.
How to repeat:
1. Create table
CREATE TABLE t1 (
id int PRIMARY KEY,
col1 int NOT NULL
);
CREATE TABLE t2 (
id int PRIMARY KEY,
col1 int NOT NULL,
col2 int,
KEY index_col1 (col1),
KEY index_col2 (col2)
);
2. Create procedure
DELIMITER //
CREATE PROCEDURE insert_data ()
BEGIN
DECLARE num int;
SET num = 1;
WHILE num <= 1000 DO
INSERT INTO t1
VALUES (num, num);
INSERT INTO t2
VALUES (num, num, num);
SET num = num + 1;
END WHILE;
END;//
DELIMITER ;
3. Call procedure to insert data
call insert_data();
4. Execute query
EXPLAIN SELECT count(*) FROM t1 WHERE t1.col1 not in (SELECT col1 FROM t2 force index(index_col2) WHERE t2.col2 in (1,2));
+----+--------------+-------------+------------+--------+---------------------+---------------------+---------+-------------+------+----------+-------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+-------------+------------+--------+---------------------+---------------------+---------+-------------+------+----------+-------------------------+
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | NULL |
| 1 | SIMPLE | <subquery2> | NULL | eq_ref | <auto_distinct_key> | <auto_distinct_key> | 5 | bug.t1.col1 | 1 | 100.00 | Using where; Not exists |
| 2 | MATERIALIZED | t2 | NULL | ALL | index_col2 | NULL | NULL | NULL | 1000 | 100.00 | Using where |
+----+--------------+-------------+------------+--------+---------------------+---------------------+---------+-------------+------+----------+-------------------------+
It would be better to use range scan on index_col2 of table t to materialize table <subquery2>, but table scan was used instead. If we check optimizer_trace, we will see that the access type of t2 is changed from range to table_scan due to rechecking_index_usage.
"rechecking_index_usage": {
"recheck_reason": "not_first_table",
"range_analysis": {
"table_scan": {
"rows": 1000,
"cost": 1.79769e+308
},
"potential_range_indexes": [
{
"index": "PRIMARY",
"usable": false,
"cause": "not_applicable"
},
{
"index": "index_col1",
"usable": false,
"cause": "not_applicable"
},
{
"index": "index_col2",
"usable": true,
"key_parts": [
"col2",
"id"
]
}
],
"setup_range_conditions": [
],
"range_scan_possible": false,
"cause": "condition_always_true",
"group_index_range": {
"chosen": false,
"cause": "not_single_table"
},
"skip_scan_range": {
"chosen": false,
"cause": "not_single_table"
}
}
}
And the reason is 'condition_always_true'.
Suggested fix:
The rechecking_index_usage should not be applied to the base table used to generate the materialized table.