Description:
InnoDB stats calculation has a optimization. It does not scan a B+tree level if:
* innodb_stats_persistent_sample_pages > pages on the level
This will lead to a bug where for a low cardinality index innodb will chose a bad level for scanning. for example it could choose root as long as pages on level 1 greater than sample pages. This happens because for a low cardinality it will never fin 10 * sample pages of diff keys.
When doing the calculation with a bad level like root the low cardinality index will have big over estimative.
OBS: in steps to repeat it will be shown as a very low number of sample pages, for simplicity. The same could be achieve with a much larger sample page number with enough records
How to repeat:
-- 1. Create the table `t1` with specified columns and indexes
CREATE TABLE IF NOT EXISTS t1 (
id INT AUTO_INCREMENT,
sec_key INT,
data VARCHAR(100),
PRIMARY KEY (id, data),
INDEX (sec_key) -- Secondary index on `sec_key`
) ENGINE=InnoDB;
-- 2. Create the stored procedure to insert 200,000 dummy records
DELIMITER $$
CREATE PROCEDURE insert_dummy_records()
BEGIN
DECLARE iteration INT DEFAULT 0;
DECLARE data_val VARCHAR(100);
SET data_val = REPEAT('k', 100);
START TRANSACTION;
WHILE iteration < 20000 DO
INSERT INTO t1 (sec_key, data) VALUES
(1, data_val),
(2, data_val),
(3, data_val),
(4, data_val),
(5, data_val),
(6, data_val),
(7, data_val),
(8, data_val),
(9, data_val),
(10, data_val);
SET iteration = iteration + 1;
END WHILE;
COMMIT;
END$$
DELIMITER ;
-- 3. Call the stored procedure to execute the insertions
CALL insert_dummy_records();
CALL insert_dummy_records();
SET GLOBAL innodb_stats_persistent_sample_pages=8;
ANALYZE TABLE t1;
SELECT * FROM mysql.innodb_index_stats WHERE table_name='t1';
SET GLOBAL innodb_stats_persistent_sample_pages=40;
ANALYZE TABLE t1;
SELECT * FROM mysql.innodb_index_stats WHERE table_name='t1';
Suggested fix:
Review the stats algorithm. Maybe it could pick up on a few clues that it is running in a very bad scenario and fallback to a simpler safe expression.