Bug #117189 InnoDB index stats many orders of magnitude wrong
Submitted: 13 Jan 8:26 Modified: 13 Jan 8:38
Reporter: Matheus Aires Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0.39, 8.0.40 OS:Any
Assigned to: CPU Architecture:Any
Tags: Index stats, innodb

[13 Jan 8:26] Matheus Aires
Description:
InnoDB stats calculation has a optimization. It does not scan a B+tree level if:

 * innodb_stats_persistent_sample_pages > pages on the level

This will lead to a bug where for a low cardinality index innodb will chose a bad level for scanning. for example it could choose root as long as pages on level 1 greater than sample pages. This happens because for a low cardinality it will never fin 10 * sample pages of diff keys.

When doing the calculation with a bad level like root the low cardinality index will have big over estimative.

OBS: in steps to repeat it will be shown as a very low number of sample pages, for simplicity. The same could be achieve with a much larger sample page number with enough records

How to repeat:
-- 1. Create the table `t1` with specified columns and indexes
CREATE TABLE IF NOT EXISTS t1 (
    id INT AUTO_INCREMENT,
    sec_key INT,
    data VARCHAR(100),
    PRIMARY KEY (id, data),
    INDEX (sec_key) -- Secondary index on `sec_key`
) ENGINE=InnoDB;

-- 2. Create the stored procedure to insert 200,000 dummy records
DELIMITER $$

CREATE PROCEDURE insert_dummy_records()
BEGIN
    DECLARE iteration INT DEFAULT 0;
    DECLARE data_val VARCHAR(100);

    SET data_val = REPEAT('k', 100);

    START TRANSACTION;

    WHILE iteration < 20000 DO
        INSERT INTO t1 (sec_key, data) VALUES
            (1, data_val),
            (2, data_val),
            (3, data_val),
            (4, data_val),
            (5, data_val),
            (6, data_val),
            (7, data_val),
            (8, data_val),
            (9, data_val),
            (10, data_val);

        SET iteration = iteration + 1;
    END WHILE;

    COMMIT;
END$$

DELIMITER ;

-- 3. Call the stored procedure to execute the insertions
CALL insert_dummy_records();
CALL insert_dummy_records();

SET GLOBAL innodb_stats_persistent_sample_pages=8;

ANALYZE TABLE t1;

SELECT * FROM mysql.innodb_index_stats WHERE table_name='t1';

SET GLOBAL innodb_stats_persistent_sample_pages=40;

ANALYZE TABLE t1;

SELECT * FROM mysql.innodb_index_stats WHERE table_name='t1';

Suggested fix:
Review the stats algorithm. Maybe it could pick up on a few clues that it is running in a very bad scenario and fallback to a simpler safe expression.
[13 Jan 8:38] MySQL Verification Team
Hello Matheus Aires,

Thank you for the report and test case.

regards,
Umesh