Bug #116738 The DDL performance of 8.0.40 has not been restored to 8.0.26
Submitted: 21 Nov 2024 4:09 Modified: 29 Nov 2024 10:07
Reporter: Xizhe Zhang (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DDL Severity:S5 (Performance)
Version:8.0.40 OS:Any
Assigned to: CPU Architecture:Any

[21 Nov 2024 4:09] Xizhe Zhang
Description:
Since 8.0.27 introduced "Parallel Threads for Online DDL", the performance of DDL has dropped significantly. In the Release Notes of 8.0.40, I saw a fix for this problem (Bug #114465), which was attributed to "secondary indexes required greater file I/O". 

After my test, I think this problem has not been completely solved, and the single-threaded DDL performance of 8.0.40(about 130s) is much worse than that of 8.0.26(about 74s). The performance of single-threaded DDL should not vary so much.

How to repeat:
-- 1. I constructed a table with 30 million rows (you can reduce the size for easy reproduction). 
-- Then I dumped the table and loaded it into 8.0.26, 8.0.27, 8.0.39, and 8.0.40 for testing.
CREATE TABLE employees (
    emp_no      INT             AUTO_INCREMENT NOT NULL,
    salary      INT             NOT NULL,
    birth_date  DATE            NOT NULL,
    first_name  VARCHAR(14)     CHARACTER SET utf8mb3 COLLATE utf8mb3_general_ci NOT NULL,
    last_name   VARCHAR(16)     CHARACTER SET utf8mb3 COLLATE utf8mb3_general_ci NOT NULL,
    gender      ENUM ('M','F')  NOT NULL,
    hire_date   DATE            NOT NULL,
    PRIMARY KEY (emp_no),
    KEY idx_salary (salary)
) ENGINE=InnoDB;

DROP PROCEDURE IF EXISTS foo;
DELIMITER |
CREATE PROCEDURE foo(num int)
BEGIN
    DECLARE i int;
    SET i=1;
    WHILE(i<=num)DO
        INSERT INTO `employees` VALUES (
            DEFAULT,
            ROUND((RAND() * (100000-1000)) + 1000),
            FROM_UNIXTIME(UNIX_TIMESTAMP('1949-10-1') + FLOOR(RAND() * (UNIX_TIMESTAMP('1999-10-1') - UNIX_TIMESTAMP('1949-10-1')))),
            LEFT(MD5(RAND()), 8),
            LEFT(MD5(RAND()), 7),
            'M',
            FROM_UNIXTIME(UNIX_TIMESTAMP('1998-10-1') + FLOOR(RAND() * (UNIX_TIMESTAMP('2023-10-1') - UNIX_TIMESTAMP('1998-10-1'))))
        );
        SET i=i+1;
    END WHILE;
END|
DELIMITER ;

CALL foo(30000000);

UPDATE employees SET gender='F' WHERE salary%2 = 0;

-- 2. Test multiple times on 8.0.26, with innodb_sort_buffer_size=1048576
-- Executes about 74s
optimize table employees;
2 rows in set (1 min 14.27 sec)

-- 3. Test multiple times on 8.0.27, with innodb_ddl_threads=1 and innodb_ddl_buffer_size=1048576
-- Executes about 116s
optimize table employees;
2 rows in set (1 min 56.14 sec)

-- 4. Test multiple times on 8.0.39, with innodb_ddl_threads=1 and innodb_ddl_buffer_size=1048576
-- Executes about 138s
optimize table employees;
2 rows in set (2 min 17.87 sec)

-- 5. Test multiple times on 8.0.40, with innodb_ddl_threads=1 and innodb_ddl_buffer_size=1048576
-- Executes about 130s
optimize table employees;
2 rows in set (2 min 10.53 sec)

Suggested fix:
I'm still investigating the cause, and I think the goal should be to get the performance of single-threaded DDL back to 8.0.26.
[29 Nov 2024 10:07] MySQL Verification Team
Hello Xizhe Zhang,

Thank you for the report and feedback.

regards,
Umesh