Bug #114465 Regression in 8.0.27+ related to Parallel DDL IO Amplification
Submitted: 24 Mar 2024 6:32 Modified: 29 Aug 2024 22:44
Reporter: Marc Reilly Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: DDL Severity:S5 (Performance)
Version:8.0.36, 8.3.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: DDL, parallel create index, performance degradation, regression

[24 Mar 2024 6:32] Marc Reilly
Description:
Issue:
Based on testing, from mysql 8.0.27 onwards a table rebuild operation on a table which contains multiple secondary indexes can execute at least 4 times the IO to @@tmpdir (@@tmpdir/Innodb Merge Temp File) than that on previous versions. Initially I though this may be due to InnoDB DDL threads being set to 4 on MySQL 8027+, but even after setting to one, I still see the same behavior and IO amplification.

This can lead to a large increase in DDL execution times after upgrading from mysql versions 8026 and lower, to a higher version. In my example below a table rebuild went from 2 min 22 sec to 08 min 26 seconds on the same storage configuration. 

Here are some numbers I have on 8026,8027,8036 & 8300.

Setup
* EC2 r6id.4xlarge
* Two test variants
    * Test 1:
        * Datavolume on local NVME
        * temp on local NVME
    * Test 2:
        * Datavolume on local NVME
        * temp on EBS storage(GP3 EBS, 120 GiB)

Based on the results you can see 

* In general, comparing 8027+ to 8026:
    * we can see an increase of ~230% in reads (GiB) to @@tmpdir/Innodb Merge Temp File
    * we can see an increase of ~57% in writes (GiB) to @@tmpdir/Innodb Merge Temp File
* Test 1: With local NVME as @@tmpdir  there is a increase in execution time (~20%) from 8027, but it's much less notable due to high nvme throughput. IO amplification is still there.
* Test 2: For tests where teh smaller EBS volume is used as @@tmpdir we can see a large increase in execution time(~230%+) from 8027 onwards, where 8026 is using the same storage.
* I also attempted to set @@innodb_ddl_threads to 1 to see if it would revert to the single threaded 8026 behavior, but no luck. 

+--------------+------------+------------+-----------------+-----------------+-----------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS | Data size in GB | DATA FREE in GB | IndexSize in GB |
+--------------+------------+------------+-----------------+-----------------+-----------------+
| test         | test_table | 815,506    | 3.08            | 0.00            | 2.11            |
+--------------+------------+------------+-----------------+-----------------+-----------------+
1 row in set (0.00 sec)

+---------+--------------------+------------------------+--------------------------------+----------------------------------------------+--------------------------+--------------------------------------------+------------------------------------------------+---------------------------+-------------------------------------------------+----------------------------+
| Version | innodb ddl threads | innodb ddl buffer size | rebuild time local NVME(mm:ss) | Difference in execution time vs 8026 -  NVME | rebuild time EBS (mm:ss) | Difference in execution time vs 8026 - EBS | "@@tmpdir/Innodb Merge Temp File  read" (GiB)) | Read Difference from 8026 | "@@tmpdir/Innodb Merge Temp File  write" (GiB)) | Write Difference from 8026 |
+---------+--------------------+------------------------+--------------------------------+----------------------------------------------+--------------------------+--------------------------------------------+------------------------------------------------+---------------------------+-------------------------------------------------+----------------------------+
|    8026 | N/A                | N/A                    | 01:04.4                        | N/A                                          | 02:22.0                  | NA                                         |                                           14.6 | N/A                       |                                            14.6 | N/A                        |
|    8027 | 4                  | 1048576                | 01:19.1                        | 23%                                          | 08:26.0                  | 257%                                       |                                          45.93 | 215%                      |                                            22.9 | 57%                        |
|    8036 | 4                  | 1048576                | 01:20.0                        | 24%                                          | 07:45.0                  | 228%                                       |                                          49.15 | 237%                      |                                            22.9 | 57%                        |
|    8300 | 4                  | 1048576                | 01:18.0                        | 21%                                          | 07:45.0                  | 228%                                       |                                          49.15 | 237%                      |                                            22.9 | 57%                        |
+---------+--------------------+------------------------+--------------------------------+----------------------------------------------+--------------------------+--------------------------------------------+------------------------------------------------+---------------------------+-------------------------------------------------+----------------------------+

I also tested with innodb ddl threads=1, hoping this would fallback to pre-8027 behavior but no luck, you still see the same performance & IO amplification.

+---------+--------------------+------------------------+--------------------------------+----------------------------------------------+--------------------------+--------------------------------------------+------------------------------------------------+---------------------------+-------------------------------------------------+----------------------------+
| Version | innodb ddl threads | innodb ddl buffer size | rebuild time local NVME(mm:ss) | Difference in execution time vs 8026 -  NVME | rebuild time EBS (mm:ss) | Difference in execution time vs 8026 - EBS | "@@tmpdir/Innodb Merge Temp File  read" (GiB)) | Read Difference from 8026 | "@@tmpdir/Innodb Merge Temp File  write" (GiB)) | Write Difference from 8026 |
+---------+--------------------+------------------------+--------------------------------+----------------------------------------------+--------------------------+--------------------------------------------+------------------------------------------------+---------------------------+-------------------------------------------------+----------------------------+
|    8027 |                  1 |                1048576 | 02:04.0                        | 93%                                          | 08:43.0                  | 269%                                       |                                          45.93 | 215%                      |                                            22.9 | 57%                        |
|    8036 |                  1 |                1048576 | 02:12.9                        | 106%                                         | 08:08.0                  | 244%                                       |                                          49.16 | 237%                      |                                            22.9 | 57%                        |
|    8300 |                  1 |                1048576 | 02:11.8                        | 105%                                         | 08:07.0                  | 243%                                       |                                          49.15 | 237%                      |                                            22.9 | 57%                        |
+---------+--------------------+------------------------+--------------------------------+----------------------------------------------+--------------------------+--------------------------------------------+------------------------------------------------+---------------------------+-------------------------------------------------+----------------------------+

How to repeat:
For ease of repro I uploaded a smaller sample mysqldump dummy data. I also added multiple secondary indexes to the sample table to show the behavior. The IO increases should still be visible though, increase to 3 G data to see above numbers.
+--------------+------------+------------+-----------------+-----------------+-----------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS | Data size in GB | DATA FREE in GB | IndexSize in GB |
+--------------+------------+------------+-----------------+-----------------+-----------------+
| test_db      | test_table | 83,345     | 0.29            | 0.00            | 0.22            |
+--------------+------------+------------+-----------------+-----------------+-----------------+
1 row in set (0.00 sec)

Just load the mysqldump and run the following commands on 8026,8027,8036 & 8300:

truncate `performance_schema`.`file_summary_by_instance`;
alter table test_db.test_table engine innodb, algorithm=inplace;
select * from sys.io_global_by_file_by_bytes where count_read>0 or count_write>0;

Using the attached dump I seen these execution times and same % IO amplification on EC2 r6id.4xlarge where @@tmpdir was using GP3 EBS:
8026: 9.41 sec
8027: 43.82 sec
8036: 40.08 sec
8300: 40.08 sec

Suggested fix:
- Reduce/improve parallel ddl thread IO efficiency
- When DDL threads is set to 1, InnoDB should behave in the same way as versions < 8027. This is not the behavior I observed. 
- Also worth noting that when innodb_ddl_threads was changed to 1, and innodb_ddl_buffer_size was default(1048576) an increase in avg Read IO size was not observed compared to innodb_ddl_threads=4 & innodb_ddl_buffer_size=4. Based on documentation, my understanding is innodb_ddl_buffer_size per thread will be innodb_ddl_buffer_size/innodb_ddl_threads. I was thinking this would match what 8026 uses but that isn't the case. avg read and write size in sys.io_global_by_file_by_bytes remained the same in 8027 for `@@tmpdir/Innodb Merge Temp File`, regardless of innodb_ddl_threads=4 or innodb_ddl_threads=1.  This value should also be honored when DDL threads is set to 1 to mimic earlier behavior.
[24 Mar 2024 6:33] Marc Reilly
In verified bug report 114420 it was reported that there is a performance degradation for tables containing blob columns in MySQL 8. As part of that, a change was identified in 8.0.20 that may have led to this change in performance. After some further digging, this regression seems to have been introduced with the implementation of parallel DDL in mysql 8.0.27. Opening a separate bug report for ease of tracking.
[24 Mar 2024 6:37] Marc Reilly
mysqdump for repro

Attachment: dump_sql.zip (application/zip, text), 39.61 MiB.

[24 Mar 2024 6:44] Marc Reilly
In suggested fix
> Also worth noting that when innodb_ddl_threads was changed to 1, and innodb_ddl_buffer_size was default(1048576) an increase in avg Read IO size was not observed compared to innodb_ddl_threads=4 & innodb_ddl_buffer_size=4

Should be:
> Also worth noting that when innodb_ddl_threads was changed to 1, and innodb_ddl_buffer_size was default(1048576) an increase in avg Read IO size was not observed compared to innodb_ddl_threads=4 & innodb_ddl_buffer_size=1048576
[26 Mar 2024 7:29] MySQL Verification Team
Hello Marc Reilly,

Thank you for the report and test case.
Verified as described.

regards,
Umesh
[21 Jul 2024 8:47] Rahul Sisondia
Posted by developer:
 
Hello Marc from the other side, hope you doing great :) 

Thank you for the bug report and beautiful test case. 

Update: 

In this case excessive IOs happens because of overlapping reads by external merge sort during different passed while rebuilding the secondary indexes. That must not happen, we shall fix it.
[29 Aug 2024 3:53] Rahul Sisondia
Posted by developer:
 
Updates : 
The read leak during external merge sort will be fixed in upcoming release. It reduces the IO amplification a lot but not at par with 8.0.26. The reason for that is we also need to tune --innodb_ddl_threads and --innodb_ddl_buffer_size for apple to apple comparison with 8.0.26 .
Further improvements could be made through the subsequent patches in future.
[29 Aug 2024 4:30] Marc Reilly
Hey Rahul! Hope you’re well! Thanks for the updates.
[29 Aug 2024 22:44] Philip Olson
Posted by developer:
 
Fixed as of the upcoming MySQL Server 8.0.40, 8.4.3, and 9.1.0 releases, and here's the proposed changelog entry from the documentation team:

The table rebuild operation involving secondary indexes required about 4x
higher file I/O with the InnoDB temporary files when compared to MySQL
8.0.26, which degraded query performance.

Thank you Marc for the detailed bug report.