Description:
Issue:
Based on testing, from mysql 8.0.27 onwards a table rebuild operation on a table which contains multiple secondary indexes can execute at least 4 times the IO to @@tmpdir (@@tmpdir/Innodb Merge Temp File) than that on previous versions. Initially I though this may be due to InnoDB DDL threads being set to 4 on MySQL 8027+, but even after setting to one, I still see the same behavior and IO amplification.
This can lead to a large increase in DDL execution times after upgrading from mysql versions 8026 and lower, to a higher version. In my example below a table rebuild went from 2 min 22 sec to 08 min 26 seconds on the same storage configuration.
Here are some numbers I have on 8026,8027,8036 & 8300.
Setup
* EC2 r6id.4xlarge
* Two test variants
* Test 1:
* Datavolume on local NVME
* temp on local NVME
* Test 2:
* Datavolume on local NVME
* temp on EBS storage(GP3 EBS, 120 GiB)
Based on the results you can see
* In general, comparing 8027+ to 8026:
* we can see an increase of ~230% in reads (GiB) to @@tmpdir/Innodb Merge Temp File
* we can see an increase of ~57% in writes (GiB) to @@tmpdir/Innodb Merge Temp File
* Test 1: With local NVME as @@tmpdir there is a increase in execution time (~20%) from 8027, but it's much less notable due to high nvme throughput. IO amplification is still there.
* Test 2: For tests where teh smaller EBS volume is used as @@tmpdir we can see a large increase in execution time(~230%+) from 8027 onwards, where 8026 is using the same storage.
* I also attempted to set @@innodb_ddl_threads to 1 to see if it would revert to the single threaded 8026 behavior, but no luck.
+--------------+------------+------------+-----------------+-----------------+-----------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS | Data size in GB | DATA FREE in GB | IndexSize in GB |
+--------------+------------+------------+-----------------+-----------------+-----------------+
| test | test_table | 815,506 | 3.08 | 0.00 | 2.11 |
+--------------+------------+------------+-----------------+-----------------+-----------------+
1 row in set (0.00 sec)
+---------+--------------------+------------------------+--------------------------------+----------------------------------------------+--------------------------+--------------------------------------------+------------------------------------------------+---------------------------+-------------------------------------------------+----------------------------+
| Version | innodb ddl threads | innodb ddl buffer size | rebuild time local NVME(mm:ss) | Difference in execution time vs 8026 - NVME | rebuild time EBS (mm:ss) | Difference in execution time vs 8026 - EBS | "@@tmpdir/Innodb Merge Temp File read" (GiB)) | Read Difference from 8026 | "@@tmpdir/Innodb Merge Temp File write" (GiB)) | Write Difference from 8026 |
+---------+--------------------+------------------------+--------------------------------+----------------------------------------------+--------------------------+--------------------------------------------+------------------------------------------------+---------------------------+-------------------------------------------------+----------------------------+
| 8026 | N/A | N/A | 01:04.4 | N/A | 02:22.0 | NA | 14.6 | N/A | 14.6 | N/A |
| 8027 | 4 | 1048576 | 01:19.1 | 23% | 08:26.0 | 257% | 45.93 | 215% | 22.9 | 57% |
| 8036 | 4 | 1048576 | 01:20.0 | 24% | 07:45.0 | 228% | 49.15 | 237% | 22.9 | 57% |
| 8300 | 4 | 1048576 | 01:18.0 | 21% | 07:45.0 | 228% | 49.15 | 237% | 22.9 | 57% |
+---------+--------------------+------------------------+--------------------------------+----------------------------------------------+--------------------------+--------------------------------------------+------------------------------------------------+---------------------------+-------------------------------------------------+----------------------------+
I also tested with innodb ddl threads=1, hoping this would fallback to pre-8027 behavior but no luck, you still see the same performance & IO amplification.
+---------+--------------------+------------------------+--------------------------------+----------------------------------------------+--------------------------+--------------------------------------------+------------------------------------------------+---------------------------+-------------------------------------------------+----------------------------+
| Version | innodb ddl threads | innodb ddl buffer size | rebuild time local NVME(mm:ss) | Difference in execution time vs 8026 - NVME | rebuild time EBS (mm:ss) | Difference in execution time vs 8026 - EBS | "@@tmpdir/Innodb Merge Temp File read" (GiB)) | Read Difference from 8026 | "@@tmpdir/Innodb Merge Temp File write" (GiB)) | Write Difference from 8026 |
+---------+--------------------+------------------------+--------------------------------+----------------------------------------------+--------------------------+--------------------------------------------+------------------------------------------------+---------------------------+-------------------------------------------------+----------------------------+
| 8027 | 1 | 1048576 | 02:04.0 | 93% | 08:43.0 | 269% | 45.93 | 215% | 22.9 | 57% |
| 8036 | 1 | 1048576 | 02:12.9 | 106% | 08:08.0 | 244% | 49.16 | 237% | 22.9 | 57% |
| 8300 | 1 | 1048576 | 02:11.8 | 105% | 08:07.0 | 243% | 49.15 | 237% | 22.9 | 57% |
+---------+--------------------+------------------------+--------------------------------+----------------------------------------------+--------------------------+--------------------------------------------+------------------------------------------------+---------------------------+-------------------------------------------------+----------------------------+
How to repeat:
For ease of repro I uploaded a smaller sample mysqldump dummy data. I also added multiple secondary indexes to the sample table to show the behavior. The IO increases should still be visible though, increase to 3 G data to see above numbers.
+--------------+------------+------------+-----------------+-----------------+-----------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS | Data size in GB | DATA FREE in GB | IndexSize in GB |
+--------------+------------+------------+-----------------+-----------------+-----------------+
| test_db | test_table | 83,345 | 0.29 | 0.00 | 0.22 |
+--------------+------------+------------+-----------------+-----------------+-----------------+
1 row in set (0.00 sec)
Just load the mysqldump and run the following commands on 8026,8027,8036 & 8300:
truncate `performance_schema`.`file_summary_by_instance`;
alter table test_db.test_table engine innodb, algorithm=inplace;
select * from sys.io_global_by_file_by_bytes where count_read>0 or count_write>0;
Using the attached dump I seen these execution times and same % IO amplification on EC2 r6id.4xlarge where @@tmpdir was using GP3 EBS:
8026: 9.41 sec
8027: 43.82 sec
8036: 40.08 sec
8300: 40.08 sec
Suggested fix:
- Reduce/improve parallel ddl thread IO efficiency
- When DDL threads is set to 1, InnoDB should behave in the same way as versions < 8027. This is not the behavior I observed.
- Also worth noting that when innodb_ddl_threads was changed to 1, and innodb_ddl_buffer_size was default(1048576) an increase in avg Read IO size was not observed compared to innodb_ddl_threads=4 & innodb_ddl_buffer_size=4. Based on documentation, my understanding is innodb_ddl_buffer_size per thread will be innodb_ddl_buffer_size/innodb_ddl_threads. I was thinking this would match what 8026 uses but that isn't the case. avg read and write size in sys.io_global_by_file_by_bytes remained the same in 8027 for `@@tmpdir/Innodb Merge Temp File`, regardless of innodb_ddl_threads=4 or innodb_ddl_threads=1. This value should also be honored when DDL threads is set to 1 to mimic earlier behavior.