Bug #116738 | The DDL performance of 8.0.40 has not been restored to 8.0.26 | ||
---|---|---|---|
Submitted: | 21 Nov 2024 4:09 | Modified: | 8 May 10:23 |
Reporter: | Xizhe Zhang (OCA) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: DDL | Severity: | S5 (Performance) |
Version: | 8.0.40 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[21 Nov 2024 4:09]
Xizhe Zhang
[29 Nov 2024 10:07]
MySQL Verification Team
Hello Xizhe Zhang, Thank you for the report and feedback. regards, Umesh
[7 May 7:38]
MySQL Verification Team
Hello Xizhe Zhang, During our internal verification, we observed discrepancy in our test results when compared with that of provided in the report i.e. From your report: Time taken in 8.0.26: 74sec Time taken in 8.0.27: 116sec (56% loss compared to 8.0.26) Time taken in 8.0.40: 130sec (75% loss compared to 8.0.26) What we are seeing internally: Time taken in 8.0.26: 121.49sec Time taken in 8.0.27: 162.91sec (34% loss compared to 8.0.26) Time taken in 8.0.40: 151.74sec (25% loss compared to 8.0.26) We would like to request you to share below details from your verified environment to investigate further: - If is using custom builds such as source build then need exact cmake options used for the build - OS details including Memory, CPU information and file system details on which MySQL Server files residing. - MySQL Server configuration file in use( my.cnf or my.ini or <custom>.cnf whichever is in use) - Any custom InnoDB settings, especially buffer pool size, parallel read threads, and DDL buffer size Thank you. regards, Umesh
[8 May 6:37]
Xizhe Zhang
Hello, Verification Team! The performance degradation ratio may be related to the quality of the CPU. I will get different ratios when running the test on a better machine too. I conducted a detailed analysis of this issue some time ago. The degradation ratio is not the key to this issue. The key is why there is such a significant degradation. By comparing the performance of 8.0.26 and 8.0.27, I found multiple problems that caused performance degradation. In order to avoid making this bug report too lengthy, I will make a summary here and provide analysis and fix patch for each independent problem. Let me first list a few key options: For 8.0.27: innodb_buffer_pool_size=64G # Enough to hold the entire table innodb_parallel_read_threads=1 # Single-threaded scan clustered index innodb_ddl_threads=1 # Single-threaded Sort and Build Indexes innodb_ddl_buffer_size=1M # Default Value For 8.0.26: innodb_buffer_pool_size=64G # Enough to hold the entire table innodb_sort_buffer_size=1M # Default Value I wrote a small tool based on Perf Probe to count the time and number of times a function is executed. The following is the statistical result of a DDL execution process: # 8.0.26: mean std min max count sum mysql_alter_table 77966049794.000000 nan 77966049794.000000 77966049794.000000 1.000000 77966049794.000000 row_merge_build_indexes 77944214368.000000 nan 77944214368.000000 77944214368.000000 1.000000 77944214368.000000 row_merge_read_clustered_index 42416340096.000000 nan 42416340096.000000 42416340096.000000 1.000000 42416340096.000000 row_merge_sort 24878481077.000000 nan 24878481077.000000 24878481077.000000 1.000000 24878481077.000000 row_merge_insert_index_tuples 7489639265.000000 nan 7489639265.000000 7489639265.000000 1.000000 7489639265.000000 row_merge_file_destroy 51454.250000 96429.121519 2851.000000 196097.000000 4.000000 205817.000000 row_merge_write_redo 5659.000000 3203.193719 3394.000000 7924.000000 2.000000 11318.000000 flush_observer 3158910412.000000 nan 3158910412.000000 3158910412.000000 1.000000 3158910412.000000 # 8.0.27: mean std min max count sum mysql_alter_table 121428803147.000000 nan 121428803147.000000 121428803147.000000 1.000000 121428803147.000000 scan_and_build_indexes 121404012955.000000 nan 121404012955.000000 121404012955.000000 1.000000 121404012955.000000 ddl_scan 51680853370.000000 nan 51680853370.000000 51680853370.000000 1.000000 51680853370.000000 ddl_load 69723129061.000000 nan 69723129061.000000 69723129061.000000 1.000000 69723129061.000000 create_merge_sort_tasks 9888.000000 nan 9888.000000 9888.000000 1.000000 9888.000000 merge_sort 51910233630.000000 nan 51910233630.000000 51910233630.000000 1.000000 51910233630.000000 btree_build 11547696552.000000 nan 11547696552.000000 11547696552.000000 1.000000 11547696552.000000 build_finish 3237135506.500000 29179669.957877 3216502364.000000 3257768649.000000 2.000000 6474271013.000000 The above results show that performance degradation occurred in multiple stages of the DDL execution process. I will provide analysis results in different bug reports and link to this bug report. +----------------+--------------------------------+----------------------------+ | | 8.0.26 | 8.0.27 | +----------------+--------------------------------+----------------------------+ | Scan PK | row_merge_read_clustered_index | ddl::Parallel_cursor::scan | | | 42.4s | 51.7s | +----------------+--------------------------------+----------------------------+ | Merge Sort SK | row_merge_sort | ddl::Builder::merge_sort | | | 24.9s | 52s | +----------------+--------------------------------+----------------------------+ | Build SK | row_merge_insert_index_tuples | ddl::Builder::btree_build | | | 7.5s | 11.5s | +----------------+--------------------------------+----------------------------+ | Flush Observer | FlushObserver::flush | ddl::Builder::finish | | | 3.16s | 6.4s | +----------------+--------------------------------+----------------------------+ | Overall | 78s | 121.4s | +----------------+--------------------------------+----------------------------+
[8 May 9:48]
MySQL Verification Team
Hello Xizhe Zhang, Thank you for the details, will share this with development and get back to you if anything further needed. regards, Umesh
[8 May 10:01]
MySQL Verification Team
Related - Bug #118130, Bug #118133
[8 May 10:23]
Xizhe Zhang
Hello, Verification Team, Thanks for your reply! I have created 3 new Bug Reports, which introduce my analysis process. I will briefly explain it here: -> Analysis for 'Merge Sort' (Bug #118130) -> Analysis for 'Scan PK' (Bug #118133) -> 'Build SK' will benefit from the solution of Problem 2 in Bug #118133. The remaining gap I think comes from the use of 'std::priority_queue' (mentioned in Bug #118130). Since the performance degradation caused by this part is relatively small (less than 1s), I did not submit another bug report. -> Analysis for 'Flush Observer' (Bug #118136)