Bug #118130 8.0.27 DDL performance degradation - Merge Sort
Submitted: 8 May 6:53 Modified: 8 May 9:59
Reporter: Xizhe Zhang (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DDL Severity:S5 (Performance)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: Contribution

[8 May 6:53] Xizhe Zhang
Description:
In Bug #116738, we can see that the DDL Merge Sort performance has dropped dramatically from 8.0.26 to 8.0.27. This is due to two problems, which will be introduced below.

How to repeat:
See Bug #116738
[8 May 7:11] Xizhe Zhang
Problem 1

First, I counted the execution of the functions involved in Merge Sort, as follows.

8.0.26:
                                    mean              std                min                max       count                sum
row_merge_sort        24554152355.000000              nan 24554152355.000000 24554152355.000000    1.000000 24554152355.000000
row_merge_blocks         95226574.070039 273924986.619489    20217757.000000  2640968785.000000  257.000000 24473229536.000000
row_merge_blocks_copy    11334450.000000    700006.703376    10337125.000000    12571828.000000    7.000000    79341150.000000
row_merge_read            1408105.183721   1112539.814653      447231.000000     7680906.000000 2580.000000  3632911374.000000
row_merge_write           1852294.642248    608159.797740     1128682.000000     8147844.000000 2580.000000  4778920177.000000

8.0.27:
                                mean              std                min                max       count                sum
merge_sort        52048373099.000000              nan 52048373099.000000 52048373099.000000    1.000000 52048373099.000000
merge_file_sort   52048353103.000000              nan 52048353103.000000 52048353103.000000    1.000000 52048353103.000000
merge_ranges       4337162140.083333 560345568.487641  3893581215.000000  5875358550.000000   12.000000 52045945681.000000
merge_rows           18277410.654639 123709954.153901     1742197.000000  3935598989.000000 2522.000000 46095629671.000000
sort_output_flush      482123.342982    203069.568053      167700.000000     2411052.000000 2522.000000  1215915071.000000
cursor_seek           1856282.398493    836687.419125        3582.000000     8016953.000000 2522.000000  4681544209.000000

Shockingly, the 'ddl::Merge_file_sort::merge_rows' function in 8.0.27 was executed 2522 times! The corresponding 'row_merge_blocks' function in 8.0.26 was only executed 257 times.
After analyzing the code, it was found that from 8.0.27, 'innodb_ddl_buffer_size' was used to control the memory usage during the index building phase. The default value is 1M. When single-threaded, 1/3M can be used for each two-way merge. 8.0.26 uses 'innodb_sort_buffer_size' to control the memory usage during the index building phase. The default value is 1M. 1M can be used for each two-way merge.
Therefore, setting 'innodb_ddl_buffer_size' to 3M is aligned with 8.0.26. After further testing, the merge sorting time is reduced to 41.1s, and the number of executions (949 times) is still different from 8.0.26:
                                mean              std                min                max      count                sum
merge_sort        41156292078.000000              nan 41156292078.000000 41156292078.000000   1.000000 41156292078.000000
merge_file_sort   41156274886.000000              nan 41156274886.000000 41156274886.000000   1.000000 41156274886.000000
merge_ranges       4115168554.000000 617491025.414948  3659972194.000000  5682814030.000000  10.000000 41151685540.000000
merge_rows           38252526.298209 166485813.130711      342740.000000  3656018739.000000 949.000000 36301647457.000000
sort_output_flush      823596.953635    365384.952719       73590.000000     3669332.000000 949.000000   781593509.000000
cursor_seek           4233328.063224   1725303.021257        3401.000000    32353698.000000 949.000000  4017428332.000000

After further analysis, it was found that 8.0.27 generated more blocks (258 vs 945) during the scanning PK phase:
8.0.26:
                                             mean            std                min                max      count                sum
row_merge_read_clustered_index 42920960786.000000            nan 42920960786.000000 42920960786.000000   1.000000 42920960786.000000
row_merge_buf_write                7914613.356589 2751710.996052     3265175.000000    27974125.000000 258.000000  2041970246.000000

8.0.27:
                                   mean           std                min                max      count                sum
ddl_scan             53150428753.000000           nan 53150428753.000000 53150428753.000000   1.000000 53150428753.000000
key_buffer_serialize     2901898.973545 950308.589808      642938.000000    11855355.000000 945.000000  2742294530.000000

Debugging found:
-> The maximum buffer size of 8.0.26 is 131072 records, but the actual buffer size is 116508 records, which is almost full (1048572B/1048576B)
-> The maximum buffer size of 8.0.27 is 131072 records, but the actual buffer size is 31775 records, which is not full (285975B/1048576B)

The reason for this result is that 8.0.27 takes into account the memory occupied by the 'dfield_t' object used in each row. 'sizeof(dfield_t)' is 24 B, plus the Record itself 8+1=9 B, each Record is calculated as 33B, so the number of Records is 1048576 / 33 = 31775.

'''
  bool will_fit(size_t n) const noexcept {
    return m_total_size + n +
               (sizeof(std::remove_pointer<decltype(
                           m_dtuples)::value_type>::type) *
                (m_n_tuples + 1)) <=
           m_buffer_size - 1;
  }
'''

Then I modifid the 'will_fit' function so that it no longer considers the memory occupied by the 'dfield_t' object. Test again and I find that only 258 blocks are generated in the scanning phase, which is consistent with 8.0.26. The time for Merge Sort has also dropped to 34.2s.
[8 May 7:22] Xizhe Zhang
Hello, Verification Team, this is my patch based on 8.0.42 for Problem 1.

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: ddl_merge_sort_problem_1.diff (application/octet-stream, text), 2.05 KiB.

[8 May 7:34] Xizhe Zhang
Problem 2

After fixing Problem 1, there is still a gap between 8.0.27 and 8.0.26 (34.2s vs 24.4s). After multiple time analysis of different code segments, it was found that the slowness of 8.0.27 lies in the merge sort itself. Although only two-way merge is required, 8.0.27 introduces 'std::priority_queue' as the minimum heap in 'Merge_cursor' to complete the merge, which brings unnecessary overhead (queue entry, queue exit, etc.). Multi-way merge can only occur when multiple threads are used in the Scan PK phase, then the multi-way merge is required when Build the B+ tree later.
I changed the 'merge_rows' function of 8.0.27 to a two-way merge, and the execution time of 'ddl::Merge_file_sort::merge_rows' was reduced to 21.8s, and the total execution time of the Merge Sort phase was reduced to 22.8s, which is consistent with 8.0.26.
[8 May 8:11] Xizhe Zhang
Hello, this is my patch for Problem 2. I found that 8.0.42 made some changes to the 'File_reader', so it is based on 8.0.27.

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: ddl_merge_sort_problem_2.diff (application/octet-stream, text), 4.89 KiB.

[8 May 9:59] MySQL Verification Team
Hello Xizhe Zhang ,

Thank you for the report and contribution.

regards,
Umesh