| Bug #115555 | Severe performance regression in filesort | ||
|---|---|---|---|
| Submitted: | 10 Jul 2024 10:31 | Modified: | 2 Sep 2024 9:37 |
| Reporter: | Huaxiong Song (OCA) | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
| Version: | 8.4, 8.0.39 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[10 Jul 2024 11:26]
MySQL Verification Team
Hi Mr. Song, Thank you for your bug report. We have managed to repeat the behaviour. However, there are already dozens of verified bugs which proves the deterioration of the filesort performance between 5.7 and current supported versions. We can not give you here all the bugs that are verified for the performance regression, but we shall give you only one: https://bugs.mysql.com/bug.php?id=112597 Here, we are following rules that are set for the comparisons between 5.7 and supported versions. We have verified your report, but marked it as a duplicate. We shall try to compare the performance between older 8.0 and latest. Also, we shall compare the performance between latest 8.0 and 8.4.1 and 9.0.0. If these comparisons show serious performance regressions, this report will be verified. Duplicate. Duplicate.
[10 Jul 2024 11:34]
MySQL Verification Team
HI, These are the timings for 8.0.25: First test --------- real 0m7.238s user 0m0.008s sys 0m0.008s Second test ------------ real 0m3.390s user 0m0.009s sys 0m0.008s These are the timings for 8.0.38: First test --------- real 0m7.377s user 0m0.009s sys 0m0.009s Second test ------------ real 0m3.434s user 0m0.009s sys 0m0.008s There are no significant differences. Next, we shall do the timings for 8.4.1.
[10 Jul 2024 12:37]
MySQL Verification Team
Hi, We made three runs with 8.4.1 ..... here are the results for the first SELECT ...... real 0m21.338s user 0m0.010s sys 0m0.009s real 0m17.635s user 0m0.009s sys 0m0.008s real 0m17.149s user 0m0.010s sys 0m0.018s That makes 8.4.1 two and half times slower then 8.0.38. Now, the second SELECT times: real 0m18.438s user 0m0.010s sys 0m0.016s real 0m19.003s user 0m0.009s sys 0m0.010s real 0m4.947s user 0m0.010s sys 0m0.009s These results are sufficient for this bug to be verified. However, we shall also run it on the 9.0.0.
[10 Jul 2024 13:08]
MySQL Verification Team
Hi, We also made four runs with 9.0.0 ..... here are the results for the first SELECT ...... real 0m6.363s user 0m0.010s sys 0m0.010s real 0m5.337s user 0m0.010s sys 0m0.011s real 0m2.161s user 0m0.009s sys 0m0.008s real 0m2.168s user 0m0.009s sys 0m0.010s Hence, for the first SELECT, 9.0.0 is just fine ...... Now, the second SELECT : real 0m4.021s user 0m0.009s sys 0m0.011s real 0m4.020s user 0m0.009s sys 0m0.007s real 0m4.004s user 0m0.010s sys 0m0.008s real 0m4.019s user 0m0.010s sys 0m0.008s Hence, in 9.0.0, first select if truly great ...... however second SELECT shows significant slowdown.
[10 Jul 2024 13:13]
MySQL Verification Team
HI Mr. Song, First of all, your report provides a lot of profiling info. That is very important. However, your test case proved that filesort in 8.4.1 is significantly slower then in 8.0.38, for both tests. Next, filesort in 9.0.0 is significantly slower for the second test. Hence, this bug is now a verified bug report for versions 8.4 and 9.0. Thank you for your contribution.
[26 Aug 2024 7:44]
MySQL Verification Team
Bug #115908 marked as duplicate of this one
[2 Sep 2024 0:49]
chong zhang
hi, I am currently troubled by the same problem. I know that one of the ways to avoid it afterwards is to create a new index to avoid full table scanning, but I would like to know if there is a good way to discover it in advance? There is a significant difference between the file sort source code of 8.0 and 5.7. Is it impossible to maintain consistency in file sort performance between 8.0 and 5.7 by adjusting parameters? I hope to receive your reply. Thank you
[2 Sep 2024 9:37]
Huaxiong Song
Hi, chong zhang Sorry, there seems to be no good solution at the moment. We can look forward to the community's solution. If you are using MySQL server compiled from source code, maybe you can add some logs、warnings or pick back the module related to the parameter max_length_for_sort_data
[16 Sep 2024 9:49]
MySQL Verification Team
Hi, Sorry, but right now , there is no good workaround nor any predictability. We simply will have to wait for a fix to be pushed into 8.0 and 8.4 trees. Nobody knows when will that happen.

Description: Compared with 5.6 and 5.7, the current 8.0 version of filesort has serious performance degradation. How to repeat: Consider the following, create table t1 (id int auto_increment primary key, col2 int, col3 char(255), col4 varchar(8192), paytime int unsigned DEFAULT NULL) ENGINE=INNODB; DELIMITER // CREATE PROCEDURE insert_data() BEGIN DECLARE i INT DEFAULT 0; WHILE i < 300000 DO INSERT INTO t1 VALUES (null, i, repeat('a', 255), repeat('a', 8192), FLOOR(RAND() * 300000)); SET i = i + 1; END WHILE; END // DELIMITER ; CALL insert_data(); #Then do a full-scan without index: # SQL1: select * from t1 order by paytime limit 1; # SQL2: select * from t1 order by col4 limit 1; My test result is: 5.6: SQL1 execution took 0.14s, SQL2 execution took 1.12s. 8.0: SQL1 execution took 1.2s, SQL2 execution took 6.4s. Suggested fix: I used perf and ebpf to analyze the problem. I think there are at least two problems here: 1. For SQL1, a more interesting problem occurred. I found from perf that the execution time of row_sel_store_mysql_rec increased significantly. Further, I used ebpf to track the execution of the function. Unfortunately, I found that btr_copy_externally_stored_field was only executed once in 56, while it was executed 300,000 times in 80! ``` #80-bpftrace: #bpftrace -e ' > uprobe:/home/songhuaxiong.shx/huaxiong.shx/code/mysql-community//mysql-8.0.38-linux-glibc2.12-x86_64/bin/mysqld:_ZN3lob37btr_copy_externally_stored_field_funcEP5trx_tPK12dict_index_tPmS5_PKhRK11page_size_tmP16mem_block_info_t { > @start[tid] = nsecs; > } > > uretprobe:/home/songhuaxiong.shx/huaxiong.shx/code/mysql-community//mysql-8.0.38-linux-glibc2.12-x86_64/bin/mysqld:_ZN3lob37btr_copy_externally_stored_field_funcEP5trx_tPK12dict_index_tPmS5_PKhRK11page_size_tmP16mem_block_info_t /@start[tid]/ { > @duration = hist(nsecs - @start[tid]); > delete(@start[tid]); > @count++; > } > > END { > printf("Function execution count: %d\n", @count); > print(@duration); > } > ' Attaching 3 probes... ^@^CFunction execution count: 300000 @duration: [1K, 2K) 93036 |@@@@@@@@@@@@@@@@@@@@@@@ | [2K, 4K) 205581 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [4K, 8K) 677 | | [8K, 16K) 700 | | [16K, 32K) 6 | | #================================================================================================ #56-bpftrace: #bpftrace -e ' > uprobe:/home/songhuaxiong.shx/huaxiong.shx/code/mysql-community/mysql-5.6.51-linux-glibc2.12-x86_64/bin/mysqld:_Z32btr_copy_externally_stored_fieldPmPKhmmP16mem_block_info_t { > @start[tid] = nsecs; > } > > uretprobe:/home/songhuaxiong.shx/huaxiong.shx/code/mysql-community/mysql-5.6.51-linux-glibc2.12-x86_64/bin/mysqld:_Z32btr_copy_externally_stored_fieldPmPKhmmP16mem_block_info_t /@start[tid]/ { > @duration = hist(nsecs - @start[tid]); > delete(@start[tid]); > @count++; > } > > END { > printf("Function execution count: %d\n", @count); > print(@duration); > } > ' Attaching 3 probes... ^CFunction execution count: 1 @duration: [4K, 8K) 1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| ``` 2. For SQL2, I found from perf that the most efficient action (read) only accounted for 16.3%, while in 5.6, the CPU usage of read was as high as 61.7%. `alloc_and_make_key`, `write_key` and other operations Analysis ======== Further analysis of problem 1: s1. Before sorting, table->read_set stores all fields involved in this read s2. During sorting: MySQL versions 56 and 57 will modify read_set. Before entering row_sel_store_mysql_rec, read_set will be set to tmp_set (all 0), and then read_set will be set to the smallest bitmap to be accessed this time. Therefore, inside row_sel_store_mysql_rec, no data conversion will be performed on the blob field(call btr_copy_externally_stored_field). s3. After sorting is completed, versions 56 and 57 reset read_set to the value before sorting and start reading data. At this time, since read_set is all fields, a conversion process of the blob field occurs here. For version 80, the filesort logic and the processes before and after have been greatly changed. Setting tmp_set and resetting tmp_set no longer exist in the filesort of version 80. Therefore, whether in the filesort process or in the final read process, data conversion to the blob field will be triggered, resulting in a lot of time consumption. It is advisable to do similar optimizations in 80 to do `set` and `reset` of table->read_set.