Bug #115555 Severe performance regression in filesort
Submitted: 10 Jul 2024 10:31 Modified: 2 Sep 2024 9:37
Reporter: Huaxiong Song (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:8.4, 8.0.39 OS:Any
Assigned to: CPU Architecture:Any

[10 Jul 2024 10:31] Huaxiong Song
Description:
Compared with 5.6 and 5.7, the current 8.0 version of filesort has serious performance degradation.

How to repeat:
Consider the following,

create table t1 (id int auto_increment primary key, col2 int, col3 char(255), col4 varchar(8192), paytime int unsigned DEFAULT NULL) ENGINE=INNODB;

DELIMITER //
CREATE PROCEDURE insert_data()
BEGIN
    DECLARE i INT DEFAULT 0;

    WHILE i < 300000 DO
        INSERT INTO t1 VALUES (null, i, repeat('a', 255), repeat('a', 8192), FLOOR(RAND() * 300000));
        SET i = i + 1;
    END WHILE;
END //

DELIMITER ;

CALL insert_data();

#Then do a full-scan without index:

# SQL1: 
select * from t1 order by paytime limit 1;

# SQL2:
select * from t1 order by col4 limit 1;

My test result is:

5.6: SQL1 execution took 0.14s, SQL2 execution took 1.12s.
8.0: SQL1 execution took 1.2s, SQL2 execution took 6.4s.

Suggested fix:
I used perf and ebpf to analyze the problem.

I think there are at least two problems here:

1. For SQL1, a more interesting problem occurred. I found from perf that the execution time of row_sel_store_mysql_rec increased significantly. Further, I used ebpf to track the execution of the function. Unfortunately, I found that btr_copy_externally_stored_field was only executed once in 56, while it was executed 300,000 times in 80!

```
#80-bpftrace:
#bpftrace -e '
>   uprobe:/home/songhuaxiong.shx/huaxiong.shx/code/mysql-community//mysql-8.0.38-linux-glibc2.12-x86_64/bin/mysqld:_ZN3lob37btr_copy_externally_stored_field_funcEP5trx_tPK12dict_index_tPmS5_PKhRK11page_size_tmP16mem_block_info_t {
>     @start[tid] = nsecs;
>   }
>
>   uretprobe:/home/songhuaxiong.shx/huaxiong.shx/code/mysql-community//mysql-8.0.38-linux-glibc2.12-x86_64/bin/mysqld:_ZN3lob37btr_copy_externally_stored_field_funcEP5trx_tPK12dict_index_tPmS5_PKhRK11page_size_tmP16mem_block_info_t /@start[tid]/ {
>     @duration = hist(nsecs - @start[tid]);
>     delete(@start[tid]);
>     @count++;
>   }
>
>   END {
>     printf("Function execution count: %d\n", @count);
>     print(@duration);
>   }
> '
Attaching 3 probes...
^@^CFunction execution count: 300000
@duration:
[1K, 2K)           93036 |@@@@@@@@@@@@@@@@@@@@@@@                             |
[2K, 4K)          205581 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[4K, 8K)             677 |                                                    |
[8K, 16K)            700 |                                                    |
[16K, 32K)             6 |                                                    |

#================================================================================================

#56-bpftrace:
#bpftrace -e '
>   uprobe:/home/songhuaxiong.shx/huaxiong.shx/code/mysql-community/mysql-5.6.51-linux-glibc2.12-x86_64/bin/mysqld:_Z32btr_copy_externally_stored_fieldPmPKhmmP16mem_block_info_t {
>     @start[tid] = nsecs;
>   }
>
>   uretprobe:/home/songhuaxiong.shx/huaxiong.shx/code/mysql-community/mysql-5.6.51-linux-glibc2.12-x86_64/bin/mysqld:_Z32btr_copy_externally_stored_fieldPmPKhmmP16mem_block_info_t /@start[tid]/ {
>     @duration = hist(nsecs - @start[tid]);
>     delete(@start[tid]);
>     @count++;
>   }
>
>   END {
>     printf("Function execution count: %d\n", @count);
>     print(@duration);
>   }
> '
Attaching 3 probes...
^CFunction execution count: 1
@duration:
[4K, 8K)               1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
```

2. For SQL2, I found from perf that the most efficient action (read) only accounted for 16.3%, while in 5.6, the CPU usage of read was as high as 61.7%. `alloc_and_make_key`, `write_key` and other operations 

Analysis
========
Further analysis of problem 1:
s1. Before sorting, table->read_set stores all fields involved in this read
s2. During sorting: MySQL versions 56 and 57 will modify read_set. Before entering row_sel_store_mysql_rec, read_set will be set to tmp_set (all 0), and then read_set will be set to the smallest bitmap to be accessed this time. Therefore, inside row_sel_store_mysql_rec, no data conversion will be performed on the blob field(call btr_copy_externally_stored_field).
s3. After sorting is completed, versions 56 and 57 reset read_set to the value before sorting and start reading data. At this time, since read_set is all fields, a conversion process of the blob field occurs here.

For version 80, the filesort logic and the processes before and after have been greatly changed. Setting tmp_set and resetting tmp_set no longer exist in the filesort of version 80. Therefore, whether in the filesort process or in the final read process, data conversion to the blob field will be triggered, resulting in a lot of time consumption.

It is advisable to do similar optimizations  in 80 to do `set` and `reset` of table->read_set.
[10 Jul 2024 11:26] MySQL Verification Team
Hi Mr. Song,

Thank you for your bug report.

We have managed to repeat the behaviour.

However, there are already  dozens of verified bugs which proves the deterioration of the filesort performance between 5.7 and current supported versions.

We can not give you here all the bugs that are verified for the performance regression, but we shall give you only one:

https://bugs.mysql.com/bug.php?id=112597

Here, we are following rules that are set for the comparisons between 5.7 and supported versions.

We have verified your report, but marked it as a duplicate.

We shall try to compare the performance between older 8.0 and latest. Also, we shall compare the performance between latest 8.0 and 8.4.1 and 9.0.0.

If these comparisons show serious performance regressions, this report will be verified.

Duplicate.

Duplicate.
[10 Jul 2024 11:34] MySQL Verification Team
HI,

These are the timings for 8.0.25:

First test
---------
real	0m7.238s
user	0m0.008s
sys	0m0.008s

Second test
------------

real	0m3.390s
user	0m0.009s
sys	0m0.008s

These are the timings for 8.0.38:

First test
---------

real	0m7.377s
user	0m0.009s
sys	0m0.009s

Second test
------------

real	0m3.434s
user	0m0.009s
sys	0m0.008s

There are no significant differences.

Next, we shall do the timings for 8.4.1.
[10 Jul 2024 12:37] MySQL Verification Team
Hi,

We made three runs with 8.4.1 ..... here are the results for the first SELECT ......

real	0m21.338s
user	0m0.010s
sys	0m0.009s

real	0m17.635s
user	0m0.009s
sys	0m0.008s

real	0m17.149s
user	0m0.010s
sys	0m0.018s

That makes 8.4.1 two and half times slower then 8.0.38.

Now, the second SELECT times:

real	0m18.438s
user	0m0.010s
sys	0m0.016s

real	0m19.003s
user	0m0.009s
sys	0m0.010s

real	0m4.947s
user	0m0.010s
sys	0m0.009s

These results are sufficient for this bug to be verified.

However, we shall also run it on the 9.0.0.
[10 Jul 2024 13:08] MySQL Verification Team
Hi,

We also  made four runs with 9.0.0  ..... here are the results for the first SELECT ......

real	0m6.363s
user	0m0.010s
sys	0m0.010s

real	0m5.337s
user	0m0.010s
sys	0m0.011s

real	0m2.161s
user	0m0.009s
sys	0m0.008s

real	0m2.168s
user	0m0.009s
sys	0m0.010s

Hence, for the first SELECT, 9.0.0 is just fine ......

Now, the second SELECT :

real	0m4.021s
user	0m0.009s
sys	0m0.011s

real	0m4.020s
user	0m0.009s
sys	0m0.007s

real	0m4.004s
user	0m0.010s
sys	0m0.008s

real	0m4.019s
user	0m0.010s
sys	0m0.008s

Hence, in 9.0.0, first select if truly great ...... however second SELECT shows significant slowdown.
[10 Jul 2024 13:13] MySQL Verification Team
HI Mr. Song,

First of all, your report provides a lot of profiling info.

That is very important.

However, your test case proved that filesort in 8.4.1 is significantly slower then in 8.0.38, for both tests.

Next, filesort in 9.0.0 is significantly slower for the second test.

Hence, this bug is now a verified bug report for versions 8.4 and 9.0.

Thank you for your contribution.
[26 Aug 2024 7:44] MySQL Verification Team
Bug #115908 marked as duplicate of this one
[2 Sep 2024 0:49] chong zhang
hi, I am currently troubled by the same problem. I know that one of the ways to avoid it afterwards is to create a new index to avoid full table scanning, but I would like to know if there is a good way to discover it in advance? 

There is a significant difference between the file sort source code of 8.0 and 5.7. Is it impossible to maintain consistency in file sort performance between 8.0 and 5.7 by adjusting parameters?

I hope to receive your reply. Thank you
[2 Sep 2024 9:37] Huaxiong Song
Hi, chong zhang

Sorry, there seems to be no good solution at the moment. We can look forward to the community's solution.
If you are using MySQL server compiled from source code, maybe you can add some logs、warnings or pick back the module related to the parameter max_length_for_sort_data
[16 Sep 2024 9:49] MySQL Verification Team
Hi,

Sorry, but right now , there is no good workaround nor any predictability.

We simply will have to wait for a fix to be pushed into 8.0 and 8.4 trees.

Nobody knows when will that happen.