Bug #110998 Performance regression with SELECT query when sorting on limited fields
Submitted: 11 May 2023 15:33 Modified: 17 May 2023 6:38
Reporter: Ale Crismani Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:8.0.20+, 8.0.33 OS:Linux
Assigned to: CPU Architecture:Any
Tags: regression

[11 May 2023 15:33] Ale Crismani
Description:
I am seeing a performance regression after 8.0.19 on a WordPress generated `SELECT ... ORDER BY ... LIMIT` query.

The query is as follows:

SELECT wp_ale_posts.* FROM wp_ale_posts FORCE INDEX(type_status_date)WHERE (post_type = 'post' AND (post_status = 'publish' OR post_status = 'future' OR post_status = 'draft' OR post_status = 'pending' OR post_status = 'in-progress' OR post_status = 'private')) ORDER BY post_date DESC LIMIT 20, 10;

This query runs fast on 8.0.19 and is much slower (roughly 30 times on average) on 8.0.20+ servers.

The query needs a filesort, since the `order by` clause can't be satisfied by the index. The explain results is the same on both versions:
mysql> explain SELECT wp_ale_posts.* FROM wp_ale_posts FORCE INDEX(type_status_date)WHERE (post_type = 'post' AND (post_status = 'publish' OR post_status = 'future' OR post_status = 'draft' OR post_status = 'pending' OR post_status = 'in-progress' OR post_status = 'private')) ORDER BY post_date DESC LIMIT 20, 10;
+----+-------------+--------------+------------+-------+------------------+------------------+---------+------+--------+----------+---------------------------------------+
| id | select_type | table        | partitions | type  | possible_keys    | key              | key_len | ref  | rows   | filtered | Extra                                 |
+----+-------------+--------------+------------+-------+------------------+------------------+---------+------+--------+----------+---------------------------------------+
|  1 | SIMPLE      | wp_ale_posts | NULL       | range | type_status_date | type_status_date | 164     | NULL | 929207 |   100.00 | Using index condition; Using filesort |
+----+-------------+--------------+------------+-------+------------------+------------------+---------+------+--------+----------+---------------------------------------+
1 row in set, 1 warning (0.09 sec)

How to repeat:
I am reproducing with

docker run --platform linux/amd64 --name mysql-8.0.20 -e MYSQL_ROOT_PASSWORD=ale -d mysql:8.0.20

docker exec -it mysql:8.0.20 /bin/bash
mysql -u root -p -e "create database ale"
mysql -u root -p ale < /tmp/wp_ale_posts.sqldump.sql

mysql> SELECT wp_ale_posts.* FROM wp_ale_posts FORCE INDEX(type_status_date)WHERE (post_type = 'post' AND (post_status = 'publish' OR post_status = 'future' OR post_status = 'draft' OR post_status = 'pending' OR post_status = 'in-progress' OR post_status = 'private')) ORDER BY post_date DESC LIMIT 20, 10;
10 rows in set (3 min 21.78 sec)  

If I repeat the same steps on 8.0.19 I get:
mysql> SELECT wp_ale_posts.* FROM wp_ale_posts FORCE INDEX(type_status_date)WHERE (post_type = 'post' AND (post_status = 'publish' OR post_status = 'future' OR post_status = 'draft' OR post_status = 'pending' OR post_status = 'in-progress' OR post_status = 'private')) ORDER BY post_date DESC LIMIT 20, 10;
10 rows in set (5.29 sec) 

Suggested fix:
I think what changed is that 8.0.19 only reads the items it needs for sorting from disk, while 8.0.20+ (I tested most of the versions) read the entire row.

This seems to have happened with commit https://github.com/mysql/mysql-server/commit/37e372ae3afd1188de24311ef9f4e96e009e911b, which stopped modifying the read_set used for filesort.

This, from my understanding, only happens when sorting row IDs, which is the case for WordPress post tables given the presence of `LONGTEXT` for the post content.
 

I tried patching the old behaviour back into newer versions and the query gets fast again, the path I used against 8.0.28 is here:
https://drive.google.com/file/d/1AVGGVyD731JtgvCnQ8tcncXZYRw_CLLq/view?usp=share_link

i am also providing a link to the sqldump I am using for testing, it's quite big at 10GB, so downloading it might take a while. You can download it here:
https://drive.google.com/file/d/1CAGHte_vnjn7wQhkPP4fd-XcdD7wPqlb/view?usp=share_link
[12 May 2023 12:41] MySQL Verification Team
Hi Mr. Crismani,

Thank you very much for your bug report.

However, we have had so many reports about performance regressions on this same issue, so that this problem is true and very well known.

Among many very similar and verified reports, we declare that your report is a duplicate of :

https://bugs.mysql.com/bug.php?id=108531

Most of these reports are private and hidden. Some optimisations are planned for the version 8.1, while 8.0 will enter a maintenance phase.

We have added your report as one of the many on the same subject.

Duplicate.
[17 May 2023 6:38] MySQL Verification Team
Thank you for the feedback.
[17 May 2023 6:39] MySQL Verification Team
Test results - 8.0.11, 8.0.19, 8.0.20 and 8.0.33

Attachment: 110998_8.0.11_19_20_33.results (application/octet-stream, text), 17.54 KiB.