Description:
I am seeing a performance regression after 8.0.19 on a WordPress generated `SELECT ... ORDER BY ... LIMIT` query.
The query is as follows:
SELECT wp_ale_posts.* FROM wp_ale_posts FORCE INDEX(type_status_date)WHERE (post_type = 'post' AND (post_status = 'publish' OR post_status = 'future' OR post_status = 'draft' OR post_status = 'pending' OR post_status = 'in-progress' OR post_status = 'private')) ORDER BY post_date DESC LIMIT 20, 10;
This query runs fast on 8.0.19 and is much slower (roughly 30 times on average) on 8.0.20+ servers.
The query needs a filesort, since the `order by` clause can't be satisfied by the index. The explain results is the same on both versions:
mysql> explain SELECT wp_ale_posts.* FROM wp_ale_posts FORCE INDEX(type_status_date)WHERE (post_type = 'post' AND (post_status = 'publish' OR post_status = 'future' OR post_status = 'draft' OR post_status = 'pending' OR post_status = 'in-progress' OR post_status = 'private')) ORDER BY post_date DESC LIMIT 20, 10;
+----+-------------+--------------+------------+-------+------------------+------------------+---------+------+--------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------+------------+-------+------------------+------------------+---------+------+--------+----------+---------------------------------------+
| 1 | SIMPLE | wp_ale_posts | NULL | range | type_status_date | type_status_date | 164 | NULL | 929207 | 100.00 | Using index condition; Using filesort |
+----+-------------+--------------+------------+-------+------------------+------------------+---------+------+--------+----------+---------------------------------------+
1 row in set, 1 warning (0.09 sec)
How to repeat:
I am reproducing with
docker run --platform linux/amd64 --name mysql-8.0.20 -e MYSQL_ROOT_PASSWORD=ale -d mysql:8.0.20
docker exec -it mysql:8.0.20 /bin/bash
mysql -u root -p -e "create database ale"
mysql -u root -p ale < /tmp/wp_ale_posts.sqldump.sql
mysql> SELECT wp_ale_posts.* FROM wp_ale_posts FORCE INDEX(type_status_date)WHERE (post_type = 'post' AND (post_status = 'publish' OR post_status = 'future' OR post_status = 'draft' OR post_status = 'pending' OR post_status = 'in-progress' OR post_status = 'private')) ORDER BY post_date DESC LIMIT 20, 10;
10 rows in set (3 min 21.78 sec)
If I repeat the same steps on 8.0.19 I get:
mysql> SELECT wp_ale_posts.* FROM wp_ale_posts FORCE INDEX(type_status_date)WHERE (post_type = 'post' AND (post_status = 'publish' OR post_status = 'future' OR post_status = 'draft' OR post_status = 'pending' OR post_status = 'in-progress' OR post_status = 'private')) ORDER BY post_date DESC LIMIT 20, 10;
10 rows in set (5.29 sec)
Suggested fix:
I think what changed is that 8.0.19 only reads the items it needs for sorting from disk, while 8.0.20+ (I tested most of the versions) read the entire row.
This seems to have happened with commit https://github.com/mysql/mysql-server/commit/37e372ae3afd1188de24311ef9f4e96e009e911b, which stopped modifying the read_set used for filesort.
This, from my understanding, only happens when sorting row IDs, which is the case for WordPress post tables given the presence of `LONGTEXT` for the post content.
I tried patching the old behaviour back into newer versions and the query gets fast again, the path I used against 8.0.28 is here:
https://drive.google.com/file/d/1AVGGVyD731JtgvCnQ8tcncXZYRw_CLLq/view?usp=share_link
i am also providing a link to the sqldump I am using for testing, it's quite big at 10GB, so downloading it might take a while. You can download it here:
https://drive.google.com/file/d/1CAGHte_vnjn7wQhkPP4fd-XcdD7wPqlb/view?usp=share_link