Bug #101220 | Filesort used for ORDER BY ...DESC even when DESCending index available and used | ||
---|---|---|---|
Submitted: | 18 Oct 2020 0:37 | Modified: | 23 Feb 2021 18:44 |
Reporter: | Marcos Albe (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
Version: | 8.0.21 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[18 Oct 2020 0:37]
Marcos Albe
[18 Oct 2020 9:13]
Frederic Descamps
Hi Marcos, Could you also add the output of EXPLAIN ANALYZE when available please ? Thx
[18 Oct 2020 13:01]
Sveta Smirnova
Hi Fred, here it is: mysql> explain analyze SELECT t.col1, t.col2, t.col3, t.col4, t.col5, t.col6, 3356 as _shard FROM test t WHERE t.col2 IN (x'dfb88e9081d443d2b72e9f8d32ee4d1c') AND t.col3 IN ('WRITEBACK') ORDER BY t.col2 DESC, t.col3 DESC, t.col4 DESC LIMIT 1\G *************************** 1. row *************************** EXPLAIN: -> Limit: 1 row(s) (actual time=0.277..0.277 rows=0 loops=1) -> Filter: (t.col2 = 0xdfb88e9081d443d2b72e9f8d32ee4d1c) (cost=1.10 rows=1) (actual time=0.274..0.274 rows=0 loops=1) -> Index lookup on t using uc_key (col2=0xdfb88e9081d443d2b72e9f8d32ee4d1c, col3='WRITEBACK'; iterate backwards) (cost=1.10 rows=1) (actual time=0.271..0.271 rows=0 loops=1) 1 row in set (0.00 sec)
[18 Oct 2020 13:20]
Sveta Smirnova
Sorry, previous output was with set optimizer_switch='index_condition_pushdown=off'; Correct output with default optimizer_switch: mysql> explain analyze SELECT t.col1, t.col2, t.col3, t.col4, t.col5, t.col6, 3356 as _shard FROM test t WHERE t.col2 IN (x'dfb88e9081d443d2b72e9f8d32ee4d1c') AND t.col3 IN ('WRITEBACK') ORDER BY t.col2 DESC, t.col3 DESC, t.col4 DESC LIMIT 1\G *************************** 1. row *************************** EXPLAIN: -> Limit: 1 row(s) (actual time=0.302..0.302 rows=0 loops=1) -> Sort row IDs: t.col3 DESC, t.col4 DESC, limit input to 1 row(s) per chunk (cost=1.09 rows=1) (actual time=0.299..0.299 rows=0 loops=1) -> Index lookup on t using uc_key (col2=0xdfb88e9081d443d2b72e9f8d32ee4d1c, col3='WRITEBACK'), with index condition: (t.col2 = 0xdfb88e9081d443d2b72e9f8d32ee4d1c) (actual time=0.230..0.230 rows=0 loops=1) 1 row in set (0.00 sec)
[19 Oct 2020 8:13]
MySQL Verification Team
Hello Marcos, Thank you for the report and test case. Thanks, Umesh
[19 Oct 2020 8:14]
MySQL Verification Team
In my test setup, DESCending index available but not used: mysql> CREATE TABLE `test` ( -> `col1` bigint unsigned NOT NULL AUTO_INCREMENT, -> `col2` binary(16) NOT NULL, -> `col3` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL, -> `col4` bigint NOT NULL, -> `col5` timestamp(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6), -> `col6` mediumblob NOT NULL, -> PRIMARY KEY (`col1`), -> UNIQUE KEY `uc_key` (`col2`,`col3`,`col4`) -> ) ; Query OK, 0 rows affected (0.01 sec) - Populate data ./mysql_random_data_load -h127.0.0.1 -P3333 -uroot test test 100000 INFO[2020-10-19T10:02:26+02:00] Starting 24s [====================================================================] 100% INFO[2020-10-19T10:02:51+02:00] 100000 rows inserted [umshastr@hod03]/export/umesh/utils: ./mysql_random_data_load -h127.0.0.1 -P3333 -uroot test test 100000 INFO[2020-10-19T10:08:35+02:00] Starting 24s [====================================================================] 100% INFO[2020-10-19T10:09:00+02:00] 100000 rows inserted mysql> analyze table test; +-----------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +-----------+---------+----------+----------+ | test.test | analyze | status | OK | +-----------+---------+----------+----------+ 1 row in set (0.02 sec) mysql> explain SELECT t.col1, t.col2, t.col3, t.col4, t.col5, t.col6, 3356 as _shard FROM test t WHERE t.col2 IN (x'dfb88e9081d443d2b72e9f8d32ee4d1c') AND t.col3 IN ('WRITEBACK') ORDER BY t.col2 DESC, t.col3 DESC, t.col4 DESC LIMIT 1; +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+---------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+---------------------------------------+ | 1 | SIMPLE | t | NULL | ref | uc_key | uc_key | 782 | const,const | 1 | 100.00 | Using index condition; Using filesort | +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+---------------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql> alter table test add index desc_idx(col2 DESC, col3 DESC, col4 DESC); Query OK, 0 rows affected (0.35 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> analyze table test; +-----------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +-----------+---------+----------+----------+ | test.test | analyze | status | OK | +-----------+---------+----------+----------+ 1 row in set (0.01 sec) mysql> explain SELECT t.col1, t.col2, t.col3, t.col4, t.col5, t.col6, 3356 as _shard FROM test t WHERE t.col2 IN (x'dfb88e9081d443d2b72e9f8d32ee4d1c') AND t.col3 IN ('WRITEBACK') ORDER BY t.col2 DESC, t.col3 DESC, t.col4 DESC LIMIT 1; +----+-------------+-------+------------+------+-----------------+--------+---------+-------------+------+----------+---------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+-----------------+--------+---------+-------------+------+----------+---------------------------------------+ | 1 | SIMPLE | t | NULL | ref | uc_key,desc_idx | uc_key | 782 | const,const | 1 | 100.00 | Using index condition; Using filesort | +----+-------------+-------+------------+------+-----------------+--------+---------+-------------+------+----------+---------------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql> explain analyze SELECT t.col1, t.col2, t.col3, t.col4, t.col5, t.col6, 3356 as _shard FROM test t WHERE t.col2 IN (x'dfb88e9081d443d2b72e9f8d32ee4d1c') AND t.col3 IN ('WRITEBACK') ORDER BY t.col2 -> DESC, t.col3 DESC, t.col4 DESC LIMIT 1\G *************************** 1. row *************************** EXPLAIN: -> Limit: 1 row(s) (actual time=0.065..0.065 rows=0 loops=1) -> Sort row IDs: t.col3 DESC, t.col4 DESC, limit input to 1 row(s) per chunk (cost=0.35 rows=1) (actual time=0.063..0.063 rows=0 loops=1) -> Index lookup on t using uc_key (col2=0xdfb88e9081d443d2b72e9f8d32ee4d1c, col3='WRITEBACK'), with index condition: (t.col2 = 0xdfb88e9081d443d2b72e9f8d32ee4d1c) (actual time=0.042..0.042 rows=0 loops=1) 1 row in set (0.01 sec)
[3 Nov 2020 14:26]
Sveta Smirnova
Use of collation utf8mb4_0900_ai_ci is a workaround: mysql> explain SELECT t.col1, t.col2, t.col3, t.col4, t.col5, t.col6, 3356 as _shard FROM test t WHERE t.col2 IN (x'dfb88e9081d443d2b72e9f8d32ee4d1c') AND t.col3 IN ('WRITEBACK') ORDER BY t.col2 DESC, t.col3 DESC, t.col4 DESC LIMIT 1; +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+---------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+---------------------------------------+ | 1 | SIMPLE | t | NULL | ref | uc_key | uc_key | 782 | const,const | 1 | 100.00 | Using index condition; Using filesort | +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+---------------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql> alter table test modify `col3` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL; Query OK, 100000 rows affected (1.21 sec) Records: 100000 Duplicates: 0 Warnings: 0 mysql> explain SELECT t.col1, t.col2, t.col3, t.col4, t.col5, t.col6, 3356 as _shard FROM test t WHERE t.col2 IN (x'dfb88e9081d443d2b72e9f8d32ee4d1c') AND t.col3 IN ('WRITEBACK') ORDER BY t.col2 DESC, t.col3 DESC, t.col4 DESC LIMIT 1; +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+----------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+----------------------------------+ | 1 | SIMPLE | t | NULL | ref | uc_key | uc_key | 782 | const,const | 1 | 100.00 | Using where; Backward index scan | +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+----------------------------------+ 1 row in set, 1 warning (0.00 sec)
[15 Jan 2021 10:49]
Zsolt Parragi
After some investigation and debugging: this is not a bug. 5.7 and 8.0 behave differently because they use different default (connection) collations. With 5.7, it was utf8mb4_general_ci, with 8.0 it is now utf8mb4_0900_ai_ci. While this doesn't matter for the ORDER BY clause (that uses the column collation), the issue is caused by the WHERE condition on col3 (AND t.col3 IN ('WRITEBACK')). This is easily verifiable: by removing this part of the query, it'll use an index scan. As explained by the mysql documentation (https://dev.mysql.com/doc/refman/8.0/en/charset-collation-coercibility.html) for client provided strings, the connection collation takes precedence over the column collation. The issue here is that col3 uses utf8mb4_general_ci, and the connection uses utf8mb4_0900_ai_ci. These collations handle equality differently (with accented characters), and because of this, mysql can't use the index. So to fix the query, the user will have to change how the query is executed. One possibility is setting the connection collation also to general_ci: SET collation_connection = utf8mb4_general_ci; Another possibility is forcing the WHERE condition to use general_ci instead of the connection collation: and t.col3 IN ('WRITEBACK' COLLATE utf8mb4_general_ci)
[4 Feb 2021 8:14]
Roy Lyseng
This is a bug after all. The collation for the column in the WHERE clause, i.e. utf8mb4_general_ci, takes precedence, not the connection collation. The code checks if an ORDER BY sub-clause can be eliminated in test_if_equality_guarantees_uniqueness(). This function returns false if collation of left and right hand side of equality predicate are different. However, this should not make a difference as long as the collation of the column in the index matches the collation of the equality operation. And these should be the same since they are both derived from the collation of the column. Changing the connection collation will indeed fix the problem, however, there is also a bug here that makes the check too tight.
[23 Feb 2021 18:44]
Jon Stephens
Documented fix as follows in the MySQL 8.0.24 changelog: Filesort was used for a query having an ORDER BY ... DESC clause, even when an index on the descending column was available and used. This happened because an ORDER BY sub-clause was not removed due to matching a field in an equality predicate, even though it should have, so that the optimizer did not match the query with the descending index, leading to non-optimal performance. Closed.