Bug #17847 | Complex WHERE/ORDER/LIMIT runs much slower with DESC than ASC | ||
---|---|---|---|
Submitted: | 1 Mar 2006 23:57 | Modified: | 4 May 2006 15:47 |
Reporter: | Rick James | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
Version: | 4.1.19-BK, 4.0.24 | OS: | Linux (Linux, FreeBSD) |
Assigned to: | Evgeny Potemkin | CPU Architecture: | Any |
[1 Mar 2006 23:57]
Rick James
[1 Mar 2006 23:59]
Rick James
show create table, timings, etc
Attachment: mysql17847a.txt (text/plain), 9.21 KiB.
[2 Mar 2006 16:05]
Valeriy Kravchuk
Thank you for a problem report. Are you able to repeat the problem on a smaller number of rows in that table? What are the EXPLAIN results for both your queries?
[2 Mar 2006 16:35]
Rick James
They look identical to me: mysql> explain SELECT m.id, m.thread_id, m.thread_count, m.rating, m.thread_count, thread_rating FROM fMessages2 m USE INDEX (thread_count_idx) WHERE m.board_name='26247' AND m.pid = 0 AND m.status & -214783648 = 0 AND m.options & 1 = 0 AND (m.rating_bucket = 3) ORDER BY m.thread_count ASC, m.id ASC LIMIT 0, 21\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: m type: ref possible_keys: thread_count_idx key: thread_count_idx key_len: 8 ref: const,const,const rows: 167692 Extra: Using where 1 row in set (0.01 sec) mysql> explain SELECT m.id, m.thread_id, m.thread_count, m.rating, m.thread_count, thread_rating FROM fMessages2 m USE INDEX (thread_count_idx) WHERE m.board_name='26247' AND m.pid = 0 AND m.status & -214783648 = 0 AND m.options & 1 = 0 AND (m.rating_bucket = 3) ORDER BY m.thread_count DESC, m.id DESC LIMIT 0, 21\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: m type: ref possible_keys: thread_count_idx key: thread_count_idx key_len: 8 ref: const,const,const rows: 167692 Extra: Using where 1 row in set (0.00 sec)
[2 Mar 2006 16:58]
Valeriy Kravchuk
So, is there any chance to repeat the problem on a smaller amount of data? Can you upload a dump for that table here?
[2 Mar 2006 18:31]
Rick James
More details are found in https://support.mysql.com/8603 I am told that there is a small test case there now.
[27 Mar 2006 14:44]
Konstantin Osipov
Is this a MyISAM table? If so, is it compressed? The slowdown may be due to index/data compression.
[28 Mar 2006 19:40]
Rick James
This bug is essentially a dup of https://support.mysql.com/view.php?id=8603 , which now has sample code attached. (I am told.)
[4 May 2006 15:31]
Evgeny Potemkin
Not a bug. The query slowdown is caused by the data distribution. If you remove the "AND status & -214783648 = 0" part from the WHERE clause then the thread_count_idx will cover all over predicates and queries with ASC and DESC ordering will take same time: mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> SELECT id FROM test m USE INDEX (thread_count_idx) WHERE board_name='26247' AND parent_id = 0 AND rating_bucket = 3 ORDER BY thread_count DESC LIMIT 0, 21; +--------+ | id | +--------+ | 756751 | [skip some rows] | 702443 | +--------+ 21 rows in set (0.00 sec) mysql> show status like 'Handler%'; +-----------------------+-------+ | Variable_name | Value | +-----------------------+-------+ | Handler_commit | 0 | | Handler_delete | 0 | | Handler_discover | 0 | | Handler_read_first | 0 | | Handler_read_key | 1 | | Handler_read_next | 0 | | Handler_read_prev | 20 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | | Handler_rollback | 0 | | Handler_update | 0 | | Handler_write | 0 | +-----------------------+-------+ 12 rows in set (0.00 sec) mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> SELECT id FROM test m USE INDEX (thread_count_idx) WHERE board_name='26247' AND parent_id = 0 AND rating_bucket = 3 ORDER BY thread_count ASC LIMIT 0, 21; +--------+ | id | +--------+ | 450593 | [skip some rows] | 450682 | +--------+ 21 rows in set (0.00 sec) mysql> show status like 'Handler%'; +-----------------------+-------+ | Variable_name | Value | +-----------------------+-------+ | Handler_commit | 0 | | Handler_delete | 0 | | Handler_discover | 0 | | Handler_read_first | 0 | | Handler_read_key | 1 | | Handler_read_next | 20 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | | Handler_rollback | 0 | | Handler_update | 0 | | Handler_write | 0 | +-----------------------+-------+ 12 rows in set (0.00 sec) As you can see there is the same number of calls to read_next() and read_prev() methods. Due to 'status' field isn't covered by the index server have to sequentially scan rows obtained with help of the index until it finds an acceptable value. Looking at he distribution we see: mysql> SELECT status FROM test m WHERE board_name='26247' AND parent_id = 0 AND rating_bucket = 3 ORDER BY thread_count deSC LIMIT 0, 21; +--------+ | status | +--------+ | 0 | | 1 | | 0 | | 1 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | +--------+ 21 rows in set (0.00 sec) mysql> SELECT status FROM test m WHERE board_name='26247' AND parent_id = 0 AND rating_bucket = 3 ORDER BY thread_count aSC LIMIT 0, 1001; +-------------+ | status | +-------------+ | -2147483648 | [skip some rows, all of them have the -2147483648 value ] | -2147483648 | +-------------+ 1001 rows in set (0.01 sec) So the query ordered with ASC takes longer time because it takes longer time to find required 'status' value even with help of index.
[4 May 2006 15:31]
Evgeny Potemkin
mysql> select version(); +--------------+ | version() | +--------------+ | 4.1.20-debug | +--------------+ 1 row in set (0.00 sec)
[4 May 2006 15:47]
Rick James
We knew that "AND status..." was a problem. We were pleased that the ASC worked efficiently, then puzzled as to why DESC did not run at the same speed.
[4 May 2006 19:05]
Evgeny Potemkin
All conjunctive predicates in the WHERE condition are covered by the thread_count_idx index except 'status & -214783648 = 0'. To find rows with required values the server sequentially checks all records found by index against the condition 'status & -214783648 = 0' until it finds enough rows. It is like selecting only the 'status' field without any indices. The distribution of the data in the provided test case is such that to find 20 rows with specified values and ordered with DESC takes only 20 checks. Whereas to find 20 rows ordered with ASC in provided test case takes about 42000 checks. Consider an example: The table t1 (a char(5), b int, index a) contains values ('a',1),('a', 2),('a',3),('a',4),('a', 5), ('b',1),('b', 2),('b',3),('b',4),('b', 5). To find the the row ('b', 1) using 'b' as a key value starting from the beginning of the interval for key 'b' we need need only 1 lookup. To find the same value starting from the end we need to check 5 rows.