Bug #17847 Complex WHERE/ORDER/LIMIT runs much slower with DESC than ASC
Submitted: 1 Mar 2006 23:57 Modified: 4 May 2006 15:47
Reporter: Rick James Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:4.1.19-BK, 4.0.24 OS:Linux (Linux, FreeBSD)
Assigned to: Evgeny Potemkin CPU Architecture:Any

[1 Mar 2006 23:57] Rick James
Description:
Summary:

SELECT  id, thread_id, thread_count, rating, thread_count, thread_rating
FROM  Messages m USE INDEX (thread_count_idx)
WHERE  board_name='26247' AND parent_id = 0 
AND status & -214783648 = 0 AND options & 1 = 0  AND (rating_bucket = 3)
ORDER BY thread_count ASC, id ASC 
LIMIT 0, 21;

21 rows in set (1.99 sec)

Changing ORDER BY to:
ORDER BY thread_count DESC, id DESC 

21 rows in set (0.00 sec)

----------------------------
more details (schema, etc) attached.

How to repeat:
Working on getting a managable dataset to for reproducing it.
[1 Mar 2006 23:59] Rick James
show create table, timings, etc

Attachment: mysql17847a.txt (text/plain), 9.21 KiB.

[2 Mar 2006 16:05] Valeriy Kravchuk
Thank you for a problem report. Are you able to repeat the problem on a smaller number of rows in that table? What are the EXPLAIN results for both your queries?
[2 Mar 2006 16:35] Rick James
They look identical to me:

mysql> explain SELECT  m.id, m.thread_id, m.thread_count, m.rating, m.thread_count, thread_rating FROM  fMessages2 m USE INDEX (thread_count_idx) WHERE  m.board_name='26247' AND m.pid = 0  AND m.status & -214783648 = 0 AND m.options & 1 = 0  AND (m.rating_bucket = 3)  ORDER BY m.thread_count ASC, m.id ASC  LIMIT 0, 21\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: m
         type: ref
possible_keys: thread_count_idx
          key: thread_count_idx
      key_len: 8
          ref: const,const,const
         rows: 167692
        Extra: Using where
1 row in set (0.01 sec)

mysql> explain SELECT  m.id, m.thread_id, m.thread_count, m.rating, m.thread_count, thread_rating FROM  fMessages2 m USE INDEX (thread_count_idx) WHERE  m.board_name='26247' AND m.pid = 0  AND m.status & -214783648 = 0 AND m.options & 1 = 0  AND (m.rating_bucket = 3)  ORDER BY m.thread_count DESC, m.id DESC  LIMIT 0, 21\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: m
         type: ref
possible_keys: thread_count_idx
          key: thread_count_idx
      key_len: 8
          ref: const,const,const
         rows: 167692
        Extra: Using where
1 row in set (0.00 sec)
[2 Mar 2006 16:58] Valeriy Kravchuk
So, is there any chance to repeat the problem on a smaller amount of data? Can you upload a dump for that table here?
[2 Mar 2006 18:31] Rick James
More details are found in
https://support.mysql.com/8603 
I am told that there is a small test case there now.
[27 Mar 2006 14:44] Konstantin Osipov
Is this a MyISAM table? If so, is it compressed? The slowdown may be due to index/data compression.
[28 Mar 2006 19:40] Rick James
This bug is essentially a dup of https://support.mysql.com/view.php?id=8603 , which now has sample code attached.  (I am told.)
[4 May 2006 15:31] Evgeny Potemkin
Not a bug. The query slowdown is caused by the data distribution. 
If you remove the "AND status & -214783648 = 0" part from the WHERE clause then the thread_count_idx will cover all over predicates and queries with ASC and DESC ordering will take same time:

mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT  id FROM test m USE INDEX (thread_count_idx) WHERE  board_name='26247' AND parent_id = 0 AND rating_bucket = 3  ORDER BY thread_count DESC LIMIT 0, 21;
+--------+
| id     |
+--------+
| 756751 |
[skip some rows]
| 702443 |
+--------+
21 rows in set (0.00 sec)

mysql> show status like 'Handler%'; 
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_commit        | 0     |
| Handler_delete        | 0     |
| Handler_discover      | 0     |
| Handler_read_first    | 0     |
| Handler_read_key      | 1     |
| Handler_read_next     | 0     |
| Handler_read_prev     | 20    |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
| Handler_rollback      | 0     |
| Handler_update        | 0     |
| Handler_write         | 0     |
+-----------------------+-------+
12 rows in set (0.00 sec)

mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT  id FROM test m USE INDEX (thread_count_idx) WHERE  board_name='26247' AND parent_id = 0 AND rating_bucket = 3  ORDER BY thread_count ASC LIMIT 0, 21;
+--------+
| id     |
+--------+
| 450593 |
[skip some rows]
| 450682 |
+--------+
21 rows in set (0.00 sec)

mysql> show status like 'Handler%'; 
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_commit        | 0     |
| Handler_delete        | 0     |
| Handler_discover      | 0     |
| Handler_read_first    | 0     |
| Handler_read_key      | 1     |
| Handler_read_next     | 20    |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
| Handler_rollback      | 0     |
| Handler_update        | 0     |
| Handler_write         | 0     |
+-----------------------+-------+
12 rows in set (0.00 sec)

As you can see there is the same number of calls to read_next() and read_prev() methods.
Due to 'status' field isn't covered by the index server have to sequentially scan rows obtained with help of the index until it finds an acceptable value.

Looking at he distribution we see:
mysql> SELECT  status FROM test m WHERE  board_name='26247' AND parent_id = 0 AND rating_bucket = 3  ORDER BY thread_count deSC LIMIT 0, 21;
+--------+
| status |
+--------+
|      0 |
|      1 |
|      0 |
|      1 |
|      0 |
|      0 |
|      0 |
|      0 |
|      0 |
|      0 |
|      0 |
|      0 |
|      0 |
|      0 |
|      0 |
|      0 |
|      0 |
|      0 |
|      0 |
|      0 |
|      0 |
+--------+
21 rows in set (0.00 sec)

mysql> SELECT  status FROM test m WHERE  board_name='26247' AND parent_id = 0 AND rating_bucket = 3  ORDER BY thread_count aSC LIMIT 0, 1001;
+-------------+
| status      |
+-------------+
| -2147483648 |
[skip some rows, all of them have the -2147483648 value ]
| -2147483648 |
+-------------+
1001 rows in set (0.01 sec)

So the query ordered with ASC takes longer time because it takes longer time to find required 'status' value even with help of index.
[4 May 2006 15:31] Evgeny Potemkin
mysql> select version();
+--------------+
| version()    |
+--------------+
| 4.1.20-debug |
+--------------+
1 row in set (0.00 sec)
[4 May 2006 15:47] Rick James
We knew that "AND status..." was a problem.  We were pleased that the ASC worked efficiently, then puzzled as to why DESC did not run at the same speed.
[4 May 2006 19:05] Evgeny Potemkin
All conjunctive predicates in the WHERE condition are covered by the
thread_count_idx index except 'status & -214783648 = 0'. To find rows with
required values the server sequentially checks all records found by index 
against the condition 'status & -214783648 = 0' until it finds enough rows. 
It is like selecting only the 'status' field without any indices. The 
distribution of the data in the provided test case is such that to find 20 
rows with specified values and ordered with DESC takes only 20 checks. Whereas
to find 20 rows ordered with ASC in provided test case takes about 42000 checks.
Consider an example: 
The table t1 (a char(5), b int, index a) contains values
('a',1),('a', 2),('a',3),('a',4),('a', 5),
('b',1),('b', 2),('b',3),('b',4),('b', 5).
To find the the row ('b', 1) using 'b' as a key value starting from the 
beginning of the interval for key 'b' we need need only 1 lookup.
To find the same value starting from the end we need to check 5 rows.