Bug #1801 Wrong Index Scan Direction
Submitted: 11 Nov 2003 0:26 Modified: 1 Dec 2003 4:47
Reporter: Sergei Kulakov (Candidate Quality Contributor) Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S4 (Feature request)
Version:3.23.58 OS:FreeBSD (Free BSD)
Assigned to: CPU Architecture:Any

[11 Nov 2003 0:26] Sergei Kulakov
Description:
WHERE condition makes MySQL scan index forward where it could scan it backward. There are cases when you need to get last, not first records from a table, or when you page back through it. A simple example is:

SELECT * FROM SiteLog ORDER BY Id DESC LIMIT 20

This query returns the last 20 records from the table SiteLog, and it does it very quickly which demonstrates it scans the primary index backward (by the way, EXPLAIN shows the value 2905432 in the "rows" column as if the whole table were scanned, which is not true). 
But as soon as there is a WHERE condition on the Id column, it makes MySQL to switch to forward scanning, no matter if it's appropriate or not. A condition might be any, for ex.

SELECT * FROM SiteLog WHERE Id<2905432 ORDER BY Id DESC LIMIT 20

I'd use this if I wanted to page the table back by 20 rows. I know Id of the first row of a range (2905432), and I get the previous range using it. There may be numerous examples of when backward index scan would help, and I think an SQL driver should be able to freely select the right direction without giving favour to any; while so far it prefers forward scan plus "using filesort". As a result, I have to use some tricks to get what I need or just bear with the fact that MySQL performs queries longer that it could. 

How to repeat:
Use ANY comparatively large table with ANY indexed field. Take the queries from the Description area and suit them for your case. Note that sometimes in order to notice the difference in execution time you have to get quite deep into the table because otherwise MySQL performs queries quickly enoguh anyway. 

Suggested fix:
The hint to scan backward is obviously DESC, and probably there could be a reserved word to force a direction.
[1 Dec 2003 1:45] Peter Zaitsev
Sergey,

Actually MySQL has this feature already since MySQL 4.0:

mysql> explain select * from xxx order by i limit 1;
+-------+-------+---------------+------+---------+------+------+-------+
| table | type  | possible_keys | key  | key_len | ref  | rows | Extra |
+-------+-------+---------------+------+---------+------+------+-------+
| xxx   | index | NULL          | i    |       5 | NULL |    5 |       |
+-------+-------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)

So upgrading to MySQL 4.0 (which is stable for more than half a year now)  will solve your problem.
[1 Dec 2003 2:16] Sergei Kulakov
Thanks for the answer, Peter. Still I'm not quite sure you have completely understood the problem. At least, the example provided does not use the word DESC, which is the point here! This is what you provided:

mysql> explain select * from xxx order by i limit 1;
+-------+-------+---------------+------+---------+------+------+-------+
| table | type  | possible_keys | key  | key_len | ref  | rows | Extra |
+-------+-------+---------------+------+---------+------+------+-------+
| xxx   | index | NULL          | i    |       5 | NULL |    5 |       |
+-------+-------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)

First of all, the query should be at least

select * from xxx order by i DESC limit 1

Secondly, as I described, the query with DESC alone would be okay with even MySQL 3.23.XXX. But when you add an extra contition in WHERE, MySQL scans index forward, not backward. For ex.:

select * from xxx WHERE i<3 order by i DESC limit 1

Note, that the condition in WHERE should be on the same column that is ordered by (and the one in the index), that is "i". If you add conditions on other columns, MySQL behaves right. 
Probably, you need more than 5 rows of data in the table for this to notice the difference in execution time. Say 500000 :)
[1 Dec 2003 2:41] Peter Zaitsev
Yes sorry, that was Typo - I just was checking the query with DESC and without DESC to ensure the explain plan is the same and copied the wrong explain back.
Here is correct one:

mysql> explain select * from xxx order by i desc limit 1;
+-------+-------+---------------+------+---------+------+------+-------+
| table | type  | possible_keys | key  | key_len | ref  | rows | Extra |
+-------+-------+---------------+------+---------+------+------+-------+
| xxx   | index | NULL          | i    |       5 | NULL |    5 |       |
+-------+-------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)

Note it works for more complex cases as well:

mysql> explain select * from xxx where i<2 order by i desc limit 1;
+-------+-------+---------------+------+---------+------+------+-------------+
| table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+-------+-------+---------------+------+---------+------+------+-------------+
| xxx   | range | i             | i    |       5 | NULL |    1 | Using where |
+-------+-------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

Please check MySQL 4.0.16 and  submit the full test case for optimizer problem, if you still find it where.
[1 Dec 2003 2:58] Sergei Kulakov
Thanks again. I hope it works right with MySQL 4.X. Unfortunately I can't test this with MySQL 4.X so far. I'll do it in some future when we upgrade it at our server. 

Also, I can't completely determine the direction MySQL 4.X uses by the given explains alone - the EXPLAIN output just doesn't have the column 'Direction'. The differences between 3.23 here is absence of "using filesort" in the column "Extra" (just "using where") and the number of rows scanned 1 instead of 5 (the second seems a true sign). I mean 3.23's EXPLAIN shows index usage as well but it may take a long time for a query like this to execute:

SELECT * FROM PageLog WHERE Id<2905432 ORDER BY Id DESC LIMIT 20
[1 Dec 2003 4:47] Peter Zaitsev
The "direction" is not really needed in this case, however I agree it could be useful in such cases.

If MySQL would not use DESC index scan for this query you would get filesort (as you get for 3.23), as you do not have it for this particular query it looks like index is used right.

Unfortunately the "rows" value in EXPLAIN is not reliable enough - it shows sort of "maximum" amount of rows MySQL may scan. In case of "limit" it is really smaller in case of index scan, but same in case of "filesort".