Bug #77172 Optimizer should support index skip scan
Submitted: 27 May 2015 13:26 Modified: 20 Nov 2015 9:46
Reporter: Morgan Tocker Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S4 (Feature request)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[27 May 2015 13:26] Morgan Tocker
Description:
The MySQL optimizer can currently use an index for satisfying an ORDER BY (index scan) or filtering predicates (ref, range).  There are cases where it would be very useful if it could use an index for both with a skip scan.

How to repeat:
From: http://www.tocker.ca/2013/09/05/migrating-from-postgresql-to-mysql.html

I have also spotted this discussed here (I didn't use this specific workaround):
http://explainextended.com/2009/04/02/emulating-skip-scan/

CREATE TABLE items (
 item_id INT NOT NULL PRIMARY KEY auto_increment,
 sort_order INT NOT NULL,
 community_id INT NOT NULL,
 ranged_key1 INT NOT NULL,
 ranged_key2 INT NOT NULL,
 is_deleted INT NULL,
 published_at INT NULL,
 created_at INT NULL,
 INDEX(community_id, sort_order, ranged_key1)
);

SELECT * FROM items
 WHERE
 community_id=10
 and ranged_key1 between 7 and 9
 and ranged_key2 between 50 and 110
ORDER BY sort_order DESC LIMIT 100;

This is the Facebook feed query pattern:
- community_id is a particular user or category page.
- sort_order is an application maintained priority order, roughly equivalent to a timestamp (values can be inserted as -(sort_order) if MySQL does not support DESC indexes).
- ranged_key1 is some other qualifying predicate.  Maybe price range, or a quality score threshold.
- There is always a small LIMIT.

So what I would like to do is:
- Ref filter on community_id=10 (is not very selective)
- Scan on sort_order (since it is correct order)
- Range on ranged_key1 until 100 rows are found.

This optimization is a little specific, and it doesn't necessarily need full cost model support (would be willing to use a hint).  But the current workaround is very expensive and invasive to the application, which is to use two tables:

CREATE TABLE items (
 item_id INT NOT NULL PRIMARY KEY auto_increment,
 community_id INT NOT NULL,
 ranged_key1 INT NOT NULL,
 ranged_key2 INT NOT NULL,
 is_deleted INT NULL,
 published_at INT NULL,
 created_at INT NULL
);
CREATE TABLE item_sort_order (
 item_id INT NOT NULL PRIMARY KEY,
 community_id INT NOT NULL,
 sort_order INT NOT NULL,
 INDEX (community_id, sort_order) /* InnoDB: Includes item_id as clustered index */
);

SELECT STRAIGHT_JOIN items.*
FROM
item_sort_order INNER JOIN items USING (item_id)
WHERE
  items.community_id = N
 AND item_sort_order.community_id = N
 AND items.ranged_key1 BETWEEN x AND y
 AND items.ranged_key2 BETWEEN x AND y
ORDER BY item_sort_order.sort_order DESC LIMIT 100;
[18 Jun 2016 21:26] Omer Barnir
Posted by developer:
 
Reported version value updated to reflect release name change from 5.8 to 8.0
[10 Feb 2019 8:13] Jean-François Gagné
Could we consider that fixed in 8.0.13 (see [1]) ?

[1]: https://dev.mysql.com/doc/refman/8.0/en/range-optimization.html#range-access-skip-scan