Bug #73046 Optimizer chooses wrong plan with (const, range) order by range ASC
Submitted: 19 Jun 2014 2:42 Modified: 19 Jun 2014 5:13
Reporter: Morgan Tocker Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.6.19 OS:Any
Assigned to: CPU Architecture:Any

[19 Jun 2014 2:42] Morgan Tocker
Description:
Via: http://stackoverflow.com/questions/24176540/order-by-asc-is-slow-and-using-index-condition

The optimizer seems to be choosing a sub-optimal plan in the case of (const, range) order by range ASC.  A workaround of (range, range) order by range ASC seems to perform fine.

I have confirmed this in 5.6.19 and 5.7 DMR4.

How to repeat:
drop table if exists post;
CREATE TABLE `post` (
  `post_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `owner_id` bigint(20) NOT NULL,
  `data` varchar(300) CHARACTER SET latin1 DEFAULT NULL,
  PRIMARY KEY (`post_id`),
  KEY `my_fk` (`owner_id`)
);

INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM dual;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;
INSERT INTO post SELECT NULL, FLOOR(RAND()*100), 'data' FROM post;

Example queries:

SELECT version();
+-----------+
| version() |
+-----------+
| 5.6.19    |
+-----------+
1 row in set (0.00 sec)

SELECT * FROM post WHERE post_id > 900000 and owner_id = 20 order by post_id DESC limit 10; # fast
+---------+----------+------+
| post_id | owner_id | data |
+---------+----------+------+
| 2424788 |       20 | data |
| 2424783 |       20 | data |
| 2424773 |       20 | data |
| 2424742 |       20 | data |
| 2424620 |       20 | data |
| 2424617 |       20 | data |
| 2424597 |       20 | data |
| 2424501 |       20 | data |
| 2424361 |       20 | data |
| 2424347 |       20 | data |
+---------+----------+------+
10 rows in set (0.00 sec)

SELECT * FROM post WHERE post_id > 900000 and owner_id = 20 order by post_id ASC limit 10; # slow
+---------+----------+------+
| post_id | owner_id | data |
+---------+----------+------+
|  900114 |       20 | data |
|  900121 |       20 | data |
|  900129 |       20 | data |
|  900348 |       20 | data |
|  900385 |       20 | data |
|  900503 |       20 | data |
|  900578 |       20 | data |
|  900583 |       20 | data |
|  900610 |       20 | data |
|  900674 |       20 | data |
+---------+----------+------+
10 rows in set (0.20 sec)

SELECT * FROM post WHERE post_id > 900000 and owner_id > 19 AND owner_id < 21 order by post_id ASC limit 10; # fast
+---------+----------+------+
| post_id | owner_id | data |
+---------+----------+------+
|  900114 |       20 | data |
|  900121 |       20 | data |
|  900129 |       20 | data |
|  900348 |       20 | data |
|  900385 |       20 | data |
|  900503 |       20 | data |
|  900578 |       20 | data |
|  900583 |       20 | data |
|  900610 |       20 | data |
|  900674 |       20 | data |
+---------+----------+------+
10 rows in set (0.00 sec)

The last two queries return identical results.  Here are the results from EXPLAIN:

EXPLAIN SELECT * FROM post WHERE post_id > 900000 and owner_id = 20 order by post_id ASC limit 10; # slow
+----+-------------+-------+-------------+---------------+---------------+---------+------+-------+-------------------------------------------------------------+
| id | select_type | table | type        | possible_keys | key           | key_len | ref  | rows  | Extra                                                       |
+----+-------------+-------+-------------+---------------+---------------+---------+------+-------+-------------------------------------------------------------+
|  1 | SIMPLE      | post  | index_merge | PRIMARY,my_fk | my_fk,PRIMARY | 12,4    | NULL | 14535 | Using intersect(my_fk,PRIMARY); Using where; Using filesort |
+----+-------------+-------+-------------+---------------+---------------+---------+------+-------+-------------------------------------------------------------+
1 row in set (0.00 sec)

EXPLAIN SELECT * FROM post WHERE post_id > 900000 and owner_id > 19 AND owner_id < 21 order by post_id ASC limit 10; # fast
+----+-------------+-------+-------+---------------+---------+---------+------+---------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows    | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+---------+-------------+
|  1 | SIMPLE      | post  | range | PRIMARY,my_fk | PRIMARY | 4       | NULL | 1046384 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+---------+-------------+
1 row in set (0.01 sec)

Suggested fix:
Make queries #2 and #3 use the same execution plan.
[19 Jun 2014 5:13] Umesh Shastry
Hello Morgan,

Thank you for the report and test case.
Verified as described.

Thanks,
Umesh
[19 Jun 2014 5:28] Umesh Shastry
// 5.7.5

mysql> SELECT * FROM post WHERE post_id > 900000 and owner_id = 20 order by post_id DESC limit 10; # fast
+---------+----------+------+
| post_id | owner_id | data |
+---------+----------+------+
| 2424785 |       20 | data |
| 2424531 |       20 | data |
| 2424482 |       20 | data |
| 2424383 |       20 | data |
| 2424321 |       20 | data |
| 2424066 |       20 | data |
| 2423892 |       20 | data |
| 2423791 |       20 | data |
| 2423783 |       20 | data |
| 2423778 |       20 | data |
+---------+----------+------+
10 rows in set (0.00 sec)

mysql> SELECT * FROM post WHERE post_id > 900000 and owner_id = 20 order by post_id ASC limit 10; # slow
+---------+----------+------+
| post_id | owner_id | data |
+---------+----------+------+
|  900015 |       20 | data |
|  900067 |       20 | data |
|  900068 |       20 | data |
|  900153 |       20 | data |
|  900212 |       20 | data |
|  900479 |       20 | data |
|  900540 |       20 | data |
|  900665 |       20 | data |
|  900821 |       20 | data |
|  900841 |       20 | data |
+---------+----------+------+
10 rows in set (0.13 sec)

mysql> SELECT * FROM post WHERE post_id > 900000 and owner_id > 19 AND owner_id < 21 order by post_id ASC limit 10; # fast
+---------+----------+------+
| post_id | owner_id | data |
+---------+----------+------+
|  900015 |       20 | data |
|  900067 |       20 | data |
|  900068 |       20 | data |
|  900153 |       20 | data |
|  900212 |       20 | data |
|  900479 |       20 | data |
|  900540 |       20 | data |
|  900665 |       20 | data |
|  900821 |       20 | data |
|  900841 |       20 | data |
+---------+----------+------+
10 rows in set (0.00 sec)

mysql> EXPLAIN SELECT * FROM post WHERE post_id > 900000 and owner_id = 20 order by post_id ASC limit 10; # slow
+----+-------------+-------+------------+-------------+---------------+---------------+---------+------+-------+----------+-------------------------------------------------------------+
| id | select_type | table | partitions | type        | possible_keys | key           | key_len | ref  | rows  | filtered | Extra                                                       |
+----+-------------+-------+------------+-------------+---------------+---------------+---------+------+-------+----------+-------------------------------------------------------------+
|  1 | SIMPLE      | post  | NULL       | index_merge | PRIMARY,my_fk | my_fk,PRIMARY | 12,4    | NULL | 14179 |   100.00 | Using intersect(my_fk,PRIMARY); Using where; Using filesort |
+----+-------------+-------+------------+-------------+---------------+---------------+---------+------+-------+----------+-------------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> EXPLAIN SELECT * FROM post WHERE post_id > 900000 and owner_id > 19 AND owner_id < 21 order by post_id ASC limit 10; # fast
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | post  | NULL       | range | PRIMARY,my_fk | PRIMARY | 4       | NULL | 1046240 |   100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
[19 Jun 2014 7:11] Øystein Grøvlen
This is clearly a bug in the range optimizer.  Using index merge intersect when one of the indexes is a clustered primary key does not make sense.  There is nothing to gain from scanning additional indexes in that case.
[10 Feb 2015 1:31] Morgan Tocker
This appears to be fixed in 5.7.6 (using trunk build).
[11 Jun 2015 14:46] Morgan Tocker
Pasting 5.7.7 results (all queries are fast):

mysql [localhost] {msandbox} (test) > explain SELECT * FROM post WHERE post_id > 900000 and owner_id = 20 order by post_id DESC limit 10; # fast
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | post  | NULL       | range | PRIMARY,my_fk | PRIMARY | 4       | NULL | 1046240 |    10.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.01 sec)

mysql [localhost] {msandbox} (test) > explain SELECT * FROM post WHERE post_id > 900000 and owner_id = 20 order by post_id ASC limit 10; # slow
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | post  | NULL       | range | PRIMARY,my_fk | PRIMARY | 4       | NULL | 1046240 |    10.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql [localhost] {msandbox} (test) > explain SELECT * FROM post WHERE post_id > 900000 and owner_id > 19 AND owner_id < 21 order by post_id ASC limit 10; # fast
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | post  | NULL       | range | PRIMARY,my_fk | PRIMARY | 4       | NULL | 1046240 |     1.78 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql [localhost] {msandbox} (test) > select version();
+-----------+
| version() |
+-----------+
| 5.7.7-rc  |
+-----------+
[17 Feb 2016 9:44] Olav Sandstå
The change that makes the optimizer select the same query plan for all three cases and makes the problematic query run fast the implementation of the following worklog:

WL#6986: Make switching of index due to small limit cost based

This worklog was included in 5.7.6.
[17 Feb 2016 11:23] Olav Sandstå
After the optimizer started to use range scan on the PRIMARY key for these queries, all queries complete quickly. But it would likely have been an even better choice if the optimizer selected to use range scan on the "my_fk" index. The optimizer should have been able to detect that this would be a better query plan.