Bug #81601 | Optimizer does not identify index as covering on myisam table | ||
---|---|---|---|
Submitted: | 26 May 2016 6:55 | Modified: | 8 Jun 2016 13:30 |
Reporter: | Jiří Kavalík | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.6.21 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | covering index, myisam |
[26 May 2016 6:55]
Jiří Kavalík
[26 May 2016 7:09]
Jiří Kavalík
http://sqlfiddle.com/#!9/1cc46/5
[6 Jun 2016 16:09]
MySQL Verification Team
Hi Jiri, Thank you for your bug report. Can you answer my question. How many rows are returned by the query: SELECT @rownum:=0) r, wp_posts WHERE post_status IN ('publish','inherit') AND post_type = 'test' ORDER BY post_modified_gmt ASC; Many thanks in advance !!!!
[7 Jun 2016 6:20]
Jiří Kavalík
Hi Sinisa, I am sorry, but I cannot give an exact number for that. I created this test case on behalf of a dba.stackexchange.com user. According to informations they gave the production sites may have 1M+ posts, probably lot of them matching the condition. But from what I can see the testing table has ~20k rows with the coditions matching ~15k. The post with these informations - http://dba.stackexchange.com/q/139477/64556
[7 Jun 2016 13:29]
MySQL Verification Team
Hi! Your last reply is sufficient. It is MUCH more efficient to scan the rows in MyISAM , then use the index, unless index covers less then 20 % of the total number of rows. This is also explained in our manual.
[7 Jun 2016 13:35]
Jiří Kavalík
Hello, well, thats not the "bug" I am reporting. The problem is that the index IS covering as far as I can see, but it is not used as such!
[7 Jun 2016 14:29]
MySQL Verification Team
Hi ! The fact that the index is covering is irrelevant. So, if fully covering index covers more then 20 % of total number of rows, scanning will be used as it is MUCH faster. That is a heuristic number that we came with after months of measurement, some 12 years ago. And it is described in our manual. Not a bug.
[8 Jun 2016 6:49]
Jiří Kavalík
Hi, "The fact that the index is covering is irrelevant" - just NO. In case of a covering index it should be "full-index-scan" and not "full-table-scan. That will be the same at worst and usually better because secondary indexes tend to be smaller. But the problem with the example I show is that not even full table scan is used, it uses ref access on the covering index, but then does not say "using index" and from the execution times seen it actually really fetches the row from the table which is just wrong. It may slow the query tremendously to get data which were already available. > explain SELECT post_modified_gmt FROM ( -> SELECT @rownum:=@rownum+1 rownum, wp_posts.post_modified_gmt -> FROM (SELECT @rownum:=0) r, wp_posts -> WHERE post_status IN ('publish','inherit') -> AND post_type = 'test' -> ORDER BY post_modified_gmt ASC -> ) x -> WHERE rownum %1000 = 0; +------+-------------+------------+--------+---------------------------------------+----------------------+---------+-------+------+----------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+------------+--------+---------------------------------------+----------------------+---------+-------+------+----------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 10 | Using where | | 2 | DERIVED | <derived3> | system | NULL | NULL | NULL | NULL | 1 | | | 2 | DERIVED | wp_posts | ref | type_status_date,type_modified_status | type_modified_status | 82 | const | 10 | Using where | | 3 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used | +------+-------------+------------+--------+---------------------------------------+----------------------+---------+-------+------+----------------+ > explain SELECT post_modified_gmt FROM ( -> SELECT @rownum:=@rownum+1 rownum, wp_posts.post_modified_gmt -> FROM (SELECT @rownum:=0) r, wp_posts use index(type_modified_status) -> WHERE post_status IN ('publish','inherit') -> AND post_type = 'test' -> ORDER BY post_modified_gmt ASC -> ) x -> WHERE rownum %1000 = 0; +------+-------------+------------+--------+----------------------+----------------------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+------------+--------+----------------------+----------------------+---------+------+------+--------------------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 91 | Using where | | 2 | DERIVED | <derived3> | system | NULL | NULL | NULL | NULL | 1 | | | 2 | DERIVED | wp_posts | index | type_modified_status | type_modified_status | 172 | NULL | 92 | Using where; Using index | | 3 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used | +------+-------------+------------+--------+----------------------+----------------------+---------+------+------+--------------------------+ > explain SELECT post_modified_gmt FROM ( -> SELECT @rownum:=@rownum+1 rownum, wp_posts.post_modified_gmt -> FROM (SELECT @rownum:=0) r, wp_posts ignore index(type_status_date) -> WHERE post_status IN ('publish','inherit') -> AND post_type = 'test' -> ORDER BY post_modified_gmt ASC -> ) x -> WHERE rownum %1000 = 0; +------+-------------+------------+--------+----------------------+----------------------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+------------+--------+----------------------+----------------------+---------+------+------+--------------------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 91 | Using where | | 2 | DERIVED | <derived3> | system | NULL | NULL | NULL | NULL | 1 | | | 2 | DERIVED | wp_posts | index | type_modified_status | type_modified_status | 172 | NULL | 92 | Using where; Using index | | 3 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used | +------+-------------+------------+--------+----------------------+----------------------+---------+------+------+--------------------------+ As you can see the first (imo wrong) explain shows ref access and fetches the row from the heap. The other two show index scan with "using index" (and a bit better rows estimate too). The query is the same so the index is covering and it matters. It mattered in the tests the OP on SE did too. The first case was significantly slower as mysql probably really reads the row even if it does not need to. It behaves correctly when the other index "type_status_date" is not taken into account and it behaves correctly with InnoDB (no matter the index hints). It might even start to behave with MyISAM after running "analyze table wp_posts;" but what it does in this first case here is just not right. Sorry for being stubborn here, but from your responses it seems to me you did not get the point I am making (which is a problem of my insufficient explanation so trying to fix it here).
[8 Jun 2016 13:24]
MySQL Verification Team
Sufficient information is already provided.
[8 Jun 2016 13:30]
Jiří Kavalík
Ok, thank you for looking into it.