Bug #81601 Optimizer does not identify index as covering on myisam table
Submitted: 26 May 2016 6:55 Modified: 8 Jun 2016 13:30
Reporter: Jiří Kavalík Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.6.21 OS:Any
Assigned to: CPU Architecture:Any
Tags: covering index, myisam

[26 May 2016 6:55] Jiří Kavalík
Description:
In some cases optimizer fetches full rows even when the index used covers the entire query. If the order of columns in the index is changed, the query is executed index-only (but in the test case that leads to filesort). The index-only plan is used when the index is forced by USE/FORCE INDEX() hint.

How to repeat:
CREATE TABLE `wp_posts` (
    `ID` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
    `post_author` BIGINT(20) UNSIGNED NOT NULL DEFAULT '0',
    `post_date` DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00',
    `post_date_gmt` DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00',
    `post_content` LONGTEXT NOT NULL COLLATE 'utf8mb4_unicode_ci',
    `post_title` TEXT NOT NULL COLLATE 'utf8mb4_unicode_ci',
    `post_excerpt` TEXT NOT NULL COLLATE 'utf8mb4_unicode_ci',
    `post_status` VARCHAR(20) NOT NULL DEFAULT 'publish' COLLATE 'utf8mb4_unicode_ci',
    `comment_status` VARCHAR(20) NOT NULL DEFAULT 'open' COLLATE 'utf8mb4_unicode_ci',
    `ping_status` VARCHAR(20) NOT NULL DEFAULT 'open' COLLATE 'utf8mb4_unicode_ci',
    `post_password` VARCHAR(20) NOT NULL DEFAULT '' COLLATE 'utf8mb4_unicode_ci',
    `post_name` VARCHAR(200) NOT NULL DEFAULT '' COLLATE 'utf8mb4_unicode_ci',
    `to_ping` TEXT NOT NULL COLLATE 'utf8mb4_unicode_ci',
    `pinged` TEXT NOT NULL COLLATE 'utf8mb4_unicode_ci',
    `post_modified` DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00',
    `post_modified_gmt` DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00',
    `post_content_filtered` LONGTEXT NOT NULL COLLATE 'utf8mb4_unicode_ci',
    `post_parent` BIGINT(20) UNSIGNED NOT NULL DEFAULT '0',
    `guid` VARCHAR(255) NOT NULL DEFAULT '' COLLATE 'utf8mb4_unicode_ci',
    `menu_order` INT(11) NOT NULL DEFAULT '0',
    `post_type` VARCHAR(20) NOT NULL DEFAULT 'post' COLLATE 'utf8mb4_unicode_ci',
    `post_mime_type` VARCHAR(100) NOT NULL DEFAULT '' COLLATE 'utf8mb4_unicode_ci',
    `comment_count` BIGINT(20) NOT NULL DEFAULT '0',
    PRIMARY KEY (`ID`),
    INDEX `type_status_date` (`post_type`, `post_status`, `post_date`, `ID`),
    INDEX `type_modified_status` (`post_type`, `post_modified_gmt`, `post_status`),
    INDEX `post_parent` (`post_parent`),
    INDEX `post_author` (`post_author`),
    INDEX `post_name` (`post_name`(191))
)
COLLATE='utf8mb4_unicode_ci'
ENGINE=MyISAM
AUTO_INCREMENT=19833;
insert into wp_posts(post_type, post_status, post_modified_gmt, post_content_filtered, post_content, post_title, post_excerpt, to_ping, pinged) values
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', ''),
('test', 'publish', now(), '', '', '', '', '', '');

explain SELECT post_modified_gmt FROM (
    SELECT @rownum:=@rownum+1 rownum, wp_posts.post_modified_gmt 
    FROM (SELECT @rownum:=0) r, wp_posts
        WHERE post_status IN ('publish','inherit')
        AND post_type = 'test'
        ORDER BY post_modified_gmt ASC
    ) x 
WHERE rownum %1000 = 0;

-- compare
explain SELECT post_modified_gmt FROM (
    SELECT @rownum:=@rownum+1 rownum, wp_posts.post_modified_gmt 
    FROM (SELECT @rownum:=0) r, wp_posts force index(type_modified_status)
        WHERE post_status IN ('publish','inherit')
        AND post_type = 'test'
        ORDER BY post_modified_gmt ASC
    ) x 
WHERE rownum %1000 = 0;
[26 May 2016 7:09] Jiří Kavalík
http://sqlfiddle.com/#!9/1cc46/5
[6 Jun 2016 16:09] MySQL Verification Team
Hi Jiri,

Thank you for your bug report. Can you answer my question. How many rows are returned by the query:

SELECT @rownum:=0) r, wp_posts
       WHERE post_status IN ('publish','inherit')
       AND post_type = 'test'
       ORDER BY post_modified_gmt ASC;

Many thanks in advance !!!!
[7 Jun 2016 6:20] Jiří Kavalík
Hi Sinisa,

I am sorry, but I cannot give an exact number for that. I created this test case on behalf of a dba.stackexchange.com user. According to informations they gave the production sites may have 1M+ posts, probably lot of them matching the condition. But from what I can see the testing table has ~20k rows with the coditions matching ~15k.

The post with these informations - http://dba.stackexchange.com/q/139477/64556
[7 Jun 2016 13:29] MySQL Verification Team
Hi!

Your last reply is sufficient.

It is MUCH more efficient to scan the rows in MyISAM , then use the index, unless index covers less then 20 % of the total number of rows.

This is also explained in our manual.
[7 Jun 2016 13:35] Jiří Kavalík
Hello,

well, thats not the "bug" I am reporting.

The problem is that the index IS covering as far as I can see, but it is not used as such!
[7 Jun 2016 14:29] MySQL Verification Team
Hi !

The fact that the index is covering is irrelevant. So, if fully covering index covers more then 20 % of total number of rows, scanning will be used as it is MUCH faster. That is a heuristic number that we came with after months of measurement, some 12 years ago. 

And it is described in our manual.

Not a bug.
[8 Jun 2016 6:49] Jiří Kavalík
Hi,

"The fact that the index is covering is irrelevant" - just NO. In case of a covering index it should be "full-index-scan" and not "full-table-scan. That will be the same at worst and usually better because secondary indexes tend to be smaller.

But the problem with the example I show is that not even full table scan is used, it uses ref access on the covering index, but then does not say "using index" and from the execution times seen it actually really fetches the row from the table which is just wrong. It may slow the query tremendously to get data which were already available.

> explain SELECT post_modified_gmt FROM (
    ->     SELECT @rownum:=@rownum+1 rownum, wp_posts.post_modified_gmt 
    ->     FROM (SELECT @rownum:=0) r, wp_posts
    ->         WHERE post_status IN ('publish','inherit')
    ->         AND post_type = 'test'
    ->         ORDER BY post_modified_gmt ASC
    ->     ) x 
    -> WHERE rownum %1000 = 0;
+------+-------------+------------+--------+---------------------------------------+----------------------+---------+-------+------+----------------+
| id   | select_type | table      | type   | possible_keys                         | key                  | key_len | ref   | rows | Extra          |
+------+-------------+------------+--------+---------------------------------------+----------------------+---------+-------+------+----------------+
|    1 | PRIMARY     | <derived2> | ALL    | NULL                                  | NULL                 | NULL    | NULL  |   10 | Using where    |
|    2 | DERIVED     | <derived3> | system | NULL                                  | NULL                 | NULL    | NULL  |    1 |                |
|    2 | DERIVED     | wp_posts   | ref    | type_status_date,type_modified_status | type_modified_status | 82      | const |   10 | Using where    |
|    3 | DERIVED     | NULL       | NULL   | NULL                                  | NULL                 | NULL    | NULL  | NULL | No tables used |
+------+-------------+------------+--------+---------------------------------------+----------------------+---------+-------+------+----------------+

> explain SELECT post_modified_gmt FROM (
    ->     SELECT @rownum:=@rownum+1 rownum, wp_posts.post_modified_gmt 
    ->     FROM (SELECT @rownum:=0) r, wp_posts use index(type_modified_status)
    ->         WHERE post_status IN ('publish','inherit')
    ->         AND post_type = 'test'
    ->         ORDER BY post_modified_gmt ASC
    ->     ) x 
    -> WHERE rownum %1000 = 0;
+------+-------------+------------+--------+----------------------+----------------------+---------+------+------+--------------------------+
| id   | select_type | table      | type   | possible_keys        | key                  | key_len | ref  | rows | Extra                    |
+------+-------------+------------+--------+----------------------+----------------------+---------+------+------+--------------------------+
|    1 | PRIMARY     | <derived2> | ALL    | NULL                 | NULL                 | NULL    | NULL |   91 | Using where              |
|    2 | DERIVED     | <derived3> | system | NULL                 | NULL                 | NULL    | NULL |    1 |                          |
|    2 | DERIVED     | wp_posts   | index  | type_modified_status | type_modified_status | 172     | NULL |   92 | Using where; Using index |
|    3 | DERIVED     | NULL       | NULL   | NULL                 | NULL                 | NULL    | NULL | NULL | No tables used           |
+------+-------------+------------+--------+----------------------+----------------------+---------+------+------+--------------------------+

> explain SELECT post_modified_gmt FROM (
    ->     SELECT @rownum:=@rownum+1 rownum, wp_posts.post_modified_gmt 
    ->     FROM (SELECT @rownum:=0) r, wp_posts ignore index(type_status_date)
    ->         WHERE post_status IN ('publish','inherit')
    ->         AND post_type = 'test'
    ->         ORDER BY post_modified_gmt ASC
    ->     ) x 
    -> WHERE rownum %1000 = 0;
+------+-------------+------------+--------+----------------------+----------------------+---------+------+------+--------------------------+
| id   | select_type | table      | type   | possible_keys        | key                  | key_len | ref  | rows | Extra                    |
+------+-------------+------------+--------+----------------------+----------------------+---------+------+------+--------------------------+
|    1 | PRIMARY     | <derived2> | ALL    | NULL                 | NULL                 | NULL    | NULL |   91 | Using where              |
|    2 | DERIVED     | <derived3> | system | NULL                 | NULL                 | NULL    | NULL |    1 |                          |
|    2 | DERIVED     | wp_posts   | index  | type_modified_status | type_modified_status | 172     | NULL |   92 | Using where; Using index |
|    3 | DERIVED     | NULL       | NULL   | NULL                 | NULL                 | NULL    | NULL | NULL | No tables used           |
+------+-------------+------------+--------+----------------------+----------------------+---------+------+------+--------------------------+

As you can see the first (imo wrong) explain shows ref access and fetches the row from the heap. The other two show index scan with "using index" (and a bit better rows estimate too). The query is the same so the index is covering and it matters. It mattered in the tests the OP on SE did too. The first case was significantly slower as mysql probably really reads the row even if it does not need to.
It behaves correctly when the other index "type_status_date" is not taken into account and it behaves correctly with InnoDB (no matter the index hints). It might even start to behave with MyISAM after running "analyze table wp_posts;" but what it does in this first case here is just not right.

Sorry for being stubborn here, but from your responses it seems to me you did not get the point I am making (which is a problem of my insufficient explanation so trying to fix it here).
[8 Jun 2016 13:24] MySQL Verification Team
Sufficient information is already provided.
[8 Jun 2016 13:30] Jiří Kavalík
Ok, thank you for looking into it.