Bug #47884 MySQL 5.1.39 failed to use index
Submitted: 7 Oct 2009 3:24 Modified: 29 Jan 2010 16:53
Reporter: Dinh Pham Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.1.39, 5.1.44, 5.5.99 OS:Linux (Linux web2 2.6.18-92.1.18.el5.028stab060.8 #1 SMP Mon Feb 9 21:44:50 MSK 2009 x86_64)
Assigned to: CPU Architecture:Any
Tags: INDEX, MySQL5.1, performance, regression, wrongindex

[7 Oct 2009 3:24] Dinh Pham
Description:
I set up a master-master replication system using MySQL 5.0.45 Enterprise Edition (legacy) and MySQL 5.1.39 Community (new, testing) where MySQL 5.0.45 is used as the master and MySQL 5.1.39 as slave. The purpose that I did not use master-slave replication is to de-attach MySQL 5.0.45 when possible for upgrading. Both MySQL instances run in a separate 64 bit CentOS server.

However, after slow query logging is enabled, I noticed that MySQL 5.1.39 performed poorly on queries whose indexes are not used correctly (by query optimizer of course), compared to MySQL 5.0

Query:

MySQL 5.0.45 (performance is rather consistent)

mysql> SELECT clip.id FROM clip
    -> WHERE clip.available = 1
    -> AND clip.created_time >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK)
    -> AND clip.category_id = 5
    -> ORDER BY clip.comment_count DESC
    -> LIMIT 20;
+--------+
| id     |
+--------+
| 548272 |
| 552610 |
| 552139 |
| 551410 |
| 550582 |
| 549592 |
| 546693 |
| 546458 |
| 546358 |
| 546337 |
| 546329 |
| 546306 |
| 552736 |
+--------+
13 rows in set (0.03 sec)

MySQL 5.1.39 (performance is rather consistent)

mysql> SELECT clip.id FROM clip
    -> WHERE clip.available = 1
    -> AND clip.created_time >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK)
    -> AND clip.category_id = 5
    -> ORDER BY clip.comment_count DESC
    -> LIMIT 20;
+--------+
| id     |
+--------+
| 552736 |
| 552610 |
| 552139 |
| 551410 |
| 550582 |
| 549592 |
| 548272 |
| 546693 |
| 546458 |
| 546358 |
| 546337 |
| 546329 |
| 546306 |
+--------+
13 rows in set (3.45 sec)

EXPLAIN:

MySQL 5.0.45 (it used id_category_evailable INDEX which is correct)

mysql> EXPLAIN SELECT clip.id FROM clip  WHERE clip.available = 1  AND clip.created_time >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK) AND clip.category_id = 5 ORDER BY clip.comment_count DESC LIMIT 20\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: clip
         type: ref
possible_keys: IDX_POSTEDTIME,idx_available,idx_fav_count,idx_hit_count,idx_available_id,idx_category_id,idx_available_comment,id_category_evailable
          key: id_category_evailable
      key_len: 4
          ref: const,const
         rows: 4188
        Extra: Using where; Using filesort
1 row in set (0.00 sec)

MySQL 5.1.39 (it used idx_available_comment INDEX which is not as optimized as in MySQL 5.0)

mysql> EXPLAIN SELECT clip.id FROM clip
    -> WHERE clip.available = 1
    -> AND clip.created_time >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK)
    -> AND clip.category_id = 5
    -> ORDER BY clip.comment_count DESC
    -> LIMIT 20\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: clip
         type: range
possible_keys: IDX_POSTEDTIME,idx_available,idx_fav_count,idx_hit_count,idx_available_id,idx_category_id,idx_available_comment,id_category_evailable
          key: idx_available_comment
      key_len: 2
          ref: NULL
         rows: 473859
        Extra: Using where
1 row in set (0.40 sec)

############

Evaluation:

As you can see, MySQL 5.1.39 use wrong index. As the result, it performed poorly compared to MySQL 5.0

Now I wanted to force MySQL 5.1.39 to use the correct index

mysql> SELECT clip.id FROM clip FORCE INDEX(id_category_evailable)
    -> WHERE clip.available = 1
    -> AND clip.created_time >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK)
    -> AND clip.category_id = 5
    -> ORDER BY clip.comment_count DESC
    -> LIMIT 20;
+--------+
| id     |
+--------+
| 546306 |
| 552610 |
| 552139 |
| 551410 |
| 550582 |
| 549592 |
| 548272 |
| 546693 |
| 546458 |
| 546358 |
| 546337 |
| 546329 |
| 552736 |
+--------+
13 rows in set (0.02 sec)

As you can see the performance is very good

How to repeat:
clip table
----------

CREATE TABLE `clip` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `upload_id` varchar(100) CHARACTER SET latin1 NOT NULL,
  `encoded_id` varchar(10) NOT NULL,
  `title_url` varchar(250) CHARACTER SET latin1 NOT NULL,
  `title` varchar(255) NOT NULL,
  `duration` smallint(5) unsigned DEFAULT '0',
  `thumbnail` varchar(255) DEFAULT NULL,
  `thumb_version` tinyint(3) unsigned NOT NULL DEFAULT '1',
  `member_id` int(10) unsigned NOT NULL DEFAULT '0',
  `username` varchar(45) CHARACTER SET latin1 NOT NULL,
  `category_id` smallint(5) unsigned NOT NULL DEFAULT '0',
  `tag_list` varchar(255) NOT NULL,
  `public` tinyint(3) unsigned DEFAULT '0',
  `point` float(9,2) unsigned DEFAULT '0.00',
  `rating` bigint(20) unsigned DEFAULT '0',
  `hit_count` int(10) unsigned DEFAULT '0',
  `active` tinyint(3) DEFAULT '0',
  `created_time` datetime DEFAULT NULL,
  `description` text,
  `converted` tinyint(3) DEFAULT '0',
  `deleted` tinyint(3) DEFAULT '0',
  `comment_count` int(10) unsigned DEFAULT '0',
  `fav_count` int(10) unsigned DEFAULT '0',
  `share_count` int(10) unsigned DEFAULT '0',
  `apiDevID` int(11) DEFAULT NULL,
  `promoted` tinyint(3) unsigned DEFAULT '0',
  `edited` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `available` tinyint(3) unsigned DEFAULT '0',
  `alert_point` int(11) DEFAULT '0',
  `custom_video_link` varchar(200) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `IDX_POSTEDTIME` (`created_time`),
  KEY `idx_memid` (`member_id`),
  KEY `idx_available` (`available`),
  KEY `idx_converted` (`converted`),
  KEY `idx_active` (`active`),
  KEY `idx_deleted` (`deleted`),
  KEY `idx_encoded_id` (`encoded_id`),
  KEY `idx_fav_count` (`available`,`fav_count`),
  KEY `idx_hit_count` (`available`,`hit_count`),
  KEY `idx_available_id` (`available`,`id`),
  KEY `idx_category_id` (`category_id`),
  KEY `idx_available_comment` (`available`,`comment_count`),
  KEY `idx_share_count` (`share_count`),
  KEY `id_category_evailable` (`category_id`,`available`),
  FULLTEXT KEY `FT_TITLE` (`title`),
  FULLTEXT KEY `FT_TAGS` (`tag_list`),
  FULLTEXT KEY `FT_DESC` (`description`),
  FULLTEXT KEY `FT_TITLE_DESC_TAGS` (`title`,`description`,`tag_list`)
) ENGINE=MyISAM AUTO_INCREMENT=553172 DEFAULT CHARSET=utf8

Index info:
mysql> show indexes from clip;
+-------+------------+-----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name              | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+-----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| clip  |          0 | PRIMARY               |            1 | id            | A         |      543205 |     NULL | NULL   |      | BTREE      |         |
| clip  |          1 | IDX_POSTEDTIME        |            1 | created_time  | A         |      543205 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | idx_memid             |            1 | member_id     | A         |       60356 |     NULL | NULL   |      | BTREE      |         |
| clip  |          1 | idx_available         |            1 | available     | A         |           2 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | idx_converted         |            1 | converted     | A         |           3 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | idx_active            |            1 | active        | A         |           2 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | idx_deleted           |            1 | deleted       | A         |           7 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | idx_encoded_id        |            1 | encoded_id    | A         |      108641 |     NULL | NULL   |      | BTREE      |         |
| clip  |          1 | idx_fav_count         |            1 | available     | A         |           2 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | idx_fav_count         |            2 | fav_count     | A         |         194 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | idx_hit_count         |            1 | available     | A         |           2 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | idx_hit_count         |            2 | hit_count     | A         |       17522 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | idx_available_id      |            1 | available     | A         |           2 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | idx_available_id      |            2 | id            | A         |      543205 |     NULL | NULL   |      | BTREE      |         |
| clip  |          1 | idx_category_id       |            1 | category_id   | A         |          80 |     NULL | NULL   |      | BTREE      |         |
| clip  |          1 | idx_available_comment |            1 | available     | A         |           2 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | idx_available_comment |            2 | comment_count | A         |         297 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | idx_share_count       |            1 | share_count   | A         |           6 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | id_category_evailable |            1 | category_id   | A         |          80 |     NULL | NULL   |      | BTREE      |         |
| clip  |          1 | id_category_evailable |            2 | available     | A         |         160 |     NULL | NULL   | YES  | BTREE      |         |
| clip  |          1 | FT_TITLE              |            1 | title         | NULL      |           1 |     NULL | NULL   |      | FULLTEXT   |         |
| clip  |          1 | FT_TAGS               |            1 | tag_list      | NULL      |           1 |     NULL | NULL   |      | FULLTEXT   |         |
| clip  |          1 | FT_DESC               |            1 | description   | NULL      |           1 |     NULL | NULL   | YES  | FULLTEXT   |         |
| clip  |          1 | FT_TITLE_DESC_TAGS    |            1 | title         | NULL      |           1 |     NULL | NULL   |      | FULLTEXT   |         |
| clip  |          1 | FT_TITLE_DESC_TAGS    |            2 | description   | NULL      |           1 |     NULL | NULL   | YES  | FULLTEXT   |         |
| clip  |          1 | FT_TITLE_DESC_TAGS    |            3 | tag_list      | NULL      |           1 |     NULL | NULL   |      | FULLTEXT   |         |
+-------+------------+-----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
[7 Oct 2009 3:41] Valeriy Kravchuk
This bug looks like a regression related to "fix" for bug #28404. See bug #46011 also as yet another example of this regression. 

The only reason why this may be NOT a duplicate of bug #46011 is the fact that you do not order by PRIMARY KEY here.
[7 Oct 2009 6:36] Dinh Pham
For "ORDER BY comment_count", yes, I don't have any index on the field "comment_count". I have an index "idx_available_comment" on 2 fields: comment_count, available

mysql> explain SELECT clip.id FROM clip
    -> WHERE clip.available = 1
    -> AND clip.created_time >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK)
    -> AND clip.category_id = 5
    -> ORDER BY clip.comment_count DESC
    -> LIMIT 20\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: clip
         type: range
possible_keys: IDX_POSTEDTIME,idx_available,idx_fav_count,idx_hit_count,idx_available_id,idx_category_id,idx_available_comment,id_category_evailable
          key: idx_available_comment
      key_len: 2
          ref: NULL
         rows: 473916
        Extra: Using where
1 row in set (0.36 sec)

MySQL use the index named idx_available_comment that I don't know what operation it is for.
[7 Oct 2009 6:43] Dinh Pham
Now I tried to remove the phrase ORDER BY to see if there is any change in MySQL behavior

mysql> explain SELECT clip.id FROM clip
    -> WHERE clip.available = 1
    -> AND clip.created_time >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK)
    -> AND clip.category_id = 5
    -> LIMIT 20\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: clip
         type: ref
possible_keys: IDX_POSTEDTIME,idx_available,idx_fav_count,idx_hit_count,idx_available_id,idx_category_id,idx_available_comment,id_category_evailable
          key: id_category_evailable
      key_len: 4
          ref: const,const
         rows: 4185
        Extra: Using where
1 row in set (0.00 sec)

It is OK now. Ref const is used and the correct index is used: id_category_evailable 

Do you think that new query optimizer in MySQL 5.1.39 have choose wrong execution plan when ORDER BY is present?
[7 Oct 2009 7:39] Valeriy Kravchuk
Yes, it ORDER BY (+LIMIT maybe) that forces optimizer in recent versions to use index that allows to avoid filesort step. This change in behavior is related to (not entirely correct) fix to bug #28404 we made.

This index, KEY `idx_available_comment` (`available`,`comment_count`), is used to find rows where available=1 and then scan them is order of comment_count. This avoids filesort, but in your case optimizer's assumption about filesort as very slow step and something to avoid is totally wrong. Looks like you just have too many rows with available=1 comparing to total number of rows in table. Please, check.
[7 Oct 2009 8:12] Dinh Pham
> Looks like you just have too many rows with available=1 comparing
> to total number of rows in table. Please, check.

Yes. Here is the detail:

mysql> SELECT COUNT(1) FROM clip WHERE available = 1;
+----------+
| COUNT(1) |
+----------+
|   461937 |
+----------+
1 row in set (0.22 sec)

mysql> SELECT COUNT(1) FROM clip;
+----------+
| COUNT(1) |
+----------+
|   543357 |
+----------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(1) FROM clip WHERE category_id = 5;
+----------+
| COUNT(1) |
+----------+
|    10558 |
+----------+
1 row in set (0.26 sec)

That's reason why I have added an index on 2 fields: category_id and available. This index is used by MySQL 5.0.45 efficiently but not by MySQL 5.1.39

Regarding to the fix to http://bugs.mysql.com/bug.php?id=28404 I can not found any report that it is fixed in

http://dev.mysql.com/doc/refman/5.1/en/news-5-1-39.html
http://dev.mysql.com/doc/refman/5.1/en/news-5-1-38.html
http://dev.mysql.com/doc/refman/5.1/en/news-5-1-37.html
http://dev.mysql.com/doc/refman/5.1/en/news-5-1-36.html

What should I do now to avoid this regression bug?

Thanks
[7 Oct 2009 8:34] Valeriy Kravchuk
Bug #28404 is NOT yet fixed completely and properly. So, surely there is no fix in 5.1.39.

As for immediate solution, just use FORCE INDEX to force proper index usage:

SELECT clip.id FROM clip FORCE INDEX(id_category_evailable)
WHERE clip.available = 1
AND clip.created_time >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK)
AND clip.category_id = 5
ORDER BY clip.comment_count DESC
LIMIT 20;

I can not easily repeat the problem with recent 5.1.40 and my table. Looks like it depends on very specific data distribution. Can you create similar table with only relevant columns (id, available, created_time, category_id and comment_count) and indexes based on current one and upload compressed dump of it to the issue?
[8 Nov 2009 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[8 Nov 2009 13:11] Dinh Pham
Hi,

I have uploaded my sample data to mediafire.com. It can be reached at http://www.mediafire.com/download.php?ygeyzjict4h

I will remove it when you finish downloading.

Sample queries

SELECT clip.id FROM clip
WHERE clip.available = 1
AND clip.created_time >= DATE_SUB(CURDATE(), INTERVAL 8 WEEK)
AND clip.category_id = 5
ORDER BY clip.comment_count DESC
LIMIT 20;

SELECT clip.id FROM clip FORCE INDEX (id_category_evailable)
WHERE clip.available = 1
AND clip.created_time >= DATE_SUB(CURDATE(), INTERVAL 8 WEEK)
AND clip.category_id = 5
ORDER BY clip.comment_count DESC
LIMIT 20;

Thanks
[8 Nov 2009 14:12] Valeriy Kravchuk
I've downloaded the file successfully.
[29 Jan 2010 16:53] Sveta Smirnova
Thank you for the feedback.

Set to "Verified", because bug is repeatable and you don't order by primary key.
[29 Nov 2012 17:27] Sveta Smirnova
See also bug #67589