Bug #67589 Picking a Bad Index to avoid a tiny FileSort when using an Order BY
Submitted: 14 Nov 2012 18:01 Modified: 29 Nov 2012 18:53
Reporter: Eric Hernandez Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1 OS:Linux
Assigned to: CPU Architecture:Any
Tags: order by

[14 Nov 2012 18:01] Eric Hernandez
Description:
Hi, 
The MySQL 5.1.65 optimizer is picking a poor plan when using an "ORDER BY" in a correlated subquery.
The 5.1 optimizer is picking an index that eliminates a "FILE SORT" over an index that would filter rows for the correlated subquery. 

On MySQL 5.0.58 it picks the correct index that restricts rows for the correlated subquery and does not try to eliminate a tiny filesort. 

In my examples I am working with a small data and both queries run sub second.  But in production when this bug manifested itself on a much larger dataset the query went from sub second to over an hour and ultimately had to be killed because of MyISAM table locks.  We had to scramble and change our code to use a force index.

Is this a regression bug?

How to repeat:
# How to duplicate bug on 5.1
#Server version:		5.1.65 MySQL Community Server (GPL)
 
DROP TABLE IF EXISTS alpha;
CREATE TABLE `alpha` (
  `a_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`a_id`)
) ENGINE=MyISAM ;

DROP TABLE IF EXISTS bravo;
CREATE TABLE `bravo` (
  `b_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `a_id` int(10) unsigned NOT NULL DEFAULT '0',
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`b_id`),
  KEY `idx_a_id` (`a_id`)
) ENGINE=MyISAM  ;
 

DROP PROCEDURE IF EXISTS dorepeat; 
delimiter //
CREATE PROCEDURE dorepeat(p1 INT) BEGIN SET @x = 0; REPEAT SET @x = @x +1; INSERT INTO alpha VALUES (NULL);  UNTIL @x > p1 END REPEAT; END//
delimiter ; 
 
call dorepeat(10000000);
 
INSERT INTO bravo (a_id) SELECT a_id FROM alpha; INSERT INTO bravo (a_id) SELECT a_id FROM alpha; 
 
ANALYZE TABLE alpha; ANALYZE TABLE bravo;

SHOW INDEXES FROM alpha;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| alpha |          0 | PRIMARY  |            1 | a_id        | A         |    10000001 |     NULL | NULL   |      | BTREE      |         | 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

SHOW INDEXES FROM bravo;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| bravo |          0 | PRIMARY  |            1 | b_id        | A         |    20000002 |     NULL | NULL   |      | BTREE      |         | 
| bravo |          1 | idx_a_id |            1 | a_id        | A         |    10000001 |     NULL | NULL   |      | BTREE      |         | 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

 
EXPLAIN SELECT SQL_NO_CACHE a.a_id, (SELECT b.ts FROM bravo AS b WHERE b.a_id = a.a_id ORDER BY b_id LIMIT 1) AS ts FROM alpha AS a WHERE a_id > 999999 LIMIT 1 \G
 
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: a
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 8999571
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: b
         type: index
possible_keys: idx_a_id
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 1
        Extra: Using where
2 rows in set (0.01 sec)

SELECT SQL_NO_CACHE a.a_id, (SELECT b.ts FROM bravo AS b WHERE b.a_id = a.a_id ORDER BY b_id LIMIT 1) AS ts FROM alpha AS a WHERE a_id > 999999 LIMIT 1 \G
*************************** 1. row ***************************
a_id: 1000000
  ts: 2012-11-14 08:58:32
1 row in set (0.61 sec)

EXPLAIN SELECT SQL_NO_CACHE a.a_id, (SELECT b.ts FROM bravo AS b FORCE INDEX (idx_a_id) WHERE b.a_id = a.a_id ORDER BY b_id LIMIT 1) AS ts FROM alpha AS a WHERE a_id > 999999 LIMIT 1 \G

*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: a
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 8999571
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: b
         type: ref
possible_keys: idx_a_id
          key: idx_a_id
      key_len: 4
          ref: test.a.a_id
         rows: 2
        Extra: Using where; Using filesort
2 rows in set (0.00 sec)

SELECT SQL_NO_CACHE a.a_id, (SELECT b.ts FROM bravo AS b FORCE INDEX (idx_a_id) WHERE b.a_id = a.a_id ORDER BY b_id LIMIT 1) AS ts FROM alpha AS a WHERE a_id > 999999 LIMIT 1 \G
*************************** 1. row ***************************
a_id: 1000000
  ts: 2012-11-14 08:58:32
1 row in set (0.01 sec)

### Same query on 

Server version:		5.0.58-enterprise-gpl MySQL Enterprise Server (GPL)

EXPLAIN SELECT SQL_NO_CACHE a.a_id, (SELECT b.ts FROM bravo AS b WHERE b.a_id = a.a_id ORDER BY b_id LIMIT 1) AS ts FROM alpha AS a WHERE a_id > 999999 LIMIT 1 \G

*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: a
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 8999571
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: b
         type: ref
possible_keys: idx_a_id
          key: idx_a_id
      key_len: 4
          ref: test.a.a_id
         rows: 2
        Extra: Using where; Using filesort
2 rows in set (0.00 sec)

(db131.tss:test:12)$ SELECT SQL_NO_CACHE a.a_id, (SELECT b.ts FROM bravo AS b WHERE b.a_id = a.a_id ORDER BY b_id LIMIT 1) AS ts FROM alpha AS a WHERE a_id > 999999 LIMIT 1 \G
*************************** 1. row ***************************
a_id: 1000000
  ts: 2012-11-14 09:09:08
1 row in set (0.00 sec)
[29 Nov 2012 18:53] Sveta Smirnova
Thank you for the report.

This is fixed in version 5.6.8. Please upgrade.