Bug #67432 Serious performance degradation for the query with ORDER BY ... LIMIT n
Submitted: 31 Oct 2012 5:46 Modified: 11 Dec 2012 1:49
Reporter: Igor Babaev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.6.7 OS:Any
Assigned to: Jørgen Løland CPU Architecture:Any

[31 Oct 2012 5:46] Igor Babaev
Description:
A serious performance degradation can be observed for queries with
 ORDER BY a LIMIT n
when there is an index on the field a and a range condition over field a,
but at the same time optimizer chooses an index-merge scan.

This bug was introduced in 5.6.7  in an attempt to make a cost-based choice between index scans and index-merge scans.
5.5 and prior releases of 5.6 are not affected.

How to repeat:
Take the database from the dump file bug637962.dump attached to the report
https://bugs.launchpad.net/maria/+bug/637962.
Upload the table table100000_myisam_int_autoinc from this file.

Run the query:
SELECT `col_varchar_64_key` FROM `table100000_myisam_int_autoinc`
WHERE ( `col_varchar_64_key` NOT IN ( 'now' , 'rsgxnnowvz' ) OR `col_varchar_64_key` LIKE CONCAT ('Utah' , '%' ) )
AND ( `col_varchar_10_key` BETWEEN 'cr' AND 'really' OR `col_varchar_64_key` IN ( 'j' , 'rcrsgxnn' ) )
AND ( ( `col_varchar_10_key` != 'it' ) OR `col_varchar_10_key` IS NULL )
ORDER BY `col_varchar_64_key` LIMIT 7;

The optimizer chooses a plan with an index-merge scan:
mysql> EXPLAIN
    -> SELECT `col_varchar_64_key` FROM `table100000_myisam_int_autoinc`
    -> WHERE ( `col_varchar_64_key` NOT IN ( 'now' , 'rsgxnnowvz' ) OR `col_varchar_64_key` LIKE CONCAT ('Utah' , '%' ) )
    -> AND ( `col_varchar_10_key` BETWEEN 'cr' AND 'really' OR `col_varchar_64_key` IN ( 'j' , 'rcrsgxnn' ) )
    -> AND ( ( `col_varchar_10_key` != 'it' ) OR `col_varchar_10_key` IS NULL )
    -> ORDER BY `col_varchar_64_key` LIMIT 7 \G

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: table100000_myisam_int_autoinc
         type: index_merge
possible_keys: col_varchar_10_key,col_varchar_64_key
          key: col_varchar_10_key,col_varchar_64_key
      key_len: 13,67
          ref: NULL
         rows: 54453
        Extra: Using sort_union(col_varchar_10_key,col_varchar_64_key); Using where; Using filesort
1 row in set (0.02 sec)

The query is executed about 0.5-1.0 sec.

Suppressing index merge with optimizer_switch='index_merge=off' one has
the execution time about 0.01 sec.

Suggested fix:
When making a cost-based choice between an index scan and an index-merge scan the ORDER BY ... LIMIT n clause must be taken into account.
[5 Nov 2012 8:34] Jørgen Løland
Hi Igor,

Thank you for the bug report. Reproduced using the following simplified test script:

create table t1 (
  pk int primary key auto_increment,
  i int,
  j int,
  k int,
  index (i),
  index (j),
  index (k)
) engine=myisam;

insert into t1 (i,j,k) values (1,1,1);

let $1=12;
set @d=1;
while ($1)
{
  eval insert into t1 (i,j,k) select i+@d, j+@d, k+@d from t1;
  eval set @d=@d*2;
  dec $1;
}

explain select * from t1 where i<1000 and (j<100 or k < 100) order by i limit 5;
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       SIMPLE  t1      index_merge     i,j,k   j,k     5,5     NULL    196     Using sort_union(j,k); Using where; Using filesort
[11 Dec 2012 1:49] Paul DuBois
Noted in 5.6.10, 5.7.1 changelogs.

The optimizer sometimes chose a nonoptimimal range scan strategy when
a query included a LIMIT clause.