Bug #17673 Optimizer does not use Index Merge optimization in some cases
Submitted: 23 Feb 2006 14:46 Modified: 30 Aug 2012 12:39
Reporter: Valeriy Kravchuk Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.0.19-BK, 5.0.18, 5.1.39 OS:Linux (Linux)
Assigned to: Jørgen Løland CPU Architecture:Any
Triage: Triaged: D5 (Feature request)

[23 Feb 2006 14:46] Valeriy Kravchuk
Description:
In some even not so complicated cases Index Merge optimization is not used. For example:

mysql> CREATE TABLE tmerge2 (
    ->   `c1` int(11) NOT NULL auto_increment,
    ->   `c2` decimal(10,0) default NULL,
    ->   `c3` decimal(10,0) default NULL,
    ->   `c4` decimal(10,0) default NULL,
    ->   `c5` decimal(10,0) default NULL,
    ->   `cp` decimal(1,0) default NULL,
    ->   `ce` decimal(10,0) default NULL,
    ->   `cdata` char(20),
    ->   PRIMARY KEY  (`c1`),
    ->   KEY `k1` (`c2`,`c3`,`cp`,`ce`),
    ->   KEY `k2` (`c4`,`c5`,`cp`,`ce`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into tmerge2 (c2, c3, c4, c5, cp) values(1,1,1,1,1);
Query OK, 1 row affected (0.01 sec)
...
see "How to repeat"
...

mysql> explain select * from tmerge2 where (c2 = 1 and c3 = 1) or (c4 =2 and c5=1)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tmerge2
         type: index_merge
possible_keys: k1,k2
          key: k1,k2
      key_len: 12,12
          ref: NULL
         rows: 2
        Extra: Using sort_union(k1,k2); Using where
1 row in set (0.00 sec)

mysql> explain select * from tmerge2 where (c2 = 1 and c3 = 1 and cp=1) or (c4=2 and c5=1 and cp=1)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tmerge2
         type: index_merge
possible_keys: k1,k2
          key: k1,k2
      key_len: 14,14
          ref: NULL
         rows: 2
        Extra: Using sort_union(k1,k2); Using where
1 row in set (0.00 sec)

mysql> explain select * from tmerge2 where ((c2 = 1 and c3 = 1) or (c4 =2 and c5=1)) and cp=1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tmerge2
         type: ALL
possible_keys: k1,k2
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 5699
        Extra: Using where
1 row in set (0.00 sec)

The last 2 queries are semantically the same and should be optimized in the same way. Moreover, they are both more "selective" than the first one.

How to repeat:
CREATE TABLE tmerge2 (
`c1` int(11) NOT NULL auto_increment,
`c2` decimal(10,0) default NULL,
`c3` decimal(10,0) default NULL,
`c4` decimal(10,0) default NULL,
`c5` decimal(10,0) default NULL,
`cp` decimal(1,0) default NULL,
`ce` decimal(10,0) default NULL,
`cdata` char(20),
PRIMARY KEY  (`c1`),
KEY `k1` (`c2`,`c3`,`cp`,`ce`),
KEY `k2` (`c4`,`c5`,`cp`,`ce`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

insert into tmerge2 (c2, c3, c4, c5, cp) values(1,1,1,1,1);
insert into tmerge2 (c2, c3, c4, c5, cp) values(2,1,1,1,4);
insert into tmerge2 (c2, c3, c4, c5, cp) values(2,1,2,1,1);
insert into tmerge2 (c2, c3, c4, c5, cp) values(2,1,3,1,4);
insert into tmerge2 (c2, c3, c4, c5, cp) values(3,1,4,1,4);

insert into tmerge2 (c2, c3, c4, c5, cp) select c2, c3, c4, c5, cp from
tmerge2 where cp = 4;
...

I repeated the above until I got:

Query OK, 3072 rows affected (0.12 sec)
Records: 3072  Duplicates: 0  Warnings: 0

Then:

explain select * from tmerge2 where (c2 = 1 and c3 = 1) or (c4 =2 and c5
= 1)\G

explain select * from tmerge2 where (c2 = 1 and c3 = 1 and cp = 1) or (c4
 =2 and c5 = 1 and cp=1)\G

explain select * from tmerge2 where ((c2 = 1 and c3 = 1) or (c4 =2 and c5
 = 1)) and cp = 1\G

Suggested fix:
Add some additional checks for possible (great!) Index Merge optimization.
[24 May 2006 17:46] Sergey Petrunya
This problem is a known limitation of the index_merge optimizer.

http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html

If your query has a complex WHERE clause with deep AND/OR nesting and MySQL doesn't choose the optimal plan, try distributing terms using the following identity laws...
[30 Aug 2012 12:39] Jørgen Løland
Fixed in 5.6.6