MySQL Bugs: #30151: optimizer is very reluctant to chose index

Bug #30151	optimizer is very reluctant to chose index_merge algorithm
Submitted:	31 Jul 2007 15:25	Modified:	30 Aug 2012 12:28
Reporter:	Axel Schwenke	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S4 (Feature request)
Version:	5.0.48-bk, 5.1.34	OS:	Any
Assigned to:	Jørgen Løland	CPU Architecture:	Any
Tags:	bfsm_2007_08_02

Description:
SELECT * FROM table WHERE (uniq1='foo' OR uniq2='bar') AND idx3='baz'

Even with UNIQUE columns uniq1 and uniq2 and a very bad index on idx3 (only 2 different values in 500.000 rows) the optimizer choses the index on idx3. This makes the query much slower.

Below is a completely synthetic example. In the real world issue columns uniq1 and uniq2 are *nearly* unique and the table is the first table in a JOIN. The execution time drops from ~10 seconds to 0 when the bad index is ignored.

I tested with both InnoDB and MyISAM - same results. Also I testetd with none, one and both of (uniq1, uniq2) declared UNIQUE. In the example below both are declared UNIQUE.

How to repeat:
drop table if exists t1;
create table t1 (id int unsigned auto_increment primary key, c1 char(12), c2 char(15), c3 char(1)) engine innodb;
insert into t1 (c3) values ('1'), ('2');
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;
insert into t1 (c3) select c3 from t1;

update t1 set c1=lpad(id+1000, 12, ' '), c2=lpad(id+10000, 15, ' ');
alter table t1 add unique index (c1), add unique index (c2), add index (c3);

analyze table t1;

select *, 'default' plan from t1 where (c1='      100000' or c2='         2000000') and c3='2';
+-------+--------------+-----------------+------+---------+
| id    | c1           | c2              | c3   | plan    |
+-------+--------------+-----------------+------+---------+
| 99000 |       100000 |          109000 | 2    | default | 
+-------+--------------+-----------------+------+---------+
1 row in set (0,93 sec)

select *, 'ignore index (c3)' plan from t1 ignore index (c3) where (c1='      100000' or c2='         2000000') and c3='2';
+-------+--------------+-----------------+------+-------------------+
| id    | c1           | c2              | c3   | plan              |
+-------+--------------+-----------------+------+-------------------+
| 99000 |       100000 |          109000 | 2    | ignore index (c3) | 
+-------+--------------+-----------------+------+-------------------+
1 row in set (0,00 sec)

Suggested fix:
Calculate correct query costs and chose the better plan

test case

Attachment: testcase.sql (application/octet-stream, text), 1.46 KiB.

- This is a feature request: a new functionality is required.
- The customer did not specify what he actually needs in general.

By the above reasons I move the bug to 'To be fixed later' and mark it as a 'Feature request'.
Product management will decide in what version this optimization appears.

Fixed in 5.6.6