Description:
As opposed to the manual, http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html:
"If a range scan is possible on some key, an Index Merge is not considered."
it is not only considered, but also choosed by optimizer, even when range access based on one index scans much less rows, and is simply faster:
openxs@linux:~/dbs/5.0> bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.0.46-debug Source distribution
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> create table tm(col1 int auto_increment primary key,
-> col2 varchar(127),
-> col3 varchar(8),
-> col4 datetime,
-> col5 tinyint,
-> col6 tinyint,
-> key idx_2(col3),
-> key idx_3(col4),
-> key idx_4(col5),
-> key idx_5(col6)) engine=InnoDB;
Query OK, 0 rows affected (0.06 sec)
mysql> insert into tm(col3,col4,col5,col6) values('const', '2007-06-28 07:10:00', 0, 0);
Query OK, 1 row affected (0.01 sec)
mysql> insert into tm(col3,col4,col5,col6) values('const', now(), 1, 2);
Query OK, 1 row affected (0.01 sec)
mysql> insert into tm(col3,col4,col5,col6) values('var', now(), 2, 3);
Query OK, 1 row affected (0.00 sec)
mysql> insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm where col6 <> 0;
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm where col6 <> 0;
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
...
mysql> insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm w
here col6 <> 0;
Query OK, 2048 rows affected (0.22 sec)
Records: 2048 Duplicates: 0 Warnings: 0
mysql> select * from tm where col3='const'
-> and date_sub('2007-06-29 07:10:00', interval 10 second) > col4
-> and col5=0
-> and col6=0;
+------+------+-------+---------------------+------+------+
| col1 | col2 | col3 | col4 | col5 | col6 |
+------+------+-------+---------------------+------+------+
| 1 | NULL | const | 2007-06-28 07:10:00 | 0 | 0 |
+------+------+-------+---------------------+------+------+
1 row in set (0.01 sec)
mysql> explain select * from tm where col3='const' and date_sub('2007-06-29 07:
10:00', interval 10 second) > col4 and col5=0 and col6=0;
+----+-------------+-------+------+-------------------------+-------+---------+-
------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len |
ref | rows | Extra |
+----+-------------+-------+------+-------------------------+-------+---------+-
------+------+-------------+
| 1 | SIMPLE | tm | ref | idx_2,idx_3,idx_4,idx_5 | idx_4 | 2 |
const | 1 | Using where |
+----+-------------+-------+------+-------------------------+-------+---------+-
------+------+-------------+
1 row in set (0.00 sec)
So, now, with single row, it is OK.
mysql> analyze table tm;
+---------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+---------+---------+----------+----------+
| test.tm | analyze | status | OK |
+---------+---------+----------+----------+
1 row in set (0.01 sec)
mysql> explain select * from tm where col3='const' and date_sub('2007-06-29 07:
10:00', interval 10 second) > col4 and col5=0 and col6=0;
+----+-------------+-------+------+-------------------------+-------+---------+-
------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len |
ref | rows | Extra |
+----+-------------+-------+------+-------------------------+-------+---------+-
------+------+-------------+
| 1 | SIMPLE | tm | ref | idx_2,idx_3,idx_4,idx_5 | idx_4 | 2 |
const | 1 | Using where |
+----+-------------+-------+------+-------------------------+-------+---------+-
------+------+-------------+
1 row in set (0.00 sec)
Let's contnue:
mysql> insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm;
Query OK, 4097 rows affected (0.38 sec)
Records: 4097 Duplicates: 0 Warnings: 0
mysql> insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm;
Query OK, 8194 rows affected (0.62 sec)
Records: 8194 Duplicates: 0 Warnings: 0
So, now we have 4 rows for that WHERE. Let's check again:
mysql> explain select * from tm where col3='const' and date_sub('2007-06-29 07:
10:00', interval 10 second) > col4 and col5=0 and col6=0;
+----+-------------+-------+-------------+-------------------------+------------
-+---------+------+------+-------------------------------------------+
| id | select_type | table | type | possible_keys | key
| key_len | ref | rows | Extra |
+----+-------------+-------+-------------+-------------------------+------------
-+---------+------+------+-------------------------------------------+
| 1 | SIMPLE | tm | index_merge | idx_2,idx_3,idx_4,idx_5 | idx_4,idx_5
| 2,2 | NULL | 1 | Using intersect(idx_4,idx_5); Using where |
+----+-------------+-------+-------------+-------------------------+------------
-+---------+------+------+-------------------------------------------+
1 row in set (0.00 sec)
mysql> select * from tm where col3='const' and date_sub('2007-06-29 07:10:00',
interval 10 second) > col4 and col5=0 and col6=0;
+-------+------+-------+---------------------+------+------+
| col1 | col2 | col3 | col4 | col5 | col6 |
+-------+------+-------+---------------------+------+------+
| 1 | NULL | const | 2007-06-28 07:10:00 | 0 | 0 |
| 4098 | NULL | const | 2007-06-28 07:10:00 | 0 | 0 |
| 8195 | NULL | const | 2007-06-28 07:10:00 | 0 | 0 |
| 12292 | NULL | const | 2007-06-28 07:10:00 | 0 | 0 |
+-------+------+-------+---------------------+------+------+
4 rows in set (0.01 sec)
So, now index merge (intersect) is used. Somehow optmizer decided that only one row will satisfy all conditions, while we have 4 that satisfy each one of them + we need some time and resources for merging!
The main problem, formally, is that optimizer does NOT work as described in the manual, though. We do have index that can be used for range scan:
mysql> explain select * from tm force index(idx_3) where col3='const' and date_
sub('2007-06-29 07:10:00', interval 10 second) > col4 and col5=0 and col6=0;
+----+-------------+-------+-------+---------------+-------+---------+------+---
---+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | ro
ws | Extra |
+----+-------------+-------+-------+---------------+-------+---------+------+---
---+-------------+
| 1 | SIMPLE | tm | range | idx_3 | idx_3 | 9 | NULL |
3 | Using where |
+----+-------------+-------+-------+---------------+-------+---------+------+---
---+-------------+
1 row in set (0.00 sec)
and in reality this access path is the best. Running ANALYZE/CHECK etc does not help.
How to repeat:
create table tm(
col1 int auto_increment primary key,
col2 varchar(127),
col3 varchar(8),
col4 datetime,
col5 tinyint,
col6 tinyint,
key idx_2(col3),
key idx_3(col4),
key idx_4(col5),
key idx_5(col6)) engine=InnoDB;
insert into tm(col3,col4,col5,col6) values('const', '2007-06-28 07:10:00', 0, 0);
insert into tm(col3,col4,col5,col6) values('const', now(), 1, 2);
insert into tm(col3,col4,col5,col6) values('var', now(), 2, 3);
insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm where col6 <> 0;
insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm where col6 <> 0;
insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm where col6 <> 0;
insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm where col6 <> 0;
insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm where col6 <> 0;
insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm where col6 <> 0;
insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm where col6 <> 0;
insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm where col6 <> 0;
insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm where col6 <> 0;
insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm where col6 <> 0;
insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm where col6 <> 0;
explain select * from tm where col3='const' and date_sub('2007-06-29 07:10:00', interval 10 second) > col4 and col5=0 and col6=0;
insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm;
insert into tm(col3,col4,col5,col6) select col3,col4,col5,col6 from tm;
explain select * from tm where col3='const' and date_sub('2007-06-29 07:10:00', interval 10 second) > col4 and col5=0 and col6=0;
explain select * from tm force index(idx_3) where col3='const' and date_sub('2007-06-29 07:10:00', interval 10 second) > col4 and col5=0 and col6=0;
Suggested fix:
Do it as described in the manual?
Calculate rows/costs properly for index_merge?