Bug #23322 | Optimizer sometimes erroniously prefers other index over index merge | ||
---|---|---|---|
Submitted: | 16 Oct 2006 5:50 | Modified: | 31 Aug 2012 11:29 |
Reporter: | Valeriy Kravchuk | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
Version: | 5.0.28-BK, 5.0.27-BK, 5.0.24, 5.1.37-bzr | OS: | Linux (Linux) |
Assigned to: | Assigned Account | CPU Architecture: | Any |
Tags: | bfsm_2006_12_07 |
[16 Oct 2006 5:50]
Valeriy Kravchuk
[17 Oct 2006 17:45]
Igor Babaev
This is not a bug: See our 5.0 manual: [7.2.6. Index Merge Optimization] << Note: The Index Merge optimization algorithm has the following known deficiencies: If a range scan is possible on some key, an Index Merge is not considered. For example, consider this query: SELECT * FROM t1 WHERE (goodkey1 < 10 OR goodkey2 < 20) AND badkey < 30; For this query, two plans are possible: An Index Merge scan using the (goodkey1 < 10 OR goodkey2 < 20) condition. A range scan using the badkey < 30 condition. However, the optimizer considers only the second plan. If that is not what you want, you can make the optimizer consider Index Merge by using IGNORE INDEX or FORCE INDEX. >>
[16 Jul 2009 19:21]
Igor Babaev
Valeriy, For the test case you provided all rows have c4=0. So the optimizer choice for this case looks right. Probably you wanted to have several rows (10-20) with c4 != 0? Igor.
[17 Jul 2009 6:58]
Valeriy Kravchuk
Igor, Indeed, in my original public test case optimizer makes decision that may be as good as merge if not better. We had discussed more complex test case from csutomer later, in private comments. Maybe you are not allowed to see them now. So, I had modified my original test case a bit to show the real problem again: mysql> select version(); +--------------+ | version() | +--------------+ | 5.1.38-debug | +--------------+ 1 row in set (0.00 sec) mysql> create table tm1(c1 int auto_increment primary key, c2 char(20), c3 char -> (20), c4 int); Query OK, 0 rows affected (0.02 sec) mysql> alter table tm1 add key k1 (c2); Query OK, 0 rows affected (0.02 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> alter table tm1 add key k2 (c3); Query OK, 0 rows affected (0.03 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> alter table tm1 add key k3(c4); Query OK, 0 rows affected (0.04 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> insert into tm1 values(null, 'a', 'b', 0); Query OK, 1 row affected (0.00 sec) mysql> insert into tm1 values(null, 'c', 'b', 0); Query OK, 1 row affected (0.00 sec) mysql> insert into tm1 values(null, 'a', 'd', 0); Query OK, 1 row affected (0.00 sec) mysql> insert into tm1 values(null, 'ccc', 'qqq', 0); Query OK, 1 row affected (0.01 sec) mysql> insert into tm1 (c2,c3) select c2,c3 from tm1 where c2 != 'a'; Query OK, 2 rows affected (0.00 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> insert into tm1 (c2,c3) select c2,c3 from tm1 where c2 != 'a'; Query OK, 4 rows affected (0.01 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql> insert into tm1 (c2,c3) select c2,c3 from tm1 where c2 != 'a'; Query OK, 8 rows affected (0.01 sec) Records: 8 Duplicates: 0 Warnings: 0 mysql> insert into tm1 (c2,c3) select c2,c3 from tm1 where c2 != 'a'; Query OK, 16 rows affected (0.01 sec) Records: 16 Duplicates: 0 Warnings: 0 mysql> insert into tm1 (c2,c3) select c2,c3 from tm1 where c2 != 'a'; Query OK, 32 rows affected (0.00 sec) Records: 32 Duplicates: 0 Warnings: 0 mysql> insert into tm1 (c2,c3) select c2,c3 from tm1 where c2 != 'a'; Query OK, 64 rows affected (0.01 sec) Records: 64 Duplicates: 0 Warnings: 0 mysql> insert into tm1 (c2,c3) select c2,c3 from tm1 where c2 != 'a'; Query OK, 128 rows affected (0.04 sec) Records: 128 Duplicates: 0 Warnings: 0 This is where I stopped initially. Now, as you suggested, I am adding some rows where c4 != 0, a lot of them actually: mysql> insert into tm1 (c2,c3,c4) select c2,c3,1 from tm1 where c2 != 'a'; Query OK, 256 rows affected (0.07 sec) Records: 256 Duplicates: 0 Warnings: 0 mysql> insert into tm1 (c2,c3,c4) select c2,c3,2 from tm1 where c2 != 'a'; Query OK, 512 rows affected (0.09 sec) Records: 512 Duplicates: 0 Warnings: 0 mysql> insert into tm1 (c2,c3,c4) select c2,c3,3 from tm1 where c2 != 'a'; Query OK, 1024 rows affected (0.16 sec) Records: 1024 Duplicates: 0 Warnings: 0 mysql> insert into tm1 (c2,c3,c4) select c2,c3,4 from tm1 where c2 != 'a'; Query OK, 2048 rows affected (0.36 sec) Records: 2048 Duplicates: 0 Warnings: 0 mysql> select count(*) from tm1 where (c2='e' OR c3='q'); +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from tm1 where c4 != 0; +----------+ | count(*) | +----------+ | 3840 | +----------+ 1 row in set (0.02 sec) With smaller number of rows with c4 != 0 optimizer may choose a rangle scan on k3, but it istill may be worse that index merge. But this is extreme case: mysql> explain select distinct c1 from tm1 where (c4 != 0) AND (c2='e' OR c3='q'); +----+-------------+-------+------+---------------+------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+-------------+ | 1 | SIMPLE | tm1 | ALL | k1,k2,k3 | NULL | NULL | NULL | 4098 | Using where | +----+-------------+-------+------+---------------+------+---------+------+------+-------------+ 1 row in set (0.00 sec) mysql> explain select distinct c1 from tm1 force index(k1,k2) where (c4 != 0) AND (c2='e' OR c3='q'); +----+-------------+-------+-------------+---------------+-------+---------+------+------+---------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------------+---------------+-------+---------+------+------+---------------------------------+ | 1 | SIMPLE | tm1 | index_merge | k1,k2 | k1,k2 | 21,21 | NULL | 2 | Using union(k1,k2); Using where | +----+-------------+-------+-------------+---------------+-------+---------+------+------+---------------------------------+ 1 row in set (0.00 sec) As you can see, because estimated number of rows for range scan is too big, optimizer just prefers ALL over range, but still ignores muuch more eeficient index merge: mysql> show session status like 'Handler_read%'; +-----------------------+-------+ | Variable_name | Value | +-----------------------+-------+ | Handler_read_first | 0 | | Handler_read_key | 21 | | Handler_read_next | 3879 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 8826 | +-----------------------+-------+ 6 rows in set (0.00 sec) mysql> select distinct c1 from tm1 force index(k1,k2) where (c4 != 0) AND (c2='e' OR c3='q'); Empty set (0.01 sec) mysql> show session status like 'Handler_read%'; +-----------------------+-------+ | Variable_name | Value | +-----------------------+-------+ | Handler_read_first | 0 | | Handler_read_key | 23 | | Handler_read_next | 3879 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 8826 | +-----------------------+-------+ 6 rows in set (0.00 sec) mysql> select distinct c1 from tm1 where (c4 != 0) AND (c2='e' OR c3='q'); Empty set (0.01 sec) mysql> show session status like 'Handler_read%'; +-----------------------+-------+ | Variable_name | Value | +-----------------------+-------+ | Handler_read_first | 0 | | Handler_read_key | 23 | | Handler_read_next | 3879 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 12925 | +-----------------------+-------+ 6 rows in set (0.01 sec) This is what happned to customer. And this is the problem, as full table scan (or biug range scan) will become thousands of times slower eventually in some cases than index_merge.
[31 Aug 2012 11:29]
Jørgen Løland
Duplicate of BUG#30151, fixed in 5.6.6