Bug #25407 | MySQL prefers UNIQUE index for IS NULL conditions over more selective conditions | ||
---|---|---|---|
Submitted: | 4 Jan 2007 10:31 | Modified: | 8 Feb 2007 18:09 |
Reporter: | Kai Ruhnau | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
Version: | 5.0.34-BK, 5.0.26 | OS: | Linux (Gentoo Linux) |
Assigned to: | Igor Babaev | CPU Architecture: | Any |
[4 Jan 2007 10:31]
Kai Ruhnau
[4 Jan 2007 12:15]
Valeriy Kravchuk
Thank you for a bug report. Verified just as dfescribed with latest 5.0.34-BK on Linux: mysql> select version(); +--------------+ | version() | +--------------+ | 5.0.34-debug | +--------------+ 1 row in set (0.00 sec) mysql> CREATE TABLE my_table ( -> ID_with_null INT(10) NULL DEFAULT NULL, -> ID_better INT(10) NOT NULL, -> UNIQUE INDEX (ID_with_null), -> INDEX (ID_better) -> ); Query OK, 0 rows affected (0.02 sec) mysql> insert into my_table values(1, 1); Query OK, 1 row affected (0.00 sec) mysql> insert into my_table values(2, 1); Query OK, 1 row affected (0.00 sec) mysql> insert into my_table values(NULL, 3); Query OK, 1 row affected (0.00 sec) mysql> insert into my_table select NULL, 3 from my_table; Query OK, 3 rows affected (0.02 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> insert into my_table select * from my_table where ID_with_null IS NULL; Query OK, 4 rows affected (0.01 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql> insert into my_table select * from my_table where ID_with_null IS NULL; Query OK, 8 rows affected (0.01 sec) Records: 8 Duplicates: 0 Warnings: 0 ... mysql> insert into my_table select * from my_table where ID_with_null IS NULL; Query OK, 16384 rows affected (0.91 sec) Records: 16384 Duplicates: 0 Warnings: 0 mysql> analyze table my_table; +---------------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +---------------+---------+----------+----------+ | test.my_table | analyze | status | OK | +---------------+---------+----------+----------+ 1 row in set (0.03 sec) mysql> SELECT COUNT(*) FROM my_table WHERE ID_with_null IS NULL; +----------+ | COUNT(*) | +----------+ | 32768 | +----------+ 1 row in set (0.06 sec) mysql> SELECT COUNT(*) FROM my_table WHERE ID_better=1; +----------+ | COUNT(*) | +----------+ | 2 | +----------+ 1 row in set (0.00 sec) mysql> EXPLAIN SELECT * FROM my_table WHERE ID_better=1 AND ID_with_null IS NUL L; +----+-------------+----------+------+------------------------+--------------+-- -------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | k ey_len | ref | rows | Extra | +----+-------------+----------+------+------------------------+--------------+-- -------+-------+------+-------------+ | 1 | SIMPLE | my_table | ref | ID_with_null,ID_better | ID_with_null | 5 | const | 1 | Using where | +----+-------------+----------+------+------------------------+--------------+-- -------+-------+------+-------------+ 1 row in set (0.00 sec) mysql> SELECT * FROM my_table WHERE ID_better=1 AND ID_with_null IS NULL; Empty set (0.18 sec) mysql> SELECT * FROM my_table FORCE INDEX (ID_better) WHERE ID_better=1 AND ID_ with_null IS NULL; Empty set (0.01 sec) So, optimizer's decision to use that UNIQUE index is obviously non-optimal.
[31 Jan 2007 2:24]
Igor Babaev
We have a wrong choice of indexes in the reported test case due to a wrong estimate for the number of rows with ID_with_null=NULL (what we can we see from the EXPLAIN output). This wrong estimate, in its turn, is the result of an assumption that any key may occur in an unique index only once, which is not true for keys with nulls that can have multiple occurrences in the index. This bug is present in 4.0/4.1 code as well. However, it won't be fixed there as the fix affects the choice of the execution plan for some queries.
[31 Jan 2007 3:27]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/19059 ChangeSet@1.2392, 2007-01-30 19:30:08-08:00, igor@olga.mysql.com +5 -0 Fixed bug #25407. The bug coud cause choosing a sub-optimal execution plan for a single-table query if a unuque index with many null keys were defined for the table. It happened because the code of the check_quick_keys function made an assumption that any key may occur in an unique index only once. Yet this is not true for keys with nulls that may have multiple occurrences in the index.
[1 Feb 2007 1:09]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/19148 ChangeSet@1.2396, 2007-01-31 17:12:16-08:00, igor@olga.mysql.com +4 -0 Fixed bug #25407. The bug could cause choosing a sub-optimal execution plan for a single-table query if a unique index with many null keys were defined for the table. It happened because the code of the check_quick_keys function made an assumption that any key may occur in an unique index only once. Yet this is not true for keys with nulls that may have multiple occurrences in the index.
[1 Feb 2007 3:28]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/19151 ChangeSet@1.2396, 2007-01-31 19:31:36-08:00, igor@olga.mysql.com +4 -0 Fixed bug #25407. The bug could cause choosing a sub-optimal execution plan for a single-table query if a unique index with many null keys were defined for the table. It happened because the code of the check_quick_keys function made an assumption that any key may occur in an unique index only once. Yet this is not true for keys with nulls that may have multiple occurrences in the index.
[3 Feb 2007 6:14]
Igor Babaev
The fix has been pushed into 5.0.36, 5.1.16-beta main trees.
[8 Feb 2007 18:09]
Paul DuBois
Noted in 5.0.36, 5.1.16 changelogs. For a UNIQUE index containing many NULL values, the optimizer would prefer the index for col IS NULL conditions over other more selective indexes.
[6 Mar 2007 11:25]
Jochen Riehm
How does this cooperate with the setting of myisam_stats_method 'ignore_null'? Is feature request Bug 19327 fixed with this Bug as well, since the optimizer needs to count NULL values separately to not choose the mentioned index?