Bug #22855 | Optimizer doesn't rewrite NOT IN subselects to a correlated subquery | ||
---|---|---|---|
Submitted: | 30 Sep 2006 9:36 | Modified: | 18 Jun 2007 16:02 |
Reporter: | [ name withheld ] | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 4.1.13, 4.1.22BK, 5.0.40-BK, 5.1.18-beta-BK | OS: | Linux (SUSE Linux 10.0, WinXP) |
Assigned to: | Georgi Kodinov | CPU Architecture: | Any |
[30 Sep 2006 9:36]
[ name withheld ]
[30 Sep 2006 12:12]
Tonci Grgin
Hi and thanks for your bug report. I was able to verify it on latest 4.1.22BK. Please use one of the provided (and documented) workarounds. Primary keys have nothing to with this. By dropping them you are directing optimizer on different execution path (table scan). Your MySQL connection id is 1 to server version: 4.1.22-log Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> create table t1 ( course_id int(10) not null, primary key (course_id)); Query OK, 0 rows affected (0.03 sec) mysql> create table t2 ( course_id int(10) unsigned not null, primary key (cours e_id)); Query OK, 0 rows affected (0.00 sec) mysql> insert into t1 values (1),(-1),(-2),(2),(3),(4),(5),(6),(7),(8),(65),(66) ,(67),(68); Query OK, 14 rows affected (0.00 sec) Records: 14 Duplicates: 0 Warnings: 0 mysql> insert into t2 values (65),(66),(67); Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> select distinct course_id from t1 where course_id not in (65,66,67); +-----------+ | course_id | +-----------+ | -2 | | -1 | | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | | 7 | | 8 | | 68 | +-----------+ 11 rows in set (0.00 sec) mysql> select distinct course_id from t1 where course_id not in (SELECT course_id FROM t2); Empty set (0.00 sec) Workaround 1 (rewriting as left join, preferred way): mysql> select distinct t1.course_id from t1 left join t2 on t1.course_id=t2.course_id where t2.course_id is null; +-----------+ | course_id | +-----------+ | -2 | | -1 | | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | | 7 | | 8 | | 68 | +-----------+ 11 rows in set (0.00 sec) Workaround 2 (working instead of optimizer, writing correlated subquery): mysql> select distinct t1.course_id from t1 -> where not exists ( -> select t2.course_id from t2 -> where t2.course_id=t1.course_id -> ); +-----------+ | course_id | +-----------+ | -2 | | -1 | | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | | 7 | | 8 | | 68 | +-----------+ 11 rows in set (0.00 sec) mysql> Related manual pages: http://dev.mysql.com/doc/refman/4.1/en/rewriting-subqueries.html http://dev.mysql.com/doc/refman/4.1/en/subquery-restrictions.html ... The problem is that, for a statement that uses an IN subquery, the optimizer rewrites it as a correlated subquery. Consider the following statement that uses an uncorrelated subquery: SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2); The optimizer rewrites the statement to a correlated subquery: SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a); ...
[11 Apr 2007 15:58]
Georgi Kodinov
Recreated with 5.0.40-BK and 5.1.18-beta-BK. I don't see the original problem anymore (i.e. the empty set whe using NOT IN (SELECT ...)). But I see missing rows (the negative rows). Here's the reduced test case: + CREATE TABLE t1 (a INT NOT NULL); + INSERT INTO t1 VALUES (1),(-1), (65),(66); + CREATE TABLE t2 (a INT UNSIGNED NOT NULL PRIMARY KEY); + INSERT INTO t2 VALUES (65),(66); + SELECT a FROM t1 WHERE a NOT IN (65,66); + a + 1 + -1 + SELECT a FROM t1 WHERE a NOT IN (SELECT a FROM t2); + a + 1 + SELECT a FROM t1 WHERE NOT EXISTS (SELECT a FROM t2 WHERE t2.a = t1.a); + a + 1 + -1 + EXPLAIN + SELECT a FROM t1 WHERE a NOT IN (SELECT a FROM t2); + id select_type table type possible_keys key key_len ref rowExtra + 1 PRIMARY t1 ALL NULL NULL NULL NULL 4 Using where + 2 DEPENDENT SUBQUERY t2 unique_subquery PRIMARY PRIMARY 4 funUsing index + EXPLAIN + SELECT a FROM t1 WHERE NOT EXISTS (SELECT a FROM t2 WHERE t2.a = t1.a); + id select_type table type possible_keys key key_len ref rowExtra + 1 PRIMARY t1 ALL NULL NULL NULL NULL 4 Using where + 2 DEPENDENT SUBQUERY t2 eq_ref PRIMARY PRIMARY 4 t1.a 1 Using where; Using index
[12 Apr 2007 15:18]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/24393 ChangeSet@1.2445, 2007-04-12 18:17:38+03:00, gkodinov@macbook.gmz +3 -0 Bug#22855: Corrected the handling of conversion errors and NULL key values in an IN predicate that is resolved by unique_subquery (through subselect_uniquesubquery_engine). See code comments for further explanations.
[16 May 2007 10:06]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/26799 ChangeSet@1.2445, 2007-05-16 13:05:27+03:00, gkodinov@macbook.gmz +3 -0 Bug#22855: Conversion errors when constructing the condition for an IN predicates were treated as if the affected column contains NULL. In such cases the IN predicate returns NULL. This is not a problem in the IN predicate is a top-level predicate (where NULL is treated as FALSE), but causes an enclosing NOT predicate to also return NULL. And since this means FALSE if the NOT predicate is top level we get wrong results. Corrected the handling of conversion errors and NULL key values in an IN predicate that is resolved by unique_subquery (through subselect_uniquesubquery_engine). See code comments for further explanations.
[17 May 2007 16:39]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/26916 ChangeSet@1.2445, 2007-05-17 19:38:34+03:00, gkodinov@macbook.gmz +3 -0 Bug#22855: Conversion errors when constructing the condition for an IN predicates were treated as if the affected column contains NULL. If such a IN predicate is inside NOT we get wrong results. Corrected the handling of conversion errors in an IN predicate that is resolved by unique_subquery (through subselect_uniquesubquery_engine).
[4 Jun 2007 21:20]
Bugs System
Pushed into 5.1.20-beta
[4 Jun 2007 21:22]
Bugs System
Pushed into 5.0.44
[18 Jun 2007 16:02]
Paul DuBois
Noted in 5.0.44, 5.1.20 changelogs.