Bug #48177 | SELECTs with NOT IN subqueries containing NULL values return too many records | ||
---|---|---|---|
Submitted: | 20 Oct 2009 9:02 | Modified: | 18 Dec 2009 13:18 |
Reporter: | Jørgen Løland | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
Version: | 5.0.88, 5.1, 6.0 | OS: | Any |
Assigned to: | Jørgen Løland | CPU Architecture: | Any |
[20 Oct 2009 9:02]
Jørgen Løland
[20 Oct 2009 9:06]
Jørgen Løland
This bug is probably the root problem of BUG#47694
[20 Oct 2009 9:14]
Valeriy Kravchuk
Verified just as described, also - with 5.0.88: 77-52-242-160:5.0 openxs$ bin/mysql -uroot test Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 3 Server version: 5.0.88-debug Source distribution Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> drop table t1; Query OK, 0 rows affected (0.08 sec) mysql> drop table t2; Query OK, 0 rows affected (0.03 sec) mysql> CREATE TABLE t1 ( -> pk int NOT NULL, -> i1_key int DEFAULT NULL, -> i1_nokey int DEFAULT NULL, -> i2_nokey int DEFAULT NULL, -> v varchar(1) DEFAULT NULL, -> PRIMARY KEY (pk), -> KEY (i1_key) -> ) ; Query OK, 0 rows affected (0.00 sec) mysql> mysql> INSERT INTO t1 VALUES (3, NULL, NULL, 1, 'a'), -> (4, 3, 3, 2, 'b'), -> (6, 5, 5, 3, 'c'), -> (7, NULL, NULL, 4, 'b'), -> (8, 1, 1, 5, 'd'), -> (9, 2, 2, 6, 'e'); Query OK, 6 rows affected (0.00 sec) Records: 6 Duplicates: 0 Warnings: 0 mysql> mysql> CREATE TABLE t2 ( -> pk int NOT NULL, -> i1_key int DEFAULT NULL, -> i2_key int DEFAULT NULL, -> i2_nokey int DEFAULT NULL, -> PRIMARY KEY (pk), -> KEY (i1_key), -> KEY (i2_key) -> ) ; Query OK, 0 rows affected (0.00 sec) mysql> mysql> INSERT INTO t2 VALUES (1, NULL, 3, 3), -> (2, NULL, 0, 0), -> (5, NULL, 4, 4), -> (7, 0, 5, 5), -> (8, 2, 6, 6), -> (9, 8, 3, 3); Query OK, 6 rows affected (0.00 sec) Records: 6 Duplicates: 0 Warnings: 0 mysql> mysql> SELECT i2_nokey, i1_nokey, i1_key, v -> FROM t1 -> WHERE (i1_nokey, i2_nokey) -> NOT IN (SELECT i1_key, i2_nokey FROM t2) -> ORDER BY pk; +----------+----------+--------+------+ | i2_nokey | i1_nokey | i1_key | v | +----------+----------+--------+------+ | 1 | NULL | NULL | a | | 2 | 3 | 3 | b | | 4 | NULL | NULL | b | | 5 | 1 | 1 | d | +----------+----------+--------+------+ 4 rows in set (0.02 sec) mysql> SELECT i2_nokey, i1_nokey, i1_key, v -> FROM t1 -> WHERE (i1_nokey) -> NOT IN (SELECT i1_key FROM t2 WHERE t1.i2_nokey=t2.i2_nokey) -> ORDER BY pk; +----------+----------+--------+------+ | i2_nokey | i1_nokey | i1_key | v | +----------+----------+--------+------+ | 1 | NULL | NULL | a | | 2 | 3 | 3 | b | | 5 | 1 | 1 | d | +----------+----------+--------+------+ 3 rows in set (0.00 sec)
[21 Oct 2009 13:38]
Jørgen Løland
Simpler-to-read explain output from mysql-6.0-codebase with semijoin and materialization off: EXPLAIN EXTENDED SELECT i2_nokey, i1_nokey, i1_key, v FROM t1 WHERE (i1_nokey, i2_nokey) NOT IN (SELECT i1_key, i2_nokey FROM t2); id sel_type table type pos_keys key ref rows fltr Extra 1 PRIMARY t1 ALL NULL NULL NULL 6 100 Using where 2 DEP SQ t2 idx_subq i1_key i1_key func 2 100 Using where; Full scan on NULL key Warnings: Note 1003 SELECT `i2_nokey`, `i1_nokey`, `i1_key`, `v` FROM `t1` WHERE ( NOT( <in_optimizer>((`t1`.`i1_nokey`,`t1`.`i2_nokey`), <exists>( <index_lookup>( <cache>(`t1`.`i1_nokey`) IN t2 on i1_key checking NULL where ( trigcond( ((<cache>(`t1`.`i1_nokey`) = `t2`.`i1_key`) or isnull(`t2`.`i1_key`))) and trigcond( ((<cache>(`t1`.`i2_nokey`) = `t2`.`i2_nokey`) or isnull(`t2`.`i2_nokey`))) ) having ( trigcond( <is_not_null_test>(`t2`.`i1_key`)) and trigcond( <is_not_null_test>(`t2`.`i2_nokey`))))))) )
[21 Oct 2009 14:01]
Jørgen Løland
In another record is inserted into t1 like this: # Data in t1 SELECT i2_nokey, i1_nokey, i1_key, v FROM t1 ORDER BY pk; i2_nokey i1_nokey i1_key v 1 NULL NULL a 2 3 3 b 3 5 5 c 4 NULL NULL b 4 0 0 b <----- 5 1 1 d 6 2 2 e The new record is correctly filtered out (not returned). It seems like the problem is only present for records that have a NULL value on the left hand side of IN.
[22 Oct 2009 9:35]
Jørgen Løland
Another strange observation: The bug is only repeatable if there is another record in t1 with a NULL value that qualifies for the result set: (1) Removing record [1,NULL,NULL,a] from t1 makes the incorrectly returned [4,NULL,NULL,b] record go away. (2) Inserting record [1,1] into t2 removes *both* the [1,NULL...] and [4,NULL...] records from the result of the query. In other words: Data in t1 SELECT i2_nokey, i1_nokey, i1_key, v FROM t1 ORDER BY i2_nokey, pk; i2_nokey i1_nokey i1_key v 1 NULL NULL a <----- (1) Delete this record 2 3 3 b 3 5 5 c 4 NULL NULL b 5 1 1 d or Data in subquery (should be filtered out) SELECT i2_nokey, i1_key FROM t2 ORDER BY i2_nokey; i2_nokey i1_key 0 NULL 1 1 <------ (2) Insert this record 3 NULL 3 8 4 NULL 5 0 6 2
[22 Oct 2009 12:02]
Jørgen Løland
Reduced the test to: Data in t1 SELECT i2_nokey, i1_nokey, i1_key, v FROM t1; i2_nokey i1_nokey i1_key v 1 NULL NULL a 2 3 3 b 4 NULL NULL b 4 0 0 b Data in subquery (should be filtered out) SELECT i2_nokey, i1_key FROM t2 ORDER BY i2_nokey; i2_nokey i1_key 4 NULL SELECT i2_nokey, i1_nokey, i1_key, v FROM t1 WHERE (i1_nokey, i2_nokey) NOT IN (SELECT i1_key, i2_nokey FROM t2); i2_nokey i1_nokey i1_key v 1 NULL NULL a 2 3 3 b 4 NULL NULL b <------ WRONG However, if we change the order records are *inserted* into t1, neither [1,NULL,...] or [4,NULL,...] appear in the result set: Data in t1 SELECT i2_nokey, i1_nokey, i1_key, v FROM t1; i2_nokey i1_nokey i1_key v 4 NULL NULL b 1 NULL NULL a 2 3 3 b 4 0 0 b Data in subquery (should be filtered out) SELECT i2_nokey, i1_key FROM t2 ORDER BY i2_nokey; i2_nokey i1_key 4 NULL Actual result SELECT i2_nokey, i1_nokey, i1_key, v FROM t1 WHERE (i1_nokey, i2_nokey) NOT IN (SELECT i1_key, i2_nokey FROM t2); i2_nokey i1_nokey i1_key v 2 3 3 b <----- WRONG - [1,NULL,...] should be here!
[26 Oct 2009 11:46]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/88133 3180 Jorgen Loland 2009-10-26 Bug#48177 - SELECTs with NOT IN subqueries containing NULL values return too many records WHERE clauses with "outer_value_list NOT IN subselect" were handled incorrectly if the outer value list contained multiple items where at least one of these could be NULL. The first outer record with NULL value was handled correctly, but if a second record with NULL value existed, the optimizer would choose to reuse the result it got on the last execution of the subselect. This is incorrect if the outer value list has multiple items. The fix is to make Item_in_optimizer::val_int (in item_cmpfunc.cc) reuse the result of the latest execution for NULL values only if all values in the outer_value_list are NULL. @ mysql-test/r/subselect3.result Added test for BUG#48177 @ mysql-test/t/subselect3.test Added test for BUG#48177 @ sql/item_cmpfunc.cc Make Item_in_optimizer::val_int (in item_cmpfunc.cc) reuse the result of the latest execution for NULL values only if all values in the outer_value_list are NULL.
[30 Oct 2009 12:10]
Øystein Grøvlen
Approved with minor comments.
[3 Nov 2009 12:48]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/89127 3162 Jorgen Loland 2009-11-03 Bug#48177 - SELECTs with NOT IN subqueries containing NULL values return too many records WHERE clauses with "outer_value_list NOT IN subselect" were handled incorrectly if the outer value list contained multiple items where at least one of these could be NULL. The first outer record with NULL value was handled correctly, but if a second record with NULL value existed, the optimizer would choose to reuse the result it got on the last execution of the subselect. This is incorrect if the outer value list has multiple items. The fix is to make Item_in_optimizer::val_int (in item_cmpfunc.cc) reuse the result of the latest execution for NULL values only if all values in the outer_value_list are NULL. @ mysql-test/r/subselect3.result Added test for BUG#48177 @ mysql-test/t/subselect3.test Added test for BUG#48177 @ sql/item_cmpfunc.cc Make Item_in_optimizer::val_int (in item_cmpfunc.cc) reuse the result of the latest execution for NULL values only if all values in the outer_value_list are NULL.
[4 Nov 2009 9:24]
Bugs System
Pushed into 5.1.41 (revid:joro@sun.com-20091104092152-qz96bzlf2o1japwc) (version source revid:kristofer.pettersson@sun.com-20091103162305-08l4gkeuif2ozsoj) (merge vers: 5.1.41) (pib:13)
[11 Nov 2009 6:49]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091110093407-rw5g8dys2baqkt67) (version source revid:alik@sun.com-20091109080109-7dxapd5y5pxlu08w) (merge vers: 6.0.14-alpha) (pib:13)
[11 Nov 2009 6:56]
Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091109115615-nuohp02h8mdrz8m2) (version source revid:alik@sun.com-20091105121316-hgdduu5vqdpbawf8) (merge vers: 5.5.0-beta) (pib:13)
[18 Nov 2009 16:42]
Paul DuBois
Noted in 5.1.41, 5.5.0, 6.0.14 changelogs. WHERE clauses with "outer_value_list NOT IN subquery" were handled incorrectly if the outer value list contained multiple items at least one of which could be NULL.
[18 Dec 2009 10:30]
Bugs System
Pushed into 5.1.41-ndb-7.1.0 (revid:jonas@mysql.com-20091218102229-64tk47xonu3dv6r6) (version source revid:jonas@mysql.com-20091218095730-26gwjidfsdw45dto) (merge vers: 5.1.41-ndb-7.1.0) (pib:15)
[18 Dec 2009 10:45]
Bugs System
Pushed into 5.1.41-ndb-6.2.19 (revid:jonas@mysql.com-20091218100224-vtzr0fahhsuhjsmt) (version source revid:jonas@mysql.com-20091217101452-qwzyaig50w74xmye) (merge vers: 5.1.41-ndb-6.2.19) (pib:15)
[18 Dec 2009 11:01]
Bugs System
Pushed into 5.1.41-ndb-6.3.31 (revid:jonas@mysql.com-20091218100616-75d9tek96o6ob6k0) (version source revid:jonas@mysql.com-20091217154335-290no45qdins5bwo) (merge vers: 5.1.41-ndb-6.3.31) (pib:15)
[18 Dec 2009 11:15]
Bugs System
Pushed into 5.1.41-ndb-7.0.11 (revid:jonas@mysql.com-20091218101303-ga32mrnr15jsa606) (version source revid:jonas@mysql.com-20091218064304-ezreonykd9f4kelk) (merge vers: 5.1.41-ndb-7.0.11) (pib:15)
[18 Dec 2009 13:18]
MC Brown
Already documented in 5.1.41