| Bug #76029 | IN / NOT IN doesn't return correct result when on-disk temporary table is used | ||
|---|---|---|---|
| Submitted: | 24 Feb 2015 14:26 | Modified: | 24 Feb 2015 15:50 |
| Reporter: | Ed - | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
| Version: | 5.6.23 | OS: | Linux |
| Assigned to: | CPU Architecture: | Any | |
[24 Feb 2015 14:27]
Ed -
example tables
Attachment: tables.sql.tar.xz (application/x-xz, text), 127.67 KiB.
[24 Feb 2015 14:49]
Ed -
I should also add that increasing tmp_table_size and max_heap_table_size from the default 16M avoid the bug by avoiding on-disk temp tables
[24 Feb 2015 15:50]
MySQL Verification Team
Thank you for the bug report. Only 5.6 affected:
mysql 5.5 > SELECT *
-> FROM a
-> WHERE
-> supplier_code = 'test'
-> AND code = 'test'
-> AND uom = 'test'
-> AND (supplier_code, code, uom) NOT IN (
-> SELECT supplier_code, code, uom
-> FROM b
-> );
Empty set (0.00 sec)
mysql 5.5 > SHOW VARIABLES LIKE "%VERSION%";
+-------------------------+--------------------------------------+
| Variable_name | Value |
+-------------------------+--------------------------------------+
| innodb_version | 5.5.43 |
| protocol_version | 10 |
| slave_type_conversions | |
| version | 5.5.43-log |
| version_comment | Source distribution pull: 2015.feb20 |
| version_compile_machine | AMD64 |
| version_compile_os | Win64 |
+-------------------------+--------------------------------------+
7 rows in set (0.00 sec)
mysql 5.6 > SELECT *
-> FROM a
-> WHERE
-> supplier_code = 'test'
-> AND code = 'test'
-> AND uom = 'test'
-> AND (supplier_code, code, uom) NOT IN (
-> SELECT supplier_code, code, uom
-> FROM b
-> );
+---------------+------+------+
| supplier_code | code | uom |
+---------------+------+------+
| test | test | test |
+---------------+------+------+
1 row in set (0.13 sec)
mysql 5.6 > SHOW VARIABLES LIKE "%VERSION%";
+-------------------------+--------------------------------------+
| Variable_name | Value |
+-------------------------+--------------------------------------+
| innodb_version | 5.6.24 |
| protocol_version | 10 |
| slave_type_conversions | |
| version | 5.6.24 |
| version_comment | Source distribution pull: 2015.feb20 |
| version_compile_machine | x86_64 |
| version_compile_os | Win64 |
+-------------------------+--------------------------------------+
7 rows in set (0.00 sec)
mysql 5.7 > SELECT *
-> FROM a
-> WHERE
-> supplier_code = 'test'
-> AND code = 'test'
-> AND uom = 'test'
-> AND (supplier_code, code, uom) NOT IN (
-> SELECT supplier_code, code, uom
-> FROM b
-> );
Empty set (0.13 sec)
mysql 5.7 > SHOW VARIABLES LIKE "%VERSION%";
+-------------------------+--------------------------------------+
| Variable_name | Value |
+-------------------------+--------------------------------------+
| innodb_version | 5.7.7 |
| protocol_version | 10 |
| slave_type_conversions | |
| version | 5.7.7-rc |
| version_comment | Source distribution pull: 2015.feb20 |
| version_compile_machine | x86_64 |
| version_compile_os | Win64 |
+-------------------------+--------------------------------------+
7 rows in set (0.00 sec)

Description: NOT / IN subqueries don't return the correct result when materialization is used for the subquery and an on-disk temporary table is used. An older 5.6.10 server returns the correct result but executes the subquery as a dependent subquery. How to repeat: Create table a and b with the supplied script. SELECT * FROM a WHERE supplier_code = 'test' AND code = 'test' AND uom = 'test'; returns 1 record SELECT * FROM b WHERE supplier_code = 'test' AND code = 'test' AND uom = 'test'; returns 1 record SELECT * FROM a WHERE supplier_code = 'test' AND code = 'test' AND uom = 'test' AND (supplier_code, code, uom) NOT IN ( SELECT supplier_code, code, uom FROM b ); returns 1 record when it should return none. Changing NOT IN to IN returns 0 records. If the length of the columns in table b are reduced so their total length is <= 752 characters then the results are correct. This appears to be because the server is executing the query using an in-memory temp table instead of on-disk. If the number of records in table B is reduced enough then the result is also correct.