Bug #77006 5.6 incompatible change in DELETE/SELECT ... WHERE NOT IN (subquery)
Submitted: 11 May 2015 21:03 Modified: 22 Dec 2015 10:12
Reporter: Sveta Smirnova (OCA) Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.6.24, 5.6.26, 5.6.10, 5.7.8 OS:Any
Assigned to: CPU Architecture:Any
Tags: regression

[11 May 2015 21:03] Sveta Smirnova
Description:
MySQL 5.6 handles subquery differently than 5.5. In 5.5 it returns rows and in 5.6 it does not for SELECT query and deletes/does not delete for DELETE query.

This is not same as bug #75967, because SET SESSION optimizer_switch = "materialization=off"; does not help

Version 5.5:

SELECT * FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);
EmailAddress    ContactId       Id
someaddress@example.com 993     3363

Version 5.6:

SELECT * FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);
EmailAddress    ContactId       Id

Workaround:

Add one more subquery.

SELECT * FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress) b);

DELETE FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress) b);

How to repeat:
create database if not exists percona;
use percona;

drop table if exists ContactEmail;

CREATE TABLE `ContactEmail` (
  `EmailAddress` varchar(255) NOT NULL DEFAULT '',
  `ContactId` int(11) NOT NULL DEFAULT '0',
  `Id` int(10) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`Id`),
  KEY `EmailAddress` (`EmailAddress`),
  KEY `ContactId` (`ContactId`)
);

insert into ContactEmail values ('someaddress@example.com',993,551);
insert into ContactEmail values ('someaddress@example.com',993,3363);

select "before delete:";
select * from ContactEmail;

SELECT * FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);

DELETE FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);

select "after delete:";
select * from ContactEmail;

drop table ContactEmail;
[12 May 2015 8:52] MySQL Verification Team
Hello Sveta,

Thank you for the report and test case.
Observed this with 5.6/5.7 latest builds.

Thanks,
Umesh
[12 May 2015 8:52] MySQL Verification Team
// 5.1.75

mysql> show variables like '%version%';
+-------------------------+---------------------+
| Variable_name           | Value               |
+-------------------------+---------------------+
| protocol_version        | 10                  |
| version                 | 5.1.75              |
| version_comment         | Source distribution |
| version_compile_machine | x86_64              |
| version_compile_os      | unknown-linux-gnu   |
+-------------------------+---------------------+
5 rows in set (0.00 sec)

mysql> create database if not exists percona;
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> use percona;
Database changed
mysql> drop table if exists ContactEmail;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql>
mysql> CREATE TABLE `ContactEmail` (
    ->   `EmailAddress` varchar(255) NOT NULL DEFAULT '',
    ->   `ContactId` int(11) NOT NULL DEFAULT '0',
    ->   `Id` int(10) NOT NULL AUTO_INCREMENT,
    ->   PRIMARY KEY (`Id`),
    ->   KEY `EmailAddress` (`EmailAddress`),
    ->   KEY `ContactId` (`ContactId`)
    -> );
Query OK, 0 rows affected (0.00 sec)

mysql> insert into ContactEmail values ('someaddress@example.com',993,551);
Query OK, 1 row affected (0.00 sec)

mysql> insert into ContactEmail values ('someaddress@example.com',993,3363);
Query OK, 1 row affected (0.00 sec)

mysql> select "before delete:";
+----------------+
| before delete: |
+----------------+
| before delete: |
+----------------+
1 row in set (0.00 sec)

mysql> select * from ContactEmail;
+-------------------------+-----------+------+
| EmailAddress            | ContactId | Id   |
+-------------------------+-----------+------+
| someaddress@example.com |       993 |  551 |
| someaddress@example.com |       993 | 3363 |
+-------------------------+-----------+------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);
+-------------------------+-----------+------+
| EmailAddress            | ContactId | Id   |
+-------------------------+-----------+------+
| someaddress@example.com |       993 | 3363 |
+-------------------------+-----------+------+
1 row in set (0.00 sec)

mysql> DELETE FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);
Query OK, 1 row affected (0.00 sec)

mysql> select "after delete:";
+---------------+
| after delete: |
+---------------+
| after delete: |
+---------------+
1 row in set (0.00 sec)

mysql> select * from ContactEmail;
+-------------------------+-----------+-----+
| EmailAddress            | ContactId | Id  |
+-------------------------+-----------+-----+
| someaddress@example.com |       993 | 551 |
+-------------------------+-----------+-----+
1 row in set (0.00 sec)

// 5.5.45

mysql> show variables like '%version%';
+-------------------------+---------------------------------------------------------+
| Variable_name           | Value                                                   |
+-------------------------+---------------------------------------------------------+
| innodb_version          | 5.5.45                                                  |
| protocol_version        | 10                                                      |
| slave_type_conversions  |                                                         |
| version                 | 5.5.45-enterprise-commercial-advanced                   |
| version_comment         | MySQL Enterprise Server - Advanced Edition (Commercial) |
| version_compile_machine | x86_64                                                  |
| version_compile_os      | linux2.6                                                |
+-------------------------+---------------------------------------------------------+
7 rows in set (0.00 sec)

mysql> create database if not exists percona;
Query OK, 1 row affected (0.00 sec)

mysql> use percona;
Database changed
mysql>
mysql> drop table if exists ContactEmail;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql>
mysql> CREATE TABLE `ContactEmail` (
    ->   `EmailAddress` varchar(255) NOT NULL DEFAULT '',
    ->   `ContactId` int(11) NOT NULL DEFAULT '0',
    ->   `Id` int(10) NOT NULL AUTO_INCREMENT,
    ->   PRIMARY KEY (`Id`),
    ->   KEY `EmailAddress` (`EmailAddress`),
    ->   KEY `ContactId` (`ContactId`)
    -> );
Query OK, 0 rows affected (0.00 sec)

mysql> insert into ContactEmail values ('someaddress@example.com',993,551);
Query OK, 1 row affected (0.00 sec)

mysql> insert into ContactEmail values ('someaddress@example.com',993,3363);
Query OK, 1 row affected (0.00 sec)

mysql> select "before delete:";
+----------------+
| before delete: |
+----------------+
| before delete: |
+----------------+
1 row in set (0.00 sec)

mysql> select * from ContactEmail;
+-------------------------+-----------+------+
| EmailAddress            | ContactId | Id   |
+-------------------------+-----------+------+
| someaddress@example.com |       993 |  551 |
| someaddress@example.com |       993 | 3363 |
+-------------------------+-----------+------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);
+-------------------------+-----------+------+
| EmailAddress            | ContactId | Id   |
+-------------------------+-----------+------+
| someaddress@example.com |       993 | 3363 |
+-------------------------+-----------+------+
1 row in set (0.00 sec)

mysql> DELETE FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);
Query OK, 1 row affected (0.01 sec)

mysql> select "after delete:";
+---------------+
| after delete: |
+---------------+
| after delete: |
+---------------+
1 row in set (0.00 sec)

mysql> select * from ContactEmail;
+-------------------------+-----------+-----+
| EmailAddress            | ContactId | Id  |
+-------------------------+-----------+-----+
| someaddress@example.com |       993 | 551 |
+-------------------------+-----------+-----+
1 row in set (0.00 sec)
[12 May 2015 8:53] MySQL Verification Team
// 5.6.26 (lowest version checked 5.6.10)

mysql> show variables like '%version%';
+-------------------------+---------------------------------------------------------+
| Variable_name           | Value                                                   |
+-------------------------+---------------------------------------------------------+
| innodb_version          | 5.6.26                                                  |
| protocol_version        | 10                                                      |
| slave_type_conversions  |                                                         |
| version                 | 5.6.26-enterprise-commercial-advanced                   |
| version_comment         | MySQL Enterprise Server - Advanced Edition (Commercial) |
| version_compile_machine | x86_64                                                  |
| version_compile_os      | linux-glibc2.5                                          |
+-------------------------+---------------------------------------------------------+
7 rows in set (0.00 sec)

mysql> create database if not exists percona;
Query OK, 1 row affected (0.00 sec)

mysql> use percona;
Database changed
mysql> drop table if exists ContactEmail;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql>
mysql> CREATE TABLE `ContactEmail` (
    ->   `EmailAddress` varchar(255) NOT NULL DEFAULT '',
    ->   `ContactId` int(11) NOT NULL DEFAULT '0',
    ->   `Id` int(10) NOT NULL AUTO_INCREMENT,
    ->   PRIMARY KEY (`Id`),
    ->   KEY `EmailAddress` (`EmailAddress`),
    ->   KEY `ContactId` (`ContactId`)
    -> );
Query OK, 0 rows affected (0.01 sec)

mysql> insert into ContactEmail values ('someaddress@example.com',993,551);
Query OK, 1 row affected (0.00 sec)

mysql> insert into ContactEmail values ('someaddress@example.com',993,3363);
Query OK, 1 row affected (0.00 sec)

mysql> select "before delete:";
+----------------+
| before delete: |
+----------------+
| before delete: |
+----------------+
1 row in set (0.00 sec)

mysql> select * from ContactEmail;
+-------------------------+-----------+------+
| EmailAddress            | ContactId | Id   |
+-------------------------+-----------+------+
| someaddress@example.com |       993 |  551 |
| someaddress@example.com |       993 | 3363 |
+-------------------------+-----------+------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);
Empty set (0.00 sec)

mysql> SET SESSION optimizer_switch = "materialization=off";
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);
Empty set (0.00 sec)

mysql> DELETE FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);
Query OK, 0 rows affected (0.00 sec)

mysql> select "after delete:";
+---------------+
| after delete: |
+---------------+
| after delete: |
+---------------+
1 row in set (0.00 sec)

mysql> select * from ContactEmail;
+-------------------------+-----------+------+
| EmailAddress            | ContactId | Id   |
+-------------------------+-----------+------+
| someaddress@example.com |       993 |  551 |
| someaddress@example.com |       993 | 3363 |
+-------------------------+-----------+------+
2 rows in set (0.00 sec)

// 5.7.8

mysql> show variables like '%version%';
+-------------------------+---------------------------------------------------------+
| Variable_name           | Value                                                   |
+-------------------------+---------------------------------------------------------+
| innodb_version          | 5.7.8                                                   |
| protocol_version        | 10                                                      |
| slave_type_conversions  |                                                         |
| version                 | 5.7.8-rc-enterprise-commercial-advanced                 |
| version_comment         | MySQL Enterprise Server - Advanced Edition (Commercial) |
| version_compile_machine | x86_64                                                  |
| version_compile_os      | linux-glibc2.5                                          |
+-------------------------+---------------------------------------------------------+
7 rows in set (0.00 sec)

mysql> create database if not exists percona;
Query OK, 1 row affected (0.00 sec)

mysql> use percona;
Database changed
mysql> drop table if exists ContactEmail;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql>
mysql> CREATE TABLE `ContactEmail` (
    ->   `EmailAddress` varchar(255) NOT NULL DEFAULT '',
    ->   `ContactId` int(11) NOT NULL DEFAULT '0',
    ->   `Id` int(10) NOT NULL AUTO_INCREMENT,
    ->   PRIMARY KEY (`Id`),
    ->   KEY `EmailAddress` (`EmailAddress`),
    ->   KEY `ContactId` (`ContactId`)
    -> );
Query OK, 0 rows affected (0.00 sec)

mysql> insert into ContactEmail values ('someaddress@example.com',993,551);
Query OK, 1 row affected (0.01 sec)

mysql> insert into ContactEmail values ('someaddress@example.com',993,3363);
Query OK, 1 row affected (0.00 sec)

mysql> select "before delete:";
+----------------+
| before delete: |
+----------------+
| before delete: |
+----------------+
1 row in set (0.00 sec)

mysql> select * from ContactEmail;
+-------------------------+-----------+------+
| EmailAddress            | ContactId | Id   |
+-------------------------+-----------+------+
| someaddress@example.com |       993 |  551 |
| someaddress@example.com |       993 | 3363 |
+-------------------------+-----------+------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);
Empty set (0.00 sec)

mysql> DELETE FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);
ERROR 1093 (HY000): You can't specify target table 'ContactEmail' for update in FROM clause
mysql> select "after delete:";
+---------------+
| after delete: |
+---------------+
| after delete: |
+---------------+
1 row in set (0.00 sec)

mysql> select * from ContactEmail;
+-------------------------+-----------+------+
| EmailAddress            | ContactId | Id   |
+-------------------------+-----------+------+
| someaddress@example.com |       993 |  551 |
| someaddress@example.com |       993 | 3363 |
+-------------------------+-----------+------+
2 rows in set (0.00 sec)

mysql> SET SESSION optimizer_switch = "materialization=off";
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);
Empty set (0.00 sec)

mysql> DELETE FROM ContactEmail WHERE ContactId = 993 AND Id NOT IN (SELECT Id FROM (SELECT * FROM ContactEmail WHERE ContactId = 993 ORDER BY Id) a GROUP BY EmailAddress);
ERROR 1093 (HY000): You can't specify target table 'ContactEmail' for update in FROM clause
mysql>
[12 May 2015 12:23] Sveta Smirnova
Originally reported at https://bugs.launchpad.net/bugs/1453337
[22 Dec 2015 6:36] Sreeharsha Ramanavarapu
For the sake of simplicity, I will be dealing with the following query.
It is similar to the one in the bug page. NOT IN has been replaced by an IN
and I also got rid of the derived table.

CREATE TABLE `ContactEmail` (
  `EmailAddress` varchar(255) NOT NULL DEFAULT '',
  `ContactId` int(11) NOT NULL DEFAULT '0',
  `Id` int(10) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`Id`),
  KEY `EmailAddress` (`EmailAddress`),
  KEY `ContactId` (`ContactId`)
);

insert into ContactEmail values ('someaddress@example.com',993,551),
('someaddress@example.com',993,3363);

In 5.5:
mysql> SELECT *
    -> FROM ContactEmail
    -> WHERE ContactId = 993 AND
    ->       Id IN (SELECT Id
    ->              FROM ContactEmail
    ->              WHERE ContactId = 993
    ->              GROUP BY EmailAddress
    ->             );
+-------------------------+-----------+-----+
| EmailAddress            | ContactId | Id  |
+-------------------------+-----------+-----+
| someaddress@example.com |       993 | 551 |
+-------------------------+-----------+-----+
1 row in set (0.01 sec)

In 5.6+ (with full group_by mode turned off):
mysql> SELECT *
    -> FROM ContactEmail
    -> WHERE ContactId = 993 AND
    ->       Id IN (SELECT Id
    ->              FROM ContactEmail
    ->              WHERE ContactId = 993
    ->              GROUP BY EmailAddress
    ->             );
+-------------------------+-----------+------+
| EmailAddress            | ContactId | Id   |
+-------------------------+-----------+------+
| someaddress@example.com |       993 |  551 |
| someaddress@example.com |       993 | 3363 |
+-------------------------+-----------+------+
2 rows in set (0.00 sec)

A few things to notice:
1) Notice that in full group_by mode, the above query fails with the error :
"ERROR 1055 (42000): 'test.ContactEmail.Id' isn't in GROUP BY"

2) The extra row in 5.6+ is due to two reasons:
   a) Before converting the IN subquery to a semijoin, the inner query's
GROUP BY is eliminated since it is irrelevant (see WL#5953). So both rows are
delivered in 5.6.
   b) If we only use the inner query, without the group by:
      SELECT Id FROM ContactEmail WHERE ContactId = 993 ;
       +------+
       | Id   |
       +------+
       |  551 |
       | 3363 |
       +------+
     2 rows in set (0.00 sec)

There is no reason for group by to pick 551 over 3363. See the following:

http://dev.mysql.com/doc/refman/5.6/en/group-by-handling.html
"this is useful primarily when all values in each nonaggregated column not
named in the GROUP BY are the same for each group. The server is free to
choose any value from each group, so unless they are the same, the values
chosen are indeterminate. "

There are two approaches the user can take in 5.5:
1) Use 'ONLY_FULL_GROUP_BY' to reject such queries.
2) Modify the query to the following. Adding a min/max will make sure that
only one row (deterministic) will emerge:

SELECT *
FROM ContactEmail
WHERE ContactId = 993 AND
      Id IN (SELECT MIN(Id)
             FROM ContactEmail
             WHERE ContactId = 993
             GROUP BY EmailAddress
            );

In 5.7 they can also use any_value.

A similar issue was pointed out by Guilhem :
http://clustra.no.oracle.com/orabugs/bug.php?id=16272375.

This is not a bug.
[22 Dec 2015 10:12] Sveta Smirnova
Thank you for the great explanation, Sreeharsha!

Just one note: http://clustra.no.oracle.com/ is not accessible by people who don't have access to Oracle's internal servers.
[22 Dec 2015 10:40] MySQL Verification Team
@Sveta - External bug of Bug #16272375 is Bug #68254

Thanks,
Umesh