Bug #28337 NOT EXISTS with GROUP BY behaves different in 5.0.40 (regression)
Submitted: 9 May 2007 16:58 Modified: 27 May 2007 18:19
Reporter: Morgan Tocker Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.40 OS:Any
Assigned to: Igor Babaev CPU Architecture:Any
Tags: regression

[9 May 2007 16:58] Morgan Tocker
Description:
A patch for bug #27659 (introduced in 5.0.40) seems to have changed behavior for NOT EXISTS using a group by.

How to repeat:
create table tc_a (id double not null auto_increment, state char(2), 
  clientcode double, index(id), index(clientcode));

create table tc_b (id double, index(id)) engine=innodb;

# insert 512 rows

insert into tc_a VALUES (null, null, floor(rand()*100));
insert into tc_a SELECT null, null, floor(rand()*100) FROM tc_a;
insert into tc_a SELECT null, null, floor(rand()*100) FROM tc_a;
insert into tc_a SELECT null, null, floor(rand()*100) FROM tc_a;
insert into tc_a SELECT null, null, floor(rand()*100) FROM tc_a;
insert into tc_a SELECT null, null, floor(rand()*100) FROM tc_a;
insert into tc_a SELECT null, null, floor(rand()*100) FROM tc_a;
insert into tc_a SELECT null, null, floor(rand()*100) FROM tc_a;
insert into tc_a SELECT null, null, floor(rand()*100) FROM tc_a;
insert into tc_a SELECT null, null, floor(rand()*100) FROM tc_a;

# set a value for state

update tc_a set state='GA' where state is null limit 25;
update tc_a set state='FL' where state is null limit 25;
update tc_a set state='NC' where state is null limit 25;
update tc_a set state='CA' where state is null limit 25;
update tc_a set state='MI' where state is null limit 25;
update tc_a set state='ND' where state is null limit 25;
update tc_a set state='NY' where state is null;

# insert 100 random rows into b.
insert into tc_b select tc_a.id from tc_a order by rand() limit 100;

# I export the data at this point to testcase_data.sql

In MySQL 5.0.40 this returns:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.0.40-enterprise-gpl-log MySQL Enterprise Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> SELECT tc_a.id, '1' as 'type', state as 'wayda_state' FROM tc_a
    -> where state in ('GA','FL')
    -> AND NOT EXISTS(select 1 from tc_b WHERE tc_b.id=tc_a.id)
    -> group by tc_a.id;
+----+------+-------------+
| id | type | wayda_state |
+----+------+-------------+
|  1 | 1    | GA          | 
|  2 | 1    | GA          | 
+----+------+-------------+
2 rows in set (0.01 sec)

mysql> SELECT distinct tc_a.id, '1' as 'type', state as 'wayda_state' FROM tc_a
    -> where state in ('GA','FL')
    -> AND NOT EXISTS(select 1 from tc_b WHERE tc_b.id=tc_a.id)
    -> ;
+----+------+-------------+
| id | type | wayda_state |
+----+------+-------------+
|  1 | 1    | GA          | 
|  3 | 1    | GA          | 
|  4 | 1    | GA          | 
|  5 | 1    | GA          | 
|  6 | 1    | GA          | 
|  7 | 1    | GA          | 
|  8 | 1    | GA          | 
|  9 | 1    | GA          | 
| 10 | 1    | GA          | 
| 13 | 1    | GA          | 
| 14 | 1    | GA          | 
| 15 | 1    | GA          | 
| 19 | 1    | GA          | 
| 21 | 1    | GA          | 
| 22 | 1    | GA          | 
| 23 | 1    | GA          | 
| 24 | 1    | GA          | 
| 25 | 1    | GA          | 
| 26 | 1    | FL          | 
| 28 | 1    | FL          | 
| 29 | 1    | FL          | 
| 30 | 1    | FL          | 
| 31 | 1    | FL          | 
| 32 | 1    | FL          | 
| 33 | 1    | FL          | 
| 34 | 1    | FL          | 
| 37 | 1    | FL          | 
| 38 | 1    | FL          | 
| 40 | 1    | FL          | 
| 41 | 1    | FL          | 
| 42 | 1    | FL          | 
| 43 | 1    | FL          | 
| 45 | 1    | FL          | 
| 47 | 1    | FL          | 
| 48 | 1    | FL          | 
| 49 | 1    | FL          | 
| 50 | 1    | FL          | 
+----+------+-------------+
37 rows in set (0.00 sec)

But in MySQL 5.0.37 it returned 37 rows for both of the queries.
[9 May 2007 16:58] Morgan Tocker
Testcase data

Attachment: testcase_data.sql (application/octet-stream, text), 9.48 KiB.

[16 May 2007 6:39] Travers Carter
We have had problems with what appears to be the same bug in version 5.0.41 (on Linux x86_64), but only with InnoDB tables, if the table is created as MyISAM the problem doesn't occur.

This test case is the simplest we could isolate:

------------------- Test Case -----------------
USE test;

DROP TABLE IF EXISTS menu;

CREATE TABLE menu (
  id int
) ENGINE=InnoDB;

INSERT INTO menu (id) VALUES (1);

SELECT m1.*
  FROM menu m1
       JOIN menu m2
 WHERE EXISTS (SELECT * FROM menu m3 WHERE m1.id = m3.id)
 GROUP BY m1.id;

---------- MySQL 5.0.27 Gives ------------
+------+
| id   |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

----------- MySQL 5.0.41 Gives ------------
Empty set (0.00 sec)
[16 May 2007 23:35] Igor Babaev
This problem can be demonstrated with the following test case:

mysql> SELECT VERSION();
+--------------+
| VERSION()    |
+--------------+
| 5.0.42-debug |
+--------------+
1 row in set (0.00 sec)

mysql> CREATE TABLE t1 (id int NOT NULL, st CHAR(2), INDEX idx(id));
Query OK, 0 rows affected (0.07 sec)

mysql> INSERT INTO t1 VALUES
    ->   (3,'FL'), (2,'GA'), (4,'FL'), (1,'GA'), (5,'NY'), (7,'FL'), (6,'NY');
Query OK, 7 rows affected (0.00 sec)
Records: 7  Duplicates: 0  Warnings: 0

mysql> CREATE TABLE t2 (id int NOT NULL, INDEX idx(id));
Query OK, 0 rows affected (0.05 sec)

mysql> INSERT INTO t2 VALUES (7), (5), (1), (3);
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT id, st FROM t1
    ->   WHERE st IN ('GA','FL') AND EXISTS(SELECT 1 FROM t2 WHERE t2.id=t1.id);
+----+------+
| id | st   |
+----+------+
|  3 | FL   |
|  1 | GA   |
|  7 | FL   |
+----+------+
3 rows in set (0.00 sec)

mysql>
mysql> SELECT id, st FROM t1
    ->   WHERE st IN ('GA','FL') AND EXISTS(SELECT 1 FROM t2 WHERE t2.id=t1.id)
    ->     GROUP BY id;
Empty set (0.00 sec)

mysql> SELECT id, st FROM t1
    ->   WHERE st IN ('GA','FL') AND NOT EXISTS(SELECT 1 FROM t2 WHERE t2.id=t1.id);
+----+------+
| id | st   |
+----+------+
|  2 | GA   |
|  4 | FL   |
+----+------+
2 rows in set (0.00 sec)

mysql> SELECT id, st FROM t1
    ->   WHERE st IN ('GA','FL') AND NOT EXISTS(SELECT 1 FROM t2 WHERE t2.id=t1.id)
    ->     GROUP BY id;
+----+------+
| id | st   |
+----+------+
|  3 | FL   |
+----+------+
1 row in set (0.00 sec)
[17 May 2007 5:41] Igor Babaev
This problem appeared after the patch for bug #27321 had been applied. The problem is a result of a bug in this patch. It affects only versions 5.0.40 and 5.1.18. 
The bug has nothing to do with the patch for bug #27659.
[17 May 2007 8:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/26885

ChangeSet@1.2490, 2007-05-16 23:42:10-07:00, igor@olga.mysql.com +3 -0
  Fixed bug #28337: wrong results for grouping queries with correlated
  subqueries in WHERE conditions.
  This bug was introduced by the patch for bug 27321.
[17 May 2007 8:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/26888

ChangeSet@1.2490, 2007-05-16 23:00:28-07:00, igor@olga.mysql.com +3 -0
  Fixed bug #28337: wrong results for grouping queries with correlated
  subqueries in WHERE conditions.
  This bug was introduced by the patch for bug 27321.
[20 May 2007 17:11] Bugs System
Pushed into 5.0.44
[20 May 2007 17:13] Bugs System
Pushed into 5.1.19-beta
[27 May 2007 18:19] Paul DuBois
Noted in 5.0.40, 5.1.19 changelogs.

Grouping queries with correlated subqueries in WHERE conditions could
produce incorrect results.
[31 May 2007 6:04] Igor Babaev
Bug #28378 and bug #28526 are marked as duplicates of this bug.
[15 Dec 2007 7:30] Igor Babaev
Bug #30928 is marked as a duplicate of this bug.