MySQL Bugs: #29911: DISTINCT adds a result to what should be an empty result set

Bug #29911	DISTINCT adds a result to what should be an empty result set
Submitted:	19 Jul 2007 19:39	Modified:	27 Jul 2007 5:13
Reporter:	Charles Lindsay	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: General	Severity:	S2 (Serious)
Version:	5.0.45/4.1/5.1	OS:	Linux
Assigned to:	Igor Babaev	CPU Architecture:	Any
Tags:	distinct

Description:
Under the right conditions, using a fairly complicated SELECT DISTINCT query returns one row where the exact same query, minus DISTINCT, gives an empty result set.  I'm fairly sure that's incorrect behavior for DISTINCT.

The steps to reproduce below are fairly complicated, but it's as simple as I could get it and still show off the bug.  Due to the timestamp checking, this exact test will only show the bug until 2030--so you'd better hurry! =)

I've confirmed this in 5.0.27-Debian_0.dotdeb.1-log, 5.0.41, and 5.0.45 on Linux boxes only.

How to repeat:
create table a (a_id int auto_increment primary key);
create table b (b_id int auto_increment primary key, a_id int, time timestamp null);
create table c (b_id int, flag tinyint);

insert into a (a_id) values (1);
insert into b (b_id, a_id, time) values (1, 1, null),
   (2, 1, '2006-01-01 00:00:00'),
   (3, 1, '2030-01-01 00:00:00');
insert into c (b_id, flag) values (2, 1);

-- this fairly complicated query correctly returns an empty set
select a.a_id
from a
inner join b using (a_id)
left join c using (b_id)
where b.time <= NOW()
group by a.a_id, b.b_id
having ifnull(max(c.flag), 0) = 0;
-- Empty set (0.00 sec)

-- same exact query, but with distinct
select distinct a.a_id
from a
inner join b using (a_id)
left join c using (b_id)
where b.time <= NOW()
group by a.a_id, b.b_id
having ifnull(max(c.flag), 0) = 0;
-- +------+
-- | a_id |
-- +------+
-- |    1 | 
-- +------+
-- 1 row in set (0.00 sec)

Thank you for the bug report. Verified as described.

Here's the patch for the fix:

# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
#   2007/07/20 22:56:19-07:00 igor@olga.mysql.com 
#   Fixed bug #29911.
#   This bug manifested itself for join queries with GROUP BY and HAVING clauses
#   whose SELECT lists contained DISTINCT. It occurred when the optimizer could
#   deduce that the result set would have not more than one row.
#   The bug could lead to wrong result sets for queries of this type because
#   HAVING conditions were erroneously ignored in some cases in the function
#   remove_duplicates.   
# 
# mysql-test/r/having.result
#   2007/07/20 22:55:59-07:00 igor@olga.mysql.com +19 -0
#   Added a test case for bug #29911.
# 
# mysql-test/t/having.test
#   2007/07/20 22:55:59-07:00 igor@olga.mysql.com +26 -0
#   Added a test case for bug #29911.
# 
# sql/sql_select.cc
#   2007/07/20 22:55:59-07:00 igor@olga.mysql.com +1 -1
#   Fixed bug #29911.
#   This bug manifested itself for join queries with GROUP BY and HAVING clauses
#   whose SELECT lists contained DISTINCT. It occurred when the optimizer could
#   deduce that the result set would have not more than one row.
#   The bug could lead to wrong result sets for queries of this type because
#   HAVING conditions were erroneously ignored in some cases in the function
#   remove_duplicates.   
# 
diff -Nru a/mysql-test/r/having.result b/mysql-test/r/having.result
--- a/mysql-test/r/having.result	2007-07-21 11:05:13 -07:00
+++ b/mysql-test/r/having.result	2007-07-21 11:05:13 -07:00
@@ -158,3 +158,22 @@
 id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
 1	SIMPLE	NULL	NULL	NULL	NULL	NULL	NULL	NULL	Impossible HAVING
 DROP table t1;
+CREATE TABLE t1 (a int PRIMARY KEY);
+CREATE TABLE t2 (b int PRIMARY KEY, a int);
+CREATE TABLE t3 (b int, flag int);
+INSERT INTO t1 VALUES (1);
+INSERT INTO t2 VALUES (1,1), (2,1), (3,1);
+INSERT INTO t3(b,flag) VALUES (2, 1);
+SELECT t1.a
+FROM t1 INNER JOIN t2 ON t1.a=t2.a LEFT JOIN t3 ON t2.b=t3.b
+GROUP BY t1.a, t2.b HAVING MAX(t3.flag)=0;
+a
+SELECT DISTINCT t1.a, MAX(t3.flag)
+FROM t1 INNER JOIN t2 ON t1.a=t2.a LEFT JOIN t3 ON t2.b=t3.b
+GROUP BY t1.a, t2.b HAVING MAX(t3.flag)=0;
+a	MAX(t3.flag)
+SELECT DISTINCT t1.a
+FROM t1 INNER JOIN t2 ON t1.a=t2.a LEFT JOIN t3 ON t2.b=t3.b
+GROUP BY t1.a, t2.b HAVING MAX(t3.flag)=0;
+a
+DROP TABLE t1,t2,t3;
diff -Nru a/mysql-test/t/having.test b/mysql-test/t/having.test
--- a/mysql-test/t/having.test	2007-07-21 11:05:13 -07:00
+++ b/mysql-test/t/having.test	2007-07-21 11:05:13 -07:00
@@ -151,4 +151,30 @@
 
 DROP table t1;  
 
+#
+# Bug #29911: HAVING clause depending on constant table and evaluated to false
+#
+
+CREATE TABLE t1 (a int PRIMARY KEY);
+CREATE TABLE t2 (b int PRIMARY KEY, a int);
+CREATE TABLE t3 (b int, flag int);
+
+INSERT INTO t1 VALUES (1);
+INSERT INTO t2 VALUES (1,1), (2,1), (3,1);
+INSERT INTO t3(b,flag) VALUES (2, 1);
+
+SELECT t1.a
+  FROM t1 INNER JOIN t2 ON t1.a=t2.a LEFT JOIN t3 ON t2.b=t3.b
+    GROUP BY t1.a, t2.b HAVING MAX(t3.flag)=0;
+
+SELECT DISTINCT t1.a, MAX(t3.flag)
+  FROM t1 INNER JOIN t2 ON t1.a=t2.a LEFT JOIN t3 ON t2.b=t3.b
+    GROUP BY t1.a, t2.b HAVING MAX(t3.flag)=0;
+
+SELECT DISTINCT t1.a
+  FROM t1 INNER JOIN t2 ON t1.a=t2.a LEFT JOIN t3 ON t2.b=t3.b
+    GROUP BY t1.a, t2.b HAVING MAX(t3.flag)=0;
+
+DROP TABLE t1,t2,t3;
+
 # End of 4.1 tests
diff -Nru a/sql/sql_select.cc b/sql/sql_select.cc
--- a/sql/sql_select.cc	2007-07-21 11:05:13 -07:00
+++ b/sql/sql_select.cc	2007-07-21 11:05:13 -07:00
@@ -8118,7 +8118,7 @@
       field_count++;
   }
 
-  if (!field_count && !(join->select_options & OPTION_FOUND_ROWS)) 
+  if (!field_count && !(join->select_options & OPTION_FOUND_ROWS) && !having) 
   {                    // only const items with no OPTION_FOUND_ROWS
     join->unit->select_limit_cnt= 1;		// Only send first row
     DBUG_RETURN(0);

Pushed into 5.1.21-beta

Pushed into 5.0.48

Pushed into 4.1.24

Noted in 4.1.24, 5.0.48, 5.1.21 changelogs.

Adding DISTINCT could cause incorrect rows to appear in a query
result.