Bug #79591 select distinct not returning distinct rows in 5.7...
Submitted: 10 Dec 2015 11:14 Modified: 9 May 2016 14:32
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.7.9 OS:Any
Assigned to: CPU Architecture:Any
Tags: regression

[10 Dec 2015 11:14] Shane Bester
Description:
Resultset differs between 5.6 and 5.7:

5.5.47, 5.6.28:
---------------
+-----------+------+------+
| f         | h    | v    |
+-----------+------+------+
| duplicate | NULL | NULL |
+-----------+------+------+
1 row in set (0.00 sec)

5.7.9, 5.8.0:
-------------
+-----------+------+------+
| f         | h    | v    |
+-----------+------+------+
| duplicate | NULL | NULL |
| duplicate | NULL | NULL |
+-----------+------+------+
2 rows in set (0.01 sec)

How to repeat:
set sql_mode='';
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8;
create table t1 (a int,b int,c int,d int,e int,f text) engine=innodb default charset=latin1;
create table t2 (g int,h char(2) default 'a') engine=innodb default charset=latin1;
create table t3 (i int) engine=innodb;
create table t4 (j int,  k int) engine=innodb default charset=latin1;
create table t5 (m int) engine=innodb;
create table t6 (n int,p int,q int) engine=innodb;
create table t7 (r int,s int) engine=innodb;
create table t8 (t int,v int) engine=innodb;
insert into t1 values (1,1,0,0,1, 'duplicate');
insert into t2 values (1,null);
insert into t3 values (1);
insert into t4 values(0,1),(1,1);
insert into t6 values (1,0,1);
insert into t7 values (1,0);
select version();
select distinct t1.f,t2.h,t8.v
from t1 inner join t2 on t1.b=t2.g
inner join t3 on t1.a=t3.i
inner join ( t6 inner join t7 on t6.q=t7.r and t6.p=t7.s ) on t1.e=t6.n
left join  ( t4 left join  t5 on t4.j=t5.m) on  t4.k='1'
left join t8 on t1.c=t8.t where t1.d=0 ;
[10 Dec 2015 12:03] Øystein Grøvlen
Posted by developer:
 
Query plan is the same in both 5.6 and 5.7:

MySQL 5.6.25:

mysql> explain select distinct t1.f,t2.h,t8.v from t1 inner join t2 on t1.b=t2.g inner join t3 on t1.a=t3.i inner join ( t6 inner join t7 on t6.q=t7.r and t6.p=t7.s ) on t1.e=t6.n left join  ( t4 left join  t5 on t4.j=t5.m) on  t4.k='1' left join t8 on t1.c=t8.t where t1.d=0;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                              |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |    1 | Using where; Using temporary                       |
|  1 | SIMPLE      | t2    | ALL  | NULL          | NULL | NULL    | NULL |    1 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | t3    | ALL  | NULL          | NULL | NULL    | NULL |    1 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | t6    | ALL  | NULL          | NULL | NULL    | NULL |    1 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | t7    | ALL  | NULL          | NULL | NULL    | NULL |    1 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | t4    | ALL  | NULL          | NULL | NULL    | NULL |    2 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | t5    | ALL  | NULL          | NULL | NULL    | NULL |    1 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | t8    | ALL  | NULL          | NULL | NULL    | NULL |    1 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
8 rows in set (0,00 sec)

MySQL 5.7.9:

mysql> explain select distinct t1.f,t2.h,t8.v from t1 inner join t2 on t1.b=t2.g inner join t3 on t1.a=t3.i inner join ( t6 inner join t7 on t6.q=t7.r and t6.p=t7.s ) on t1.e=t6.n left join  ( t4 left join  t5 on t4.j=t5.m) on  t4.k='1' left join t8 on t1.c=t8.t where t1.d=0;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                              |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | Using where; Using temporary                       |
|  1 | SIMPLE      | t2    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | t3    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | t6    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | t7    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | t4    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | t5    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | t8    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
8 rows in set, 1 warning (0,00 sec)
[10 Dec 2015 13:31] Øystein Grøvlen
Posted by developer:
 
If I use MYISAM for disk-based temporary tables, the result is correct in 5.7.9:

mysql> set global internal_tmp_disk_storage_engine=MYISAM;
Query OK, 0 rows affected (0,00 sec)

mysql> select distinct t1.f,t2.h,t8.v from t1 inner join t2 on t1.b=t2.g inner join t3 on t1.a=t3.i inner join ( t6 inner join t7 on t6.q=t7.r and t6.p=t7.s ) on t1.e=t6.n left join  ( t4 left join  t5 on t4.j=t5.m) on  t4.k='1' left join t8 on t1.c=t8.t where t1.d=0;
+-----------+------+------+
| f         | h    | v    |
+-----------+------+------+
| duplicate | NULL | NULL |
+-----------+------+------+
1 row in set (0,01 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.9     |
+-----------+
1 row in set (0,00 sec)
[9 May 2016 14:32] Paul DuBois
Posted by developer:
 
Noted in 5.7.13 changelog.

A query could return incorrect results under these conditions: A
column with a default value contained NULL; SELECT DISTINCT or a
GROUP BY clause was used and the column containing the NULL value was
part of the select list; an InnoDB temporary table was used during
query processing.
[4 Aug 2017 13:00] Zoltán Szalai
I run into a similar problem, wich is present in 5.7.11, and seems to be fixed in 5.7.13. The number of returned columns depended on the server version.
The difference is, that  "set global internal_tmp_disk_storage_engine=MYISAM;" did not help in my case. No columns with default values were in the select.

I do not report it, and do not provide a test case, because it is already fixed.

BTW, my query was: (after simplifications for testing, all selects and joins on integers)

SELECT  attendreg.idattendreg, attendreg.workerid, attendreg.hidref  
FROM shifts AS shifts_1 INNER JOIN attendreg ON shifts_1.idworker = attendreg.workerid

LEFT JOIN 

((teaching INNER JOIN shifts ON teaching.targetid = shifts.groupid) INNER JOIN ((docdistro INNER JOIN docchange ON docdistro.DID = docchange.DID) 
INNER JOIN linecfg ON docdistro.LID = linecfg.cfgID) ON (shifts.lineid = linecfg.LID) AND (teaching.dokid = docchange.DID)) 

ON (shifts.idworker = attendreg.workerid AND docchange.HID = attendreg.hidref) 

 where ((shifts.idworker is null);