Bug #69521 incorrect result of left join with additional condition
Submitted: 20 Jun 2013 8:57 Modified: 26 Jun 2013 21:32
Reporter: martin fuxa Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: DML Severity:S2 (Serious)
Version:5.6 OS:Any
Assigned to: CPU Architecture:Any

[20 Jun 2013 8:57] martin fuxa
Description:
in dataset we are making 1:1 relation, in each table are exactly one row (see check below). With additional[s] condition (in left join) are always returned NULL from left joined table.
Without are returned correct values.
With inner join works fine.
On 5.5[.28] server works fine too! => behavior was changed between 5.5 and 5.6.

Tested on 5.6.12 client and server.

Maybe I'm wrong, but I can't find anything related in changelog OR join documentation.

check 1:1 relation
1. table tab2
select * from tab2 where ico='00001350';
+----------+
| ico      |
+----------+
| 00001350 |
+----------+
1 row in set (0.00 sec)

2. table tab3
select * from tab3 where ico='00001350';
+----------+----------+
| ico      | login    |
+----------+----------+
| 00001350 | JO013982 |
+----------+----------+
1 row in set (0.00 sec)

How to repeat:
-- import dataset
mysql test -B < test.sql

lets make join
-- pager, we are looking for someone row-id for simplification
pager grep 00001350;

select tab2.ico, tab3.login from tab2 left join tab3 on tab2.ico = tab3.ico group by tab2.ico;
-- inner join variant works fine
select tab2.ico, tab3.login from tab2 inner join tab3 on tab2.ico = tab3.ico and tab3.ico='00001350' group by tab2.ico;
-- and group by left joined table tab3 works fine
select tab2.ico, tab3.login from tab2 left join tab3 on tab2.ico = tab3.ico and tab3.ico='00001350' group by tab3.ico;
-- condition on another column works fine
select tab2.ico, tab3.login from tab2 left join tab3 on tab2.ico = tab3.ico and tab3.login='JO013982' group by tab2.ico;
| 00001350   | JO013982 |
39143 rows in set (0.00 sec)
works fine

select tab2.ico, tab3.login from tab2 left join tab3 on tab2.ico = tab3.ico and tab3.ico='00001350' group by tab2.ico;
| 00001350   | NULL  |
39143 rows in set (0.00 sec)
doesnt work!

at 5.5 server works fine
select tab2.ico, tab3.login from tab2 left join tab3 on tab2.ico = tab3.ico and tab3.ico='00001350' group by tab2.ico;
| 00001350   | JO013982 |
39143 rows in set (0.26 sec)
[20 Jun 2013 13:47] MySQL Verification Team
Hello Martin,

Thank you for the report.
Imho - this is not a bug. In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. For example, this query is illegal in standard SQL because the name column in the select list does not appear in the GROUP BY: 

select tab2.ico
      ,tab3.login 
 from tab2 left join tab3 on tab2.ico = tab3.ico and tab3.ico='00001350' group by tab2.ico;
 
 For the query to be legal, the tab3.login column must be omitted from the select list or named in the GROUP BY clause.
 
 MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. 
 
 e.g
 
 mysql> pager grep 00001350;
 PAGER set to 'grep 00001350'
 mysql> select tab2.ico, tab3.login from tab2 left join tab3 on tab2.ico = tab3.ico and tab3.ico='00001350' group by tab2.ico;
 | 00001350   | NULL  |
 39143 rows in set (6.44 sec)
 
 mysql> 
 mysql> 
 mysql> select tab2.ico, tab3.login from tab2 left join tab3 on tab2.ico = tab3.ico and tab3.ico='00001350' group by tab2.ico,tab3.login;
 | 00001350   | JO013982 |
 39143 rows in set (10.65 sec)

Please reference - http://dev.mysql.com/doc/refman/5.6/en/group-by-extensions.html

Alternatively, if SQL MODE is set to ONLY_FULL_GROUP_BY, this doesn't permit queries for which the select list or HAVING list or (as of MySQL 5.6.5) ORDER list refers to nonaggregated columns that are not named in the GROUP BY clause.

mysql> set sql_mode='ONLY_FULL_GROUP_BY';
Query OK, 0 rows affected (0.00 sec)

mysql> select tab2.ico, tab3.login from tab2 left join tab3 on tab2.ico = tab3.ico and tab3.ico='00001350' group by tab2.ico;
ERROR 1055 (42000): 'test.tab3.login' isn't in GROUP BY
mysql> 
mysql> select tab2.ico, tab3.login from tab2 left join tab3 on tab2.ico = tab3.ico and tab3.ico='00001350' group by tab2.ico,tab3.login;
| 00001350   | JO013982 |
39143 rows in set (9.58 sec)

Please reference - http://dev.mysql.com/doc/refman/5.6/en/server-sql-mode.html#sqlmode_only_full_group_by

Thanks,
Umesh
[25 Jun 2013 13:12] martin fuxa
I agree with citation "MySQL extends the use of GROUP BY ...".

A important part
The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate."

But in this case, there is only one value in group = all are same = there isn't reason for indeterminate, and I expect this one will be selected. But Mysql 5.6 select NULL instead of value. Unlike Mysql 5.5 wich works as expected (select value).

And another working condition are when any function is used, like summation col+0 OR string function like CONCAT. Try
pager grep 00001350;
select tab2.ico, CONCAT(tab3.login, 'xxx') from tab2 left join tab3 on tab2.ico = tab3.ico and tab3.ico='00001350' group by tab2.ico;
which doesn't return NULL as query without CONCAT.

Why engine select NULL instead of value for LEFT JOIN?
Why this behavior was changed from 5.5?

Btw my sql_mode is
STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
imho doesnt impact this case.
[26 Jun 2013 21:32] MySQL Verification Team
Duplicate of http://bugs.mysql.com/bug.php?id=68254 and others already reported. Thanks.
[27 Jun 2013 6:23] Roy Lyseng
Actually, it looks more like a duplicate of bug#68897.
We have acknowledged this to be a problem and expect to provide a solution soon, even though there may be a non-determinism in the query.

The class of queries also allow for some deterministic queries: When the right-hand table is functionally dependent on the other table (ie there is a unique key on the joined columns), the result is deterministic and explicitly allowed in newer versions of the SQL standard.
[14 Nov 2013 14:56] Hartmut Holzgraefe
*not* a duplicate of 68897 as that is fixed in 5.6.14 while this one is still reproducible there ...