Bug #112853 Wrong functional dependency in ONLY_FULL_GROUP_BY mode
Submitted: 27 Oct 2023 8:32 Modified: 18 Dec 2023 2:17
Reporter: Xinhao Zhao Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[27 Oct 2023 8:32] Xinhao Zhao
Description:
Wrong query statements are allowed in SQL mode ONLY_FULL_GROUP_BY.

How to repeat:
CREATE TABLE t1 (a int primary key, b int);
INSERT INTO t1 VALUES (1, 1), (2, 3);

SELECT dt.a, dt.b FROM t1 LEFT JOIN (SELECT t2.a, COALESCE(t2.a, 3) as b FROM t1 LEFT JOIN t1 AS t2 ON t1.a = t2.b) dt ON dt.b = t1.b AND t1.b <> 1;  # (1)

SELECT dt.a, dt.b FROM t1 LEFT JOIN (SELECT t2.a, COALESCE(t2.a, 3) as b FROM t1 LEFT JOIN t1 AS t2 ON t1.a = t2.b) dt ON dt.b = t1.b AND t1.b <> 1 GROUP BY dt.a;  # (2)

The result for (1):
+------+---+
| a    | b |
+------+---+
| NULL | NULL |
| NULL | 3 |
+------+---+

The result for (2):
+------+---+
| a    | b |
+------+---+
| NULL | NULL |
+------+---+

Suggested fix:
Functional dependency from primary key to function of primary key is not NFFD if the primary key is a field of inner table of outer join.

There is a statement in mtr test main.group_by_fd_no_prot:

--echo # And thus {pk}->{coa} holds in the result of this query using the view;
--echo # if there is NULL-complementing in the LEFT JOIN below then (pk,coa)
--echo # will be (NULL,NULL) and if there is not it will be (non-NULL,3):
--echo # v2.coa is determined by v2.pk. The key fact is that v2.pk is not
--echo # NULLable so this is a NFFD.
SELECT v2.pk, v2.coa
 FROM t1 LEFT JOIN v2 AS v2 ON 0
 GROUP BY v2.pk;

This is wrong, the server should throw error ER_WRONG_FIELD_WITH_GROUP.
[27 Oct 2023 8:51] MySQL Verification Team
Hello Xinhao Zhao,

Thank you for the report and testcase.

regards,
Umesh
[13 Dec 2023 11:30] shan he
Could you please confirm whether the simplification process we applied has been helpful for developers in diagnosing and addressing this issue?

Although not obvious for this example, it removes "dt.b = t1.b AND".

Here's our output
*****

CREATE TABLE t1 (a int primary key, b int);
INSERT INTO t1 VALUES (1, 1), (2, 3);

SELECT dt.a, dt.b FROM t1 LEFT JOIN (SELECT t2.a, COALESCE(t2.a, 3) as b FROM t1 LEFT JOIN t1 AS t2 ON t1.a = t2.b) dt ON  t1.b <> 1;
+------+---+
| a    | b |
+------+---+
| NULL | NULL |
| NULL | 3 |
|    1 | 1 |
+------+---+

SELECT dt.a, dt.b FROM t1 LEFT JOIN (SELECT t2.a, COALESCE(t2.a, 3) as b FROM t1 LEFT JOIN t1 AS t2 ON t1.a = t2.b) dt ON  t1.b <> 1 GROUP BY dt.a; 

+------+------+
| a    | b    |
+------+------+
| NULL | NULL |
|    1 |    1 |
+------+------+
[18 Dec 2023 2:17] Xinhao Zhao
Yes. The bug exhibited by this simplified statement is consistent with the original statement.