Bug #113318 unnest scalar subquery using groupby produces wrong result
Submitted: 4 Dec 2023 6:37 Modified: 1 Mar 20:38
Reporter: chen jiang Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:mysql-8.0.33, 8.0.35, 8.1.0, 8.2.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: Scalar subquery, unnest, wrong result

[4 Dec 2023 6:37] chen jiang
Description:
the scalar subquery which contains function with 'never be null' property(null_on_null=false) in subquery's projection columns  is incorrectly rewritted, leading to a mismatch in the result set.
e.g.
desc select (select ifnull(sum(t3.b), 0) from t3 where t1.a=t3.a) from t1 ;
will be rewritted to
select `derived_1_2`.`ifnull(sum(t3.b), 0)` AS `(select ifnull(sum(t3.b), 0) from t3 where t1.a=t3.a)` from `t1` left join (/* select#2 */ select ifnull(sum(`t3`.`b`),0) AS `ifnull(sum(t3.b), 0)`,`t3`.`a` AS `a` from `t3` group by `t3`.`a`) `derived_1_2` on((`derived_1_2`.`a` = `t1`.`a`)) where true

When t1 cannot left join with derived_1_2, the related columns of derived_1_2 will be filled with NULL, rather than re-executing the ifnull function.

How to repeat:
CREATE TABLE t1(a INT);
create table t3(a int, b int);
INSERT INTO t1 VALUES(1), (2), (3), (4);
INSERT INTO t3 VALUES(5,5), (6,6);
ANALYZE TABLE t1, t2, t3;

set optimizer_switch='subquery_to_derived=off';
select (select ifnull(sum(t3.b), 0) from t3 where t1.a=t3.a) from t1 ;
+-------------------------------------------------------+
| (select ifnull(sum(t3.b), 0) from t3 where t1.a=t3.a) |
+-------------------------------------------------------+
|                                                     0 |
|                                                     0 |
|                                                     0 |
|                                                     0 |
+-------------------------------------------------------+

set optimizer_switch='subquery_to_derived=on';
select (select ifnull(sum(t3.b), 0) from t3 where t1.a=t3.a) from t1 ;
+-------------------------------------------------------+
| (select ifnull(sum(t3.b), 0) from t3 where t1.a=t3.a) |
+-------------------------------------------------------+
|                                                  NULL |
|                                                  NULL |
|                                                  NULL |
|                                                  NULL |
+-------------------------------------------------------+
4 rows in set (0.00 sec)
[4 Dec 2023 7:08] MySQL Verification Team
Hello chen jiang,

Thank you for the report and test case.
Verified as described.

regards,
Umesh
[1 Mar 20:38] Jon Stephens
Documented fix as follows in the MySQL 8.4.0 changelog:

    A transform could be semantically invalid when the selected
    item in the subquery tested for NULL; the left outer join with a
    grouped derived table might in such cases produce NULLs whereas
    the original subquery might not. To prevent this from happening,
    we now bar such subqueries from being transformed.

Closed.