MySQL Bugs: #115796: derived_merge affect exists-subselect condition pushdown

Bug #115796	derived_merge affect exists-subselect condition pushdown
Submitted:	8 Aug 2024 10:18	Modified:	8 Aug 2024 10:21
Reporter:	tianfeng li (OCA)	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S3 (Non-critical)
Version:	8.0.30,8.4.0	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
create table t1 (s1 int, id int, primary key(id));
create table t2 (id int, a int, primary key(id));
create table t0 (c0 int);

# sql 1
explain format=tree select /*+JOIN_ORDER(t2,t1)*/ * from (select t1.s1 from t1 INNER join t2 on (t2.id = t1.id) where t2.a=150007279  and t1.id in ( select c0 from  t0  GROUP BY c0 having count(1) > 0) ) as t;

| -> Nested loop inner join  (cost=0.7 rows=1)
    -> Filter: (t2.a = 150007279)  (cost=0.35 rows=1)
        -> Table scan on t2  (cost=0.35 rows=1)
    -> Filter: <in_optimizer>(t2.id,<exists>(select #3))  (cost=0.35 rows=1)
        -> Single-row index lookup on t1 using PRIMARY (id=t2.id)  (cost=0.35 rows=1)
        -> Select #3 (subquery in condition; dependent)
            -> Limit: 1 row(s)
                -> Filter: ((count(1) > 0) and (<cache>(t1.id) = <ref_null_helper>(t0.c0)))
                    -> Table scan on <temporary>
                        -> Aggregate using temporary table
                            -> Table scan on t0  (cost=0.35 rows=1)

# sql 2
explain format=tree select /*+JOIN_ORDER(t2,t1)*/ t1.s1 from t1 INNER join t2 on (t2.id = t1.id) where t2.a=150007279  and t1.id in ( select c0 from  t0  GROUP BY c0 having count(1) > 0);

| -> Nested loop inner join  (cost=0.7 rows=1)
    -> Filter: ((t2.a = 150007279) and <in_optimizer>(t2.id,<exists>(select #2)))  (cost=0.35 rows=1)
        -> Table scan on t2  (cost=0.35 rows=1)
        -> Select #2 (subquery in condition; dependent)
            -> Limit: 1 row(s)
                -> Filter: ((count(1) > 0) and (<cache>(t1.id) = <ref_null_helper>(t0.c0)))
                    -> Table scan on <temporary>
                        -> Aggregate using temporary table
                            -> Table scan on t0  (cost=0.35 rows=1)
    -> Single-row index lookup on t1 using PRIMARY (id=t2.id)  (cost=0.35 rows=1)

The only difference bewteen sql#1 and sql#2 is the outer derived table.

In the SQL#2, a rewritten version without a derived table causes the EXISTS subselect to be pushed down to the first table t2, although with a plan display issue (see bug#115794).

SQL#1 includes a derived table and derived_merge should result in a plan equal to the one above. However, the EXISTS subselect is pushed down to the second table t1, while it should be pushed down to table t2.

How to repeat:
create table t1 (s1 int, id int, primary key(id));
create table t2 (id int, a int, primary key(id));
create table t0 (c0 int);

explain format=tree select /*+JOIN_ORDER(t2,t1)*/ * from (select t1.s1 from t1 INNER join t2 on (t2.id = t1.id) where t2.a=150007279  and t1.id in ( select c0 from  t0  GROUP BY c0 having count(1) > 0) ) as t;

explain format=tree select /*+JOIN_ORDER(t2,t1)*/ t1.s1 from t1 INNER join t2 on (t2.id = t1.id) where t2.a=150007279  and t1.id in ( select c0 from  t0  GROUP BY c0 having count(1) > 0);

Hello tianfeng li,

Thank you for the report and feedback

regards,
Umesh