Bug #106473 | Inner table of OUT-JOIN is nullable, after applying OUTER2INNER transformation | ||
---|---|---|---|
Submitted: | 16 Feb 2022 8:55 | Modified: | 10 Mar 2022 9:27 |
Reporter: | Shanshan Ying | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 8.0.28 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | Optimizer |
[16 Feb 2022 8:55]
Shanshan Ying
[16 Feb 2022 10:03]
MySQL Verification Team
Hello Shanshan Ying, Thank you for the report and test case. regards, Umesh
[21 Feb 2022 13:28]
huahua xu
I do not think that the former query failed. The = operator return 0 if one or both operands are NULL, so the former query will rejecting null in an implicit way. You may get a expected query result by: select /*+ join_order(tz,tx,ty) */ * from tx left join ty on tx.a <=> ty.a join tz on ty.b <=> tz.b;
[21 Feb 2022 13:32]
huahua xu
In addition, you can refer to the method `update_ref_and_keys`->`add_key_fields`
[10 Mar 2022 9:27]
Shanshan Ying
Hey Xu, Seems you did not get my point. mysql [localhost:8027] {msandbox} (test) > explain format=tree select /*+ join_order(tz,tx,ty) */ * from tx left join ty on tx.a = ty.a join tz on ty.b = tz.b; +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | EXPLAIN | +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | -> Nested loop inner join (cost=1.05 rows=1) -> Inner hash join (no condition) (cost=0.70 rows=1) -> Table scan on tx (cost=0.35 rows=1) -> Hash -> Table scan on tz (cost=0.35 rows=1) -> Single-row covering index lookup on ty using idx_ab (a=tx.a, b=tz.b) (cost=0.35 rows=1) | +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.01 sec) mysql [localhost:8027] {msandbox} (test) > explain format=tree select /*+ join_order(tz,tx,ty) */ * from tx join ty on tx.a = ty.a join tz on ty.b = tz.b; +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | EXPLAIN | +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | -> Nested loop inner join (cost=1.05 rows=1) -> Inner hash join (no condition) (cost=0.70 rows=1) -> Filter: (tx.a is not null) (cost=0.35 rows=1) -> Table scan on tx (cost=0.35 rows=1) -> Hash -> Filter: (tz.b is not null) (cost=0.35 rows=1) -> Table scan on tz (cost=0.35 rows=1) -> Single-row covering index lookup on ty using idx_ab (a=tx.a, b=tz.b) (cost=0.35 rows=1) | +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) It is about perofrmance, not about correctness. I guess there is no doubt that there two query are logically equal. But, the first plan failed to pushdown (or add) is-not-null condition to base tables. These conditions can help filter some data in an early stage and enhance performance.