Description:
I found a query where the optimizer does not detect an impossible INNER JOIN condition involving IS NULL and a normal equality predicate.
The condition is logically unsatisfiable:
t1.c0 IS NULL AND t1.c0 = t0.c0
Since normal equality (=) is NULL-rejecting, t1.c0 = t0.c0 can never be TRUE when t1.c0 IS NULL. Therefore, this query should be optimized as an empty result, similar to an "Impossible WHERE" plan.
However, MySQL still generates and executes normal nested-loop/index-lookup plans. In my test case, both tables contain NULL values on the join key, but the normal equality join must still return zero rows under SQL NULL semantics.
Sanity check from the reproducer:
t0_rows = 52000, t0_null_c0 = 2000
t1_rows = 52000, t1_null_c0 = 2000
eq_join_should_be_zero = 0
null_safe_join_has_rows = 1
This confirms that NULL rows exist on both sides, and that the empty result is caused by normal equality (=), not by lack of matching data.
Problematic query:
SELECT t1.c1 AS ref0, t0.c0
FROM t1 INNER JOIN t0 ON t1.c0 = t0.c0
WHERE t1.c0 IS NULL
ORDER BY t0.c3 ASC, t0.c0 DESC
LIMIT 2;
Observed behavior on MySQL 9.7.1:
MySQL does not produce an impossible/empty plan. Instead, EXPLAIN ANALYZE shows a real nested-loop join:
-> Limit: 2 row(s) (cost=414107 rows=2) (actual time=1870..1870 rows=0 loops=1)
-> Nested loop inner join (cost=414107 rows=4e+6) (actual time=1870..1870 rows=0 loops=1)
-> Sort: t0.c3, t0.c0 DESC (cost=417 rows=2000) (actual time=3.56..3.82 rows=2000 loops=1)
-> Index lookup on t0 using i0 (c0 = NULL), with index condition: (t0.c0 is null)
(cost=417 rows=2000) (actual time=0.0116..3.04 rows=2000 loops=1)
-> Filter: (t1.c0 = t0.c0) (cost=6.95 rows=2000) (actual time=0.933..0.933 rows=0 loops=2000)
-> Covering index lookup on t1 using i3 (c0 = NULL)
(cost=6.95 rows=2000) (actual time=0.00163..0.842 rows=2000 loops=2000)
This means MySQL reads 2000 NULL-key rows from t0, then performs 2000 index lookups into t1, each returning 2000 NULL-key rows, and only filters all candidates at execution time.
The same issue also appears without ORDER BY/LIMIT:
SELECT t1.c1 AS ref0, t0.c0
FROM t1 INNER JOIN t0 ON t1.c0 = t0.c0
WHERE t1.c0 IS NULL;
EXPLAIN ANALYZE still shows a nested-loop join and real index lookups:
-> Nested loop inner join (cost=409408 rows=4e+6) (actual time=1710..1710 rows=0 loops=1)
-> Filter: (t1.c0 is null) (cost=207 rows=2000) (actual time=0.00742..1.32 rows=2000 loops=1)
-> Covering index lookup on t1 using i3 (c0 = NULL) (cost=207 rows=2000) (actual time=0.00659..1.13 rows=2000 loops=1)
-> Filter: (t0.c0 = t1.c0) (cost=4.7 rows=2000) (actual time=0.854..0.854 rows=0 loops=2000)
-> Covering index lookup on t0 using i0 (c0 = NULL) (cost=4.7 rows=2000) (actual time=0.00159..0.761 rows=2000 loops=2000)
Therefore, this does not seem to be only an ORDER BY/LIMIT planning issue. The more fundamental issue is missed contradiction detection for IS NULL combined with normal equality in an INNER JOIN.
Expected behavior:
The optimizer should recognize that the condition is unsatisfiable:
t1.c0 IS NULL AND t1.c0 = t0.c0
Because normal equality (=) is NULL-rejecting, the query block should be optimized as an empty result. MySQL should avoid scanning NULL-key ranges, performing index lookups, and doing filesort.
Additional observation:
With a JOIN_ORDER(t1, t0) hint, MySQL chooses a different plan and runs faster, but it still does not detect the impossible condition. The hint plan still executes a nested-loop join and filters all rows at execution time. This is secondary evidence; the main issue is the missed impossible-condition optimization.
How to repeat:
Run the attached SQL reproducer on MySQL 9.7.1.
Short version of the steps:
1. Create two InnoDB tables t0 and t1 with nullable BIGINT join key c0.
2. Insert 50000 non-NULL matching rows into each table.
3. Insert 2000 rows where c0 IS NULL into each table.
4. Create indexes:
CREATE INDEX i0 ON t0(c0);
CREATE INDEX i1 ON t0(c0, c1);
CREATE INDEX i2 ON t1(c0);
CREATE INDEX i3 ON t1(c0, c1);
5. Run ANALYZE TABLE t0, t1.
6. Verify the sanity checks:
SELECT COUNT(*) AS t0_rows, SUM(c0 IS NULL) AS t0_null_c0 FROM t0;
SELECT COUNT(*) AS t1_rows, SUM(c0 IS NULL) AS t1_null_c0 FROM t1;
Expected:
t0_rows = 52000, t0_null_c0 = 2000
t1_rows = 52000, t1_null_c0 = 2000
7. Verify that the normal equality join is empty:
SELECT COUNT(*) AS eq_join_should_be_zero
FROM t1 INNER JOIN t0 ON t1.c0 = t0.c0
WHERE t1.c0 IS NULL;
Expected:
eq_join_should_be_zero = 0
8. Verify that NULL-safe equality would match rows:
SELECT EXISTS(
SELECT 1
FROM t1 INNER JOIN t0 ON t1.c0 <=> t0.c0
WHERE t1.c0 IS NULL
LIMIT 1
) AS null_safe_join_has_rows;
Expected:
null_safe_join_has_rows = 1
9. Run EXPLAIN FORMAT=JSON and EXPLAIN ANALYZE for:
SELECT t1.c1 AS ref0, t0.c0
FROM t1 INNER JOIN t0 ON t1.c0 = t0.c0
WHERE t1.c0 IS NULL
ORDER BY t0.c3 ASC, t0.c0 DESC
LIMIT 2;
Observed:
MySQL does not produce an impossible/empty plan. It executes a nested-loop/index-lookup plan, reads NULL-key rows, performs repeated lookups, and filters all rows at execution time.
The attached output file contains the full EXPLAIN FORMAT=JSON and EXPLAIN ANALYZE output.
Suggested fix:
During condition simplification / equality propagation / join optimization, detect contradictions where a column is constrained to IS NULL and is also used in a normal equality predicate that must evaluate to TRUE for an INNER JOIN.
For example:
t1.c0 IS NULL AND t1.c0 = t0.c0
For normal equality (=), if t1.c0 is NULL then t1.c0 = t0.c0 cannot be TRUE, even if t0.c0 is also NULL. Therefore this predicate is NULL-rejecting and the query block is unsatisfiable.
The optimizer could mark the query block as impossible/empty, similar to an "Impossible WHERE" plan.
This should not apply to NULL-safe equality (<=>), because:
t1.c0 IS NULL AND t1.c0 <=> t0.c0
can match rows where both sides are NULL.
Description: I found a query where the optimizer does not detect an impossible INNER JOIN condition involving IS NULL and a normal equality predicate. The condition is logically unsatisfiable: t1.c0 IS NULL AND t1.c0 = t0.c0 Since normal equality (=) is NULL-rejecting, t1.c0 = t0.c0 can never be TRUE when t1.c0 IS NULL. Therefore, this query should be optimized as an empty result, similar to an "Impossible WHERE" plan. However, MySQL still generates and executes normal nested-loop/index-lookup plans. In my test case, both tables contain NULL values on the join key, but the normal equality join must still return zero rows under SQL NULL semantics. Sanity check from the reproducer: t0_rows = 52000, t0_null_c0 = 2000 t1_rows = 52000, t1_null_c0 = 2000 eq_join_should_be_zero = 0 null_safe_join_has_rows = 1 This confirms that NULL rows exist on both sides, and that the empty result is caused by normal equality (=), not by lack of matching data. Problematic query: SELECT t1.c1 AS ref0, t0.c0 FROM t1 INNER JOIN t0 ON t1.c0 = t0.c0 WHERE t1.c0 IS NULL ORDER BY t0.c3 ASC, t0.c0 DESC LIMIT 2; Observed behavior on MySQL 9.7.1: MySQL does not produce an impossible/empty plan. Instead, EXPLAIN ANALYZE shows a real nested-loop join: -> Limit: 2 row(s) (cost=414107 rows=2) (actual time=1870..1870 rows=0 loops=1) -> Nested loop inner join (cost=414107 rows=4e+6) (actual time=1870..1870 rows=0 loops=1) -> Sort: t0.c3, t0.c0 DESC (cost=417 rows=2000) (actual time=3.56..3.82 rows=2000 loops=1) -> Index lookup on t0 using i0 (c0 = NULL), with index condition: (t0.c0 is null) (cost=417 rows=2000) (actual time=0.0116..3.04 rows=2000 loops=1) -> Filter: (t1.c0 = t0.c0) (cost=6.95 rows=2000) (actual time=0.933..0.933 rows=0 loops=2000) -> Covering index lookup on t1 using i3 (c0 = NULL) (cost=6.95 rows=2000) (actual time=0.00163..0.842 rows=2000 loops=2000) This means MySQL reads 2000 NULL-key rows from t0, then performs 2000 index lookups into t1, each returning 2000 NULL-key rows, and only filters all candidates at execution time. The same issue also appears without ORDER BY/LIMIT: SELECT t1.c1 AS ref0, t0.c0 FROM t1 INNER JOIN t0 ON t1.c0 = t0.c0 WHERE t1.c0 IS NULL; EXPLAIN ANALYZE still shows a nested-loop join and real index lookups: -> Nested loop inner join (cost=409408 rows=4e+6) (actual time=1710..1710 rows=0 loops=1) -> Filter: (t1.c0 is null) (cost=207 rows=2000) (actual time=0.00742..1.32 rows=2000 loops=1) -> Covering index lookup on t1 using i3 (c0 = NULL) (cost=207 rows=2000) (actual time=0.00659..1.13 rows=2000 loops=1) -> Filter: (t0.c0 = t1.c0) (cost=4.7 rows=2000) (actual time=0.854..0.854 rows=0 loops=2000) -> Covering index lookup on t0 using i0 (c0 = NULL) (cost=4.7 rows=2000) (actual time=0.00159..0.761 rows=2000 loops=2000) Therefore, this does not seem to be only an ORDER BY/LIMIT planning issue. The more fundamental issue is missed contradiction detection for IS NULL combined with normal equality in an INNER JOIN. Expected behavior: The optimizer should recognize that the condition is unsatisfiable: t1.c0 IS NULL AND t1.c0 = t0.c0 Because normal equality (=) is NULL-rejecting, the query block should be optimized as an empty result. MySQL should avoid scanning NULL-key ranges, performing index lookups, and doing filesort. Additional observation: With a JOIN_ORDER(t1, t0) hint, MySQL chooses a different plan and runs faster, but it still does not detect the impossible condition. The hint plan still executes a nested-loop join and filters all rows at execution time. This is secondary evidence; the main issue is the missed impossible-condition optimization. How to repeat: Run the attached SQL reproducer on MySQL 9.7.1. Short version of the steps: 1. Create two InnoDB tables t0 and t1 with nullable BIGINT join key c0. 2. Insert 50000 non-NULL matching rows into each table. 3. Insert 2000 rows where c0 IS NULL into each table. 4. Create indexes: CREATE INDEX i0 ON t0(c0); CREATE INDEX i1 ON t0(c0, c1); CREATE INDEX i2 ON t1(c0); CREATE INDEX i3 ON t1(c0, c1); 5. Run ANALYZE TABLE t0, t1. 6. Verify the sanity checks: SELECT COUNT(*) AS t0_rows, SUM(c0 IS NULL) AS t0_null_c0 FROM t0; SELECT COUNT(*) AS t1_rows, SUM(c0 IS NULL) AS t1_null_c0 FROM t1; Expected: t0_rows = 52000, t0_null_c0 = 2000 t1_rows = 52000, t1_null_c0 = 2000 7. Verify that the normal equality join is empty: SELECT COUNT(*) AS eq_join_should_be_zero FROM t1 INNER JOIN t0 ON t1.c0 = t0.c0 WHERE t1.c0 IS NULL; Expected: eq_join_should_be_zero = 0 8. Verify that NULL-safe equality would match rows: SELECT EXISTS( SELECT 1 FROM t1 INNER JOIN t0 ON t1.c0 <=> t0.c0 WHERE t1.c0 IS NULL LIMIT 1 ) AS null_safe_join_has_rows; Expected: null_safe_join_has_rows = 1 9. Run EXPLAIN FORMAT=JSON and EXPLAIN ANALYZE for: SELECT t1.c1 AS ref0, t0.c0 FROM t1 INNER JOIN t0 ON t1.c0 = t0.c0 WHERE t1.c0 IS NULL ORDER BY t0.c3 ASC, t0.c0 DESC LIMIT 2; Observed: MySQL does not produce an impossible/empty plan. It executes a nested-loop/index-lookup plan, reads NULL-key rows, performs repeated lookups, and filters all rows at execution time. The attached output file contains the full EXPLAIN FORMAT=JSON and EXPLAIN ANALYZE output. Suggested fix: During condition simplification / equality propagation / join optimization, detect contradictions where a column is constrained to IS NULL and is also used in a normal equality predicate that must evaluate to TRUE for an INNER JOIN. For example: t1.c0 IS NULL AND t1.c0 = t0.c0 For normal equality (=), if t1.c0 is NULL then t1.c0 = t0.c0 cannot be TRUE, even if t0.c0 is also NULL. Therefore this predicate is NULL-rejecting and the query block is unsatisfiable. The optimizer could mark the query block as impossible/empty, similar to an "Impossible WHERE" plan. This should not apply to NULL-safe equality (<=>), because: t1.c0 IS NULL AND t1.c0 <=> t0.c0 can match rows where both sides are NULL.