Bug #114301 Incorrect result with IS NULL predicate in join condition
Submitted: 11 Mar 2024 16:40 Modified: 12 Mar 2024 13:10
Reporter: Yamac Karakus Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: IS NULL, join, UNION

[11 Mar 2024 16:40] Yamac Karakus
Description:
We incorrectly get an empty result set from a join query when we present the predicates in a certain order. Null values are incorrectly omitted from the result set. 

How to repeat:
create database test;
use test;
create table t1 (id int unsigned);
create table t2 (id int unsigned);
insert into t1 (id) values (1);

SELECT t1.id, t2Null.id FROM t1 JOIN (SELECT NULL AS id UNION SELECT id FROM t2) t2Null ON ((t2Null.id IS NULL) OR (t2Null.id = t1.id));

SELECT t1.id, t2Null.id FROM t1 JOIN (SELECT NULL AS id UNION SELECT id FROM t2) t2Null ON ((t2Null.id = t1.id) OR (t2Null.id IS NULL));

The second query returns the incorrect result. We should get a result set with 1 NULL row. 

Only NULL values are ignored, for example the following version of the query will incorrectly return 1 row in the result set with value 1. 

SELECT t1.id, t2Null.id FROM t1 JOIN ((SELECT NULL AS id UNION SELECT id FROM t2) UNION SELECT 1 AS id) t2Null ON ((t2Null.id = t1.id) OR (t2Null.id IS NULL));

but this version with the predicates in swapped order will produce two rows in the result set: 

SELECT t1.id, t2Null.id FROM t1 JOIN ((SELECT NULL AS id UNION SELECT id FROM t2) UNION SELECT 1 AS id) t2Null ON ((t2Null.id IS NULL) OR (t2Null.id = t1.id));
[12 Mar 2024 11:23] MySQL Verification Team
Hi Mr. Karakus,

Thank you for your bug report.

This is the output that we get from running your test case with our 8.0.36 binary. These results are expected results and we do not see anything wrong with them. 

Do note that when you run UNION without any predicate, you are running a distinct query.

These are the results:

-------------------------------------- FIRST SELECT -------------------------------------------------
id      id
1       NULL

-------------------------------------- EXPLAIN OF THE FIRST SELECT -----------------------------------
id      select_type     table   partitions      type    possible_keys   key     key_len ref     rows    filtered        Extra
1       PRIMARY t1      NULL    ALL     NULL    NULL    NULL    NULL    1       100.00  NULL
1       PRIMARY <derived2>      NULL    ALL     NULL    NULL    NULL    NULL    3       55.56   Using where; Using join buffer (hash join)
2       DERIVED NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    No tables used
3       UNION   t2      NULL    ALL     NULL    NULL    NULL    NULL    1       100.00  NULL
4       UNION RESULT    <union2,3>      NULL    ALL     NULL    NULL    NULL    NULL    NULL    NULL    Using temporary
-------------------------------------- SECOND  SELECT -------------------------------------------------

-------------------------------------- EXPLAIN OF THE SECOND  SELECT -------------------------------
id      select_type     table   partitions      type    possible_keys   key     key_len ref     rows    filtered        Extra
1       PRIMARY t1      NULL    ALL     NULL    NULL    NULL    NULL    1       100.00  NULL
1       PRIMARY <derived2>      NULL    eq_ref  <auto_distinct_key>     <auto_distinct_key>     5       test.t1.id      1       100.00  Using index
2       DERIVED NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    No tables used
3       UNION   t2      NULL    ALL     NULL    NULL    NULL    NULL    1       100.00  NULL
4       UNION RESULT    <union2,3>      NULL    ALL     NULL    NULL    NULL    NULL    NULL    NULL    Using temporary

-------------------------------------- THIRD  SELECT -------------------------------------------------
id      id
1       1

-------------------------------------- EXPLAIN OF THIRD  SELECT --------------------------------------
id      select_type     table   partitions      type    possible_keys   key     key_len ref     rows    filtered        Extra
1       PRIMARY t1      NULL    ALL     NULL    NULL    NULL    NULL    1       100.00  NULL
1       PRIMARY <derived2>      NULL    eq_ref  <auto_distinct_key>     <auto_distinct_key>     9       test.t1.id      1       100.00  Using where; Using index
2       DERIVED NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    No tables used
3       UNION   t2      NULL    ALL     NULL    NULL    NULL    NULL    1       100.00  NULL
4       UNION   NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    No tables used
5       UNION RESULT    <union2,3,4>    NULL    ALL     NULL    NULL    NULL    NULL    NULL    NULL    Using temporary

-------------------------------------- FOURTH  SELECT -------------------------------------------------
id      id
1       NULL
1       1

-------------------------------------- EXPLAIN OF FOURTH  SELECT ------------------------------------
id      select_type     table   partitions      type    possible_keys   key     key_len ref     rows    filtered        Extra
1       PRIMARY t1      NULL    ALL     NULL    NULL    NULL    NULL    1       100.00  NULL
1       PRIMARY <derived2>      NULL    ALL     NULL    NULL    NULL    NULL    4       43.75   Using where; Using join buffer (hash join)
2       DERIVED NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    No tables used
3       UNION   t2      NULL    ALL     NULL    NULL    NULL    NULL    1       100.00  NULL
4       UNION   NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    No tables used
5       UNION RESULT    <union2,3,4>    NULL    ALL     NULL    NULL    NULL    NULL    NULL    NULL    Using temporary

Not a bug.
[12 Mar 2024 13:10] MySQL Verification Team
Hi Mr. Karakus,

We have analysed carefully the output from the queries.

We concluded that  the only difference between the first and the second query is that the predicates in the OR clause are swapped. Thus, the results should be the same but they are not.

This is, therefore a proper bug.

Verified as reported.