Bug #118269 The result of anti disk hash join is wrong
Submitted: 26 May 9:41 Modified: 26 May 12:29
Reporter: Jingqi Tian (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version: OS:Any
Assigned to: CPU Architecture:Any

[26 May 9:41] Jingqi Tian
Description:
In anti join, if disk-hash-join is used and the join-key of the driver table contains a null value, the result will be incorrect.

How to repeat:
1. create table

CREATE TABLE t1 (id int primary key, col1 varchar(200));
CREATE TABLE t2 (id int primary key, col1 varchar(200));

2. create procedure and insert data

DELIMITER //
CREATE PROCEDURE insert_data ()
BEGIN
	DECLARE num int;
	SET num = 1;
	WHILE num <= 1000 DO
    INSERT INTO t2 VALUES (num, md5(rand()));
	SET num = num + 1;
	END WHILE;
END;//
DELIMITER ;

call insert_data();
INSERT INTO t1 VALUES (1, null);

3. Execute SQL:

When join_buffer_size is large, execute sql:

> set join_buffer_size = 10240000;
> select * from t1 left join t2 on t1.col1 = t2.col1 where t2.id is null;

We will get:

+----+------+------+------+
| id | col1 | id   | col1 |
+----+------+------+------+
|  1 | NULL | NULL | NULL |
+----+------+------+------+

When join_buffer_size is small, execute sql:

> set join_buffer_size = 1024;
> select * from t1 left join t2 on t1.col1 = t2.col1 where t2.id is null;

We will get empty set.
[26 May 11:12] Jingqi Tian
I noticed that this bug has been fixed in 8.0.41. You can close this bug report. Thanks.
[26 May 12:29] MySQL Verification Team
Hello Jingqi Tian,

Thank you for confirming, closing for now but feel free to re-open if you need to. Thank you.

regards,
Umesh