Bug #116398 Incorrect result with LEFT JOIN and small join_buffer_size
Submitted: 17 Oct 2024 14:34 Modified: 25 Oct 2024 9:18
Reporter: Aaditya Dubey Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[17 Oct 2024 14:34] Aaditya Dubey
Description:
This query returns an empty result even if the "tbl" table has two matching rows.

SELECT a.id, COUNT(*) AS n
FROM tbl AS a
LEFT JOIN fp_tbl AS f ON f.id = a.id
LEFT JOIN fp_pol AS p ON p.lead_id   = f.id
WHERE a.id IN (1,2) AND p.id IS NULL
GROUP BY a.id;
Empty set (0.39 sec)

mysql [localhost:8039] {msandbox} (test) > select * from tbl where id in ('1','2');
+----+------------+
| id | user_id    |
+----+------------+
|  1 |  583532949 |
|  2 | 1342458479 |
+----+------------+
2 rows in set (0.00 sec)

How to repeat:
use test;
CREATE TABLE `tbl` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

CREATE TABLE `fp_tbl` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

CREATE TABLE `fp_pol` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `lead_id` int unsigned DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Insert rows into all three tables, please make sure to insert large number of rows in fp_pol table.

mysql_random_data_load --host=127.0.0.1 --port=8039 --user=msandbox --password=msandbox test tbl 10

mysql_random_data_load --host=127.0.0.1 --port=8039 --user=msandbox --password=msandbox test fp_tbl 10

mysql_random_data_load --host=127.0.0.1 --port=8039 --user=msandbox --password=msandbox test fp_pol 1000000

Delete rows with id 1,2 from fp_tbl, fp_pol tables.

delete from fp_tbl WHERE id in ('1','2');
delete from fp_pol WHERE id in ('1','2');

Verify that row id 1,2 are present in tbl only.

mysql [localhost:8039] {msandbox} (test) > select * from tbl where id in ('1','2');
+----+------------+
| id | user_id    |
+----+------------+
|  1 |  583532949 |
|  2 | 1342458479 |
+----+------------+
2 rows in set (0.00 sec)

mysql [localhost:8039] {msandbox} (test) > select * from fp_tbl where id in ('1','2');
Empty set (0.00 sec)

mysql [localhost:8039] {msandbox} (test) > select * from fp_pol where id in ('1','2');
Empty set (0.00 sec)

Run the query and confirm it doesn't return any matching rows.

SELECT a.id, COUNT(*) AS n
FROM tbl AS a
LEFT JOIN fp_tbl AS f ON f.id = a.id
LEFT JOIN fp_pol AS p ON p.lead_id   = f.id
WHERE a.id IN (1,2) AND p.id IS NULL
GROUP BY a.id;
Empty set (0.39 sec)

Run the explain and confirm that the query uses a join buffer for the fp_pol table as there is no index on the lead_id column.

mysql [localhost:8040] {msandbox} (test) > explain SELECT a.id, COUNT(*) AS n FROM tbl AS a LEFT JOIN fp_tbl AS f ON f.id = a.id LEFT JOIN fp_pol AS p ON p.lead_id   = f.id WHERE a.
    -> id IN (1,2) AND p.id IS NULL GROUP BY a.id;
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------+--------+----------+--------------------------------------------------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref       | rows   | filtered | Extra                                                  |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------+--------+----------+--------------------------------------------------------+
|  1 | SIMPLE      | a     | NULL       | range  | PRIMARY       | PRIMARY | 4       | NULL      |      2 |   100.00 | Using where; Using index; Using temporary              |
|  1 | SIMPLE      | f     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | test.a.id |      1 |   100.00 | Using index                                            |
|  1 | SIMPLE      | p     | NULL       | ALL    | NULL          | NULL    | NULL    | NULL      | 998566 |    10.00 | Using where; Not exists; Using join buffer (hash join) |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------+--------+----------+--------------------------------------------------------+
3 rows in set, 1 warning (0.00 sec)

Workaround:
----------

Increasing the join_buffer_size helps the query to return the correct result.

set session join_buffer_size=64*1024*1024;

SELECT a.id, COUNT(*) AS n
FROM tbl AS a
LEFT JOIN fp_tbl AS f ON f.id = a.id
LEFT JOIN fp_pol AS p ON p.lead_id   = f.id
WHERE a.id IN (1,2) AND p.id IS NULL
GROUP BY a.id;
+----+---+
| id | n |
+----+---+
|  1 | 1 |
|  2 | 1 |
+----+---+
2 rows in set (0.34 sec)

Suggested fix:
It must return the correct set of results.
[17 Oct 2024 14:57] MySQL Verification Team
Hi Mr. Dubev,

Thank you very much for your bug report.

However, please do not use third-party data for generating data for the test.

You can use any of the original MySQL client programs, like mysqlslap, or sysbench.

We are very eagerly waiting your reply.

We are also interested what happens with large join_buffer_size. Also what exactly is "small" here ?????

Thanks in advance.
[17 Oct 2024 17:24] Aaditya Dubey
Hi,

Thank you for looking. I'll test it with Sysbench and update the test case.
[18 Oct 2024 9:33] MySQL Verification Team
Hi Mr. Dubey,

We shall be happy to repeat that test case.
[24 Oct 2024 12:24] Aaditya Dubey
Hi Team,

Please find the test case with sysbench:

1. Fist create two tables using the below sysbench command:

sysbench /usr/share/sysbench/oltp_read_write.lua --table-size=10 --tables=2 --mysql-db=test --mysql-user=msandbox --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox8040.sock --threads=2 --time=0 --report-interval=1 --events=0 --db-driver=mysql prepare

2. Create 3rd table using the below sysbench command:

sysbench /usr/share/sysbench/oltp_read_write.lua --table-size=1000000 --tables=3 --mysql-db=test --mysql-user=msandbox --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox8040.sock --threads=10 --time=0 --report-interval=1 --events=0 --db-driver=mysql prepare

You will notice error like Table 'sbtest1' already exists, please ignore because 3rd table will be created without any issue.

3. Now drop secondary index:

 alter table sbtest1 drop index k_1;
 alter table sbtest2 drop index k_2;
 alter table sbtest3 drop index k_3;

4. Run following Query:

mysql [localhost:8040] {msandbox} (test) > SELECT a.id, COUNT(*) AS n
    -> FROM sbtest1 AS a
    -> LEFT JOIN sbtest2 AS f ON f.id = a.id
    -> LEFT JOIN sbtest3 AS p ON p.k   = f.id
    -> WHERE a.id IN (1,2) AND p.id IS NULL
    -> GROUP BY a.id;
Empty set (1.12 sec)

No Results

5. Now change join_buffer_size

mysql [localhost:8040] {msandbox} (test) > set session join_buffer_size=64*1024*1024;
Query OK, 0 rows affected (0.00 sec)

6. Run the same query again:

mysql [localhost:8040] {msandbox} (test) > SELECT a.id, COUNT(*) AS n FROM sbtest1 AS a LEFT JOIN sbtest2 AS f ON f.id = a.id LEFT JOIN sbtest3 AS p ON p.k   = f.id WHERE a.id IN (1,2) AND p.id IS NULL GROUP BY a.id;
+----+---+
| id | n |
+----+---+
|  1 | 1 |
|  2 | 1 |
+----+---+
2 rows in set (0.59 sec)

Hopefully you can repeat the issue from your end now.
[24 Oct 2024 12:59] MySQL Verification Team
Hi Mr. Dubey,

We still can not repeat your test case on our build of MySQL 8.0.40.

Here is entire run in mysql client, after preparing the tables:

|ysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| sbtest1        |
| sbtest2        |
| sbtest3        |
+----------------+
3 rows in set (0.00 sec)

mysql> alter table sbtest1 drop index k_1;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table sbtest2 drop index k_2;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table sbtest3 drop index k_3;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> SELECT a.id, COUNT(*) AS n FROM sbtest1 AS a LEFT JOIN sbtest2 AS f ON f.id = a.id LEFT JOIN sbtest3 AS p ON p.k   = f.id WHERE a.id IN (1,2) AND p.id IS NULL GROUP BY a.id;
+----+---+
| id | n |
+----+---+
|  2 | 1 |
|  1 | 1 |
+----+---+
2 rows in set (0.25 sec)

mysql> set session join_buffer_size=512 * 1024;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT a.id, COUNT(*) AS n FROM sbtest1 AS a LEFT JOIN sbtest2 AS f ON f.id = a.id LEFT JOIN sbtest3 AS p ON p.k   = f.id WHERE a.id IN (1,2) AND p.id IS NULL GROUP BY a.id;
+----+---+
| id | n |
+----+---+
|  2 | 1 |
|  1 | 1 |
+----+---+
2 rows in set (0.24 sec)

mysql> set session join_buffer_size=256* 1024;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT a.id, COUNT(*) AS n FROM sbtest1 AS a LEFT JOIN sbtest2 AS f ON f.id = a.id LEFT JOIN sbtest3 AS p ON p.k   = f.id WHERE a.id IN (1,2) AND p.id IS NULL GROUP BY a.id;
+----+---+
| id | n |
+----+---+
|  2 | 1 |
|  1 | 1 |
+----+---+
2 rows in set (0.24 sec)

Can't repeat.
[24 Oct 2024 13:07] MySQL Verification Team
Hi Mr. Dubey,

Just for your information, at the start we used a default value for the variable, which was 256 Kb.

Can't repeat.
[24 Oct 2024 14:48] MySQL Verification Team
Hi,

I repeated it with below testcase. Check if it matches the problem reported.

drop database if exists test;
create database test;
use test;
create table `tbl` (
  `id` int unsigned not null auto_increment,
  `user_id` int default null,
  primary key (`id`)
) engine=innodb default charset=utf8mb4 collate=utf8mb4_0900_ai_ci;
replace into tbl(user_id) values(1),(2),(3),(4),(5),(6),(7),(8),(9),(0);
replace into tbl(user_id) select a.user_id from tbl a,tbl b,tbl c,tbl d;

create table `fp_tbl` (
  `id` int unsigned not null auto_increment,
  `user_id` int default null,
  primary key (`id`)
) engine=innodb default charset=utf8mb4 collate=utf8mb4_0900_ai_ci;
replace into fp_tbl(user_id) values(1),(2),(3),(4),(5),(6),(7),(8),(9),(0);
replace into fp_tbl(user_id) select a.user_id from fp_tbl a,fp_tbl b,fp_tbl c,fp_tbl d;

create table `fp_pol` (
  `id` int unsigned not null auto_increment,
  `lead_id` int unsigned default null,
  primary key (`id`)
) engine=innodb default charset=utf8mb4 collate=utf8mb4_0900_ai_ci;
set @a:=0;
replace into fp_pol(lead_id) values(1),(2),(3),(4),(5),(6),(7),(8),(9),(0);
replace into fp_pol(lead_id) select @a:=@a+1 from fp_pol a,fp_pol b,fp_pol c,fp_pol d;
replace into fp_pol(lead_id) select @a:=@a+1 from fp_pol a,fp_pol b limit 9989990;
replace into fp_pol(lead_id) select @a:=@a+1 from fp_pol a;

delete from fp_tbl where id in (1,2);
delete from fp_pol where id in (1,2);

analyze table fp_pol,fp_tbl,tbl;

select * from tbl where id in (1,2);
select * from fp_tbl where id in (1,2);
select * from fp_pol where id in (1,2);

explain select a.id, count(*) as n from tbl as a left join fp_tbl as f on f.id = a.id left join fp_pol as p on p.lead_id = f.id where a. id in (1,2) and p.id is null group by a.id;

set session join_buffer_size=256*1024;

select a.id, count(*) as n
from tbl as a
left join fp_tbl as f on f.id = a.id
left join fp_pol as p on p.lead_id   = f.id
where a.id in (1,2) and p.id is null
group by a.id;

set session join_buffer_size=1024*1024*1024;

select a.id, count(*) as n
from tbl as a
left join fp_tbl as f on f.id = a.id
left join fp_pol as p on p.lead_id   = f.id
where a.id in (1,2) and p.id is null
group by a.id;

select version();

----------------

Last outputs:
-----------------
mysql> explain select a.id, count(*) as n from tbl as a left join fp_tbl as f on f.id = a.id left join fp_pol as p on p.lead_id = f.id where a. id in (1,2) and p.id is null group by a.id;
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------+----------+----------+--------------------------------------------------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref       | rows     | filtered | Extra                                                  |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------+----------+----------+--------------------------------------------------------+
|  1 | SIMPLE      | a     | NULL       | range  | PRIMARY       | PRIMARY | 4       | NULL      |        2 |   100.00 | Using where; Using index; Using temporary              |
|  1 | SIMPLE      | f     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | test.a.id |        1 |   100.00 | Using index                                            |
|  1 | SIMPLE      | p     | NULL       | ALL    | NULL          | NULL    | NULL    | NULL      | 19461806 |    10.00 | Using where; Not exists; Using join buffer (hash join) |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------+----------+----------+--------------------------------------------------------+
3 rows in set, 1 warning (0.00 sec)

mysql>
mysql>
mysql> set session join_buffer_size=256*1024;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql> select a.id, count(*) as n
    -> from tbl as a
    -> left join fp_tbl as f on f.id = a.id
    -> left join fp_pol as p on p.lead_id   = f.id
    -> where a.id in (1,2) and p.id is null
    -> group by a.id;
Empty set (3.28 sec)

mysql>
mysql> set session join_buffer_size=1024*1024*1024;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql> select a.id, count(*) as n
    -> from tbl as a
    -> left join fp_tbl as f on f.id = a.id
    -> left join fp_pol as p on p.lead_id   = f.id
    -> where a.id in (1,2) and p.id is null
    -> group by a.id;
+----+---+
| id | n |
+----+---+
|  1 | 1 |
|  2 | 1 |
+----+---+
2 rows in set (8.34 sec)

mysql>
mysql>
mysql> select version();
+-------------------+
| version()         |
+-------------------+
| 8.0.40-commercial |
+-------------------+
1 row in set (0.00 sec)

-- 
Shane, MySQL Senior Principal Technical Support Engineer
Oracle Corporation
http://dev.mysql.com/
[24 Oct 2024 15:36] Aaditya Dubey
Hi Shane,

That’s correct. Thank you for repeating the issue.
[24 Oct 2024 16:08] MySQL Verification Team
Hi Mr. Dubey,

We repeated the test case.

This is now a verified bug report.

Since the remedy is known, this is a low severity bug report, affecting version 8.0 and higher.
[25 Oct 2024 7:45] Knut Anders Hatlen
Is this perhaps a duplicate of bug#116334? The queries in the two reports look identical.
[25 Oct 2024 9:18] Aaditya Dubey
Hi Knut,

This bug report is a duplicate of https://bugs.mysql.com/bug.php?id=116334; interestingly, the MySQL team verified the issue with the same test case in bug#116334, but in this report they couldn’t. :)
[25 Oct 2024 9:40] MySQL Verification Team
Hi,

First of all, we are increasing the severity of the bug report.

Second, the reason why the original bug was easy repeated is the fact that it was not the same release. 8.0.40 had many bugs fixed.