Bug #118512 The missing semi join condition causes incorrect result
Submitted: 24 Jun 2025 7:52 Modified: 29 Sep 2025 17:13
Reporter: Jingqi Tian (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:8.0.42, 8.4.5, 9.3.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: Contribution

[24 Jun 2025 7:52] Jingqi Tian
Description:
When semi join is executed using materialization, the semi join condition is missing.

How to repeat:
1. Create table and view

CREATE TABLE t1 (id INT, col1 INT);
CREATE TABLE t2 (id INT, col1 INT);
CREATE VIEW v1 AS SELECT abs(t1.id) AS id, t1.col1 AS col1 FROM t1 LEFT JOIN t2 ON t1.id = t2.id;

2. Insert data

INSERT INTO t1 VALUES (1, 1), (2, 2);
INSERT INTO t2 VALUES (1, 1), (2, 2);

3. Set optimizer_switch

SET optimizer_switch='loosescan=off,firstmatch=off,duplicateweedout=off';

4. Execute SQL:

SELECT /*+ JOIN_SUFFIX(t1, t2) */ * FROM v1 WHERE v1.id IN (SELECT id FROM v1);

We get result:

+------+------+
| id   | col1 |
+------+------+
|    1 |    1 |
|    1 |    1 |
|    2 |    2 |
|    2 |    2 |
+------+------+

But the correct result is:

+------+------+
| id   | col1 |
+------+------+
|    1 |    1 |
|    2 |    2 |
+------+------+
[24 Jun 2025 10:17] MySQL Verification Team
Hello Jingqi Tian,

Thank you for the report and test case.
Verified as described. 

regards,
Umesh
[25 Jun 2025 7:12] Jingqi Tian
Hello, Verification Team, this is my bugfix of bug#118512 based on 8.0.42.

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug#118512.diff (application/octet-stream, text), 3.87 KiB.

[25 Jun 2025 8:13] MySQL Verification Team
Thank you for your Contribution.

regards,
Umesh
[29 Sep 2025 17:13] Edward Gilmore
Posted by developer:
 
Added the following note to the MySQL Server 9.5.0 release notes:
	
Queries using semi join with materialization may return incorrect results due to a missing condition.
Our thanks to Jingqi Tian and the team at Alibaba.
[29 Sep 2025 17:29] Edward Gilmore
Posted by developer:
 
Added the following note to the MySQL Server 9.5.0 release notes:
	
 Queries with cyclic joins and equi-join predicates referencing the same tables could
 exhibit unnecessary complexity during planning due to non-equality predicates not 
 being pushed to existing cycle edges.
[21 Jan 18:53] Jean-François Gagné
From what I see, this bug is fixed in 9.6.0, but not in 8.4.8 nor in 8.0.45 (see test results below).

Can we hope for a back-port of the fix in 8.4 and 8.0 ?

For reference, this looks like a simple fix, see related commit below and quote of the 9.5.0 release notes linking Bug#38110792 with this bug.  Also, the contributed fix was for 8.0, unclear why this was only fixed in 9.5.

https://github.com/mysql/mysql-server/commit/a1a447aec8869b4b256b25247dd39c11f22fd88c

https://dev.mysql.com/doc/relnotes/mysql/9.5/en/news-9-5-0.html

> Queries using semi join with materialization may return incorrect results due to a missing condition.
>
> Our thanks to Jingqi Tian and the team at Alibaba. (Bug #38110792)

sql="
DROP TABLE IF EXISTS t1, t2;
CREATE TABLE t1 (id INT, col1 INT);
CREATE TABLE t2 (id INT, col1 INT);
DROP VIEW IF EXISTS v1;
CREATE VIEW v1 AS SELECT abs(t1.id) AS id, t1.col1 AS col1 FROM t1 LEFT JOIN t2 ON t1.id = t2.id;
INSERT INTO t1 VALUES (1, 1), (2, 2);
INSERT INTO t2 VALUES (1, 1), (2, 2);
SET optimizer_switch='loosescan=off,firstmatch=off,duplicateweedout=off';
SELECT /*+ JOIN_SUFFIX(t1, t2) */ * FROM v1 WHERE v1.id IN (SELECT id FROM v1);"

for v in 9_6_0 8_4_8 8_0_45; do echo; echo $v; ./msb_mysql_$v/use test <<< "$sql"; done | tail -n +2
9_6_0
id      col1
1       1
2       2

8_4_8
id      col1
1       1
1       1
2       2
2       2

8_0_45
id      col1
1       1
1       1
2       2
2       2
[21 Jan 21:01] Jean-François Gagné
Adding for completeness that Bug#119630 was closed as a duplicate of this bug.