Description:
Hi team,
We have encountered a potential bug in MySQL where two logically equivalent queries return different results when the MyISAM storage engine is used.
The inconsistency occurs between a direct WHERE EXISTS (...) clause and a query where the EXISTS (...) result is first projected in a derived table and then filtered. Interestingly, if the engine is changed to InnoDB, both queries return consistent results.
mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 8.4.5-5 |
+-----------+
1 row in set (0.00 sec)
How to repeat:
-- Create tables
CREATE TABLE t1(c0 TINYTEXT);
-- The issue is specific to MyISAM
CREATE TABLE t2(c0 LONGTEXT) ENGINE = MyISAM;
-- Insert data
INSERT INTO t1(c0) VALUES(0);
INSERT INTO t1(c0) VALUES(NULL);
INSERT INTO t2(c0) VALUES('409271130');
-- Query 1: Direct EXISTS in WHERE clause
-- Expected: Both rows from t1 OR zero rows (depending on evaluation)
-- Result with MyISAM: Returns 2 rows (0 and NULL)
SELECT t1.c0 FROM t1
WHERE EXISTS (SELECT 1 FROM t2 WHERE (NULLIF(-1, t2.c0)) IS UNKNOWN);
-- Query 2: EXISTS projected in a derived table
-- This query is logically equivalent to Query 1
-- Result with MyISAM: Returns 0 rows (Cardinality 0)
SELECT ref0 FROM (
SELECT t1.c0 AS ref0,
EXISTS (SELECT 1 FROM t2 WHERE (NULLIF(-1, t2.c0)) IS UNKNOWN) AS ref1
FROM t1
) AS s
WHERE ref1;
Suggested fix:
Expected Behavior
Both queries should return the same number of rows. Since Query 2 is simply a wrapped version of Query 1 (using a derived table), their results must be identical regardless of the underlying storage engine.
Actual Behavior
With ENGINE = MyISAM:
Query 1 returns 2 rows.
Query 2 returns 0 rows.
With ENGINE = InnoDB:
Both queries return consistent results (0 rows, as NULLIF(-1, '409271130') returns -1, and -1 IS UNKNOWN is false).
Observations
The bug seems related to how the optimizer handles the IS UNKNOWN (or IS NULL) condition inside an EXISTS subquery when it's part of a projected column in a derived table versus a direct predicate. The fact that it only appears with MyISAM suggests an issue with how the optimizer interacts with engine-specific row evaluation or nullability flags during subquery transformation.
Description: Hi team, We have encountered a potential bug in MySQL where two logically equivalent queries return different results when the MyISAM storage engine is used. The inconsistency occurs between a direct WHERE EXISTS (...) clause and a query where the EXISTS (...) result is first projected in a derived table and then filtered. Interestingly, if the engine is changed to InnoDB, both queries return consistent results. mysql> SELECT VERSION(); +-----------+ | VERSION() | +-----------+ | 8.4.5-5 | +-----------+ 1 row in set (0.00 sec) How to repeat: -- Create tables CREATE TABLE t1(c0 TINYTEXT); -- The issue is specific to MyISAM CREATE TABLE t2(c0 LONGTEXT) ENGINE = MyISAM; -- Insert data INSERT INTO t1(c0) VALUES(0); INSERT INTO t1(c0) VALUES(NULL); INSERT INTO t2(c0) VALUES('409271130'); -- Query 1: Direct EXISTS in WHERE clause -- Expected: Both rows from t1 OR zero rows (depending on evaluation) -- Result with MyISAM: Returns 2 rows (0 and NULL) SELECT t1.c0 FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE (NULLIF(-1, t2.c0)) IS UNKNOWN); -- Query 2: EXISTS projected in a derived table -- This query is logically equivalent to Query 1 -- Result with MyISAM: Returns 0 rows (Cardinality 0) SELECT ref0 FROM ( SELECT t1.c0 AS ref0, EXISTS (SELECT 1 FROM t2 WHERE (NULLIF(-1, t2.c0)) IS UNKNOWN) AS ref1 FROM t1 ) AS s WHERE ref1; Suggested fix: Expected Behavior Both queries should return the same number of rows. Since Query 2 is simply a wrapped version of Query 1 (using a derived table), their results must be identical regardless of the underlying storage engine. Actual Behavior With ENGINE = MyISAM: Query 1 returns 2 rows. Query 2 returns 0 rows. With ENGINE = InnoDB: Both queries return consistent results (0 rows, as NULLIF(-1, '409271130') returns -1, and -1 IS UNKNOWN is false). Observations The bug seems related to how the optimizer handles the IS UNKNOWN (or IS NULL) condition inside an EXISTS subquery when it's part of a projected column in a derived table versus a direct predicate. The fact that it only appears with MyISAM suggests an issue with how the optimizer interacts with engine-specific row evaluation or nullability flags during subquery transformation.