Description:
I have encountered a scenario where a SELECT query with a specific predicate returns a result (a row containing NULL), but a logically equivalent query using a derived table returns an empty set.
The issue involves the interaction between NULLIF, IS NOT UNKNOWN (which is functionally equivalent to IS NOT NULL in this context), and NOT.
Logical Analysis:
Row Data: t0.c1 is NULL.
Expression: NOT ((NULLIF(0.3 IS NOT UNKNOWN, t0.c1)) IS NOT UNKNOWN)
Step-by-step evaluation:
0.3 IS NOT UNKNOWN evaluates to TRUE (or 1 in MySQL).
NULLIF(TRUE, t0.c1) where t0.c1 is NULL evaluates to TRUE (because TRUE != NULL).
TRUE IS NOT UNKNOWN evaluates to TRUE.
NOT TRUE evaluates to FALSE.
Expected Behavior: Since the predicate evaluates to FALSE, the query should return an empty set.
Actual Behavior:
The direct query returns the row (where c1 is NULL), which is incorrect.
The derived table query returns an empty set, which is correct.
This inconsistency indicates a bug in the MySQL optimizer, likely related to how predicates are pushed down or simplified when constant expressions and NULLIF are combined.
Cross-Database Comparison:
I have verified this behavior on other database engines:
MariaDB 11.8.5: Both queries return an empty set (Correct).
TiDB 8.4.5: Both queries return an empty set (Correct).
--------------------------------------------------------------------------
mysql> -- result : empty set
mysql> SELECT t0.c1 AS ref0 FROM t0 WHERE (NOT ((NULLIF(0.3 IS NOT UNKNOWN, t0.c1)) IS NOT UNKNOWN));
Empty set (0.00 sec)
mysql> -- cardinality: null
mysql> SELECT ref0 FROM (SELECT t0.c1 AS ref0, (NOT ((NULLIF(0.3 IS NOT UNKNOWN, t0.c1)) IS NOT UNKNOWN)) AS ref1 FROM t0) AS s WHERE ref1;
+------+
| ref0 |
+------+
| NULL |
+------+
1 row in set (0.00 sec)
mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 9.6.0 |
+-----------+
1 row in set (0.00 sec)
How to repeat:
CREATE TABLE t0(c0 INT, c1 FLOAT) ;
INSERT INTO t0(c1) VALUES(NULL);
-- result : empty set
SELECT t0.c1 AS ref0 FROM t0 WHERE (NOT ((NULLIF(0.3 IS NOT UNKNOWN, t0.c1)) IS NOT UNKNOWN));
-- cardinality: null
SELECT ref0 FROM (SELECT t0.c1 AS ref0, (NOT ((NULLIF(0.3 IS NOT UNKNOWN, t0.c1)) IS NOT UNKNOWN)) AS ref1 FROM t0) AS s WHERE ref1;
Description: I have encountered a scenario where a SELECT query with a specific predicate returns a result (a row containing NULL), but a logically equivalent query using a derived table returns an empty set. The issue involves the interaction between NULLIF, IS NOT UNKNOWN (which is functionally equivalent to IS NOT NULL in this context), and NOT. Logical Analysis: Row Data: t0.c1 is NULL. Expression: NOT ((NULLIF(0.3 IS NOT UNKNOWN, t0.c1)) IS NOT UNKNOWN) Step-by-step evaluation: 0.3 IS NOT UNKNOWN evaluates to TRUE (or 1 in MySQL). NULLIF(TRUE, t0.c1) where t0.c1 is NULL evaluates to TRUE (because TRUE != NULL). TRUE IS NOT UNKNOWN evaluates to TRUE. NOT TRUE evaluates to FALSE. Expected Behavior: Since the predicate evaluates to FALSE, the query should return an empty set. Actual Behavior: The direct query returns the row (where c1 is NULL), which is incorrect. The derived table query returns an empty set, which is correct. This inconsistency indicates a bug in the MySQL optimizer, likely related to how predicates are pushed down or simplified when constant expressions and NULLIF are combined. Cross-Database Comparison: I have verified this behavior on other database engines: MariaDB 11.8.5: Both queries return an empty set (Correct). TiDB 8.4.5: Both queries return an empty set (Correct). -------------------------------------------------------------------------- mysql> -- result : empty set mysql> SELECT t0.c1 AS ref0 FROM t0 WHERE (NOT ((NULLIF(0.3 IS NOT UNKNOWN, t0.c1)) IS NOT UNKNOWN)); Empty set (0.00 sec) mysql> -- cardinality: null mysql> SELECT ref0 FROM (SELECT t0.c1 AS ref0, (NOT ((NULLIF(0.3 IS NOT UNKNOWN, t0.c1)) IS NOT UNKNOWN)) AS ref1 FROM t0) AS s WHERE ref1; +------+ | ref0 | +------+ | NULL | +------+ 1 row in set (0.00 sec) mysql> SELECT VERSION(); +-----------+ | VERSION() | +-----------+ | 9.6.0 | +-----------+ 1 row in set (0.00 sec) How to repeat: CREATE TABLE t0(c0 INT, c1 FLOAT) ; INSERT INTO t0(c1) VALUES(NULL); -- result : empty set SELECT t0.c1 AS ref0 FROM t0 WHERE (NOT ((NULLIF(0.3 IS NOT UNKNOWN, t0.c1)) IS NOT UNKNOWN)); -- cardinality: null SELECT ref0 FROM (SELECT t0.c1 AS ref0, (NOT ((NULLIF(0.3 IS NOT UNKNOWN, t0.c1)) IS NOT UNKNOWN)) AS ref1 FROM t0) AS s WHERE ref1;