Description:
These two queries are equivalent because they implement the same logical filtering using different syntactic forms. The original query uses NOT IN to check whether l_shipmode is not in the set (781812915, lineitem.l_commitdate, -1313533185), combined with a double negation (NOT and is not false) to filter matching rows. The rewritten query explicitly expands the NOT IN logic into multiple OR conditions, which separately evaluate:
Whether l_shipmode equals 781812915
Whether CAST(l_shipmode AS DATE) equals lineitem.l_commitdate
Whether CAST(l_shipmode AS CHAR(10)) equals -1313533185
While these two queries are logically equivalent, the original query returns 192 rows, whereas the rewritten query returns 2847 rows, indicating a bug.
To identify the issue, we removed the first CAST by replacing CAST(l_shipmode AS DATE) with l_shipmode. After this change, l_shipmode was implicitly converted to the DATE type, and both queries returned 192 rows, showing that the bug disappeared. Therefore, although implicit type conversion still converts l_shipmode to the DATE type, using explicit type conversion causes the error.
How to repeat:
This link provides the exported sql file for the database:
https://github.com/LLuopeiqi/newtpcd/blob/main/mysql/tpcd.sql
--original
select distinct lineitem.l_receiptdate, lineitem.l_returnflag from lineitem
where ((not (lineitem.l_shipmode not in
(781812915, lineitem.l_commitdate, -1313533185)))) is not false ;
return 192 rows
--rewritten
SELECT distinct l_receiptdate, l_returnflag FROM lineitem
WHERE ( l_shipmode = 781812915 OR
CAST(l_shipmode AS DATE) = l_commitdate OR
CAST(l_shipmode AS char(10)) = -1313533185 ) is not false ;
return 2847 rows