Description:
These two queries are logically equivalent because they both implement the same filtering condition: comparing l_commitdate with the value of l_shipmode while ensuring that NULL handling is explicitly defined.
In the original query, COALESCE(null, lineitem.l_shipmode) evaluates to lineitem.l_shipmode because COALESCE returns the first non-NULL value. Since the first argument is NULL, the result is effectively equivalent to lineitem.l_shipmode.
In the rewritten query, the CASE WHEN statement explicitly handles NULL by checking whether NULL is not NULL (always false) and returning CAST(l_shipmode AS char(10)) in the ELSE branch. This result is also equivalent to lineitem.l_shipmode.
Both queries evaluate the condition l_commitdate > l_shipmode, with identical logic for handling NULL values. But they return different results, indicating the presence of a bug.
How to repeat:
This link provides the exported sql file for the database:
https://github.com/LLuopeiqi/newtpcd/blob/main/mysql/tpcd.sql
--original
select distinct lineitem.l_shipinstruct, lineitem.l_linestatus from lineitem
where (lineitem.l_commitdate) > (coalesce(null, lineitem.l_shipmode)) ;
return 10 rows
--rewritten
SELECT distinct l_shipinstruct , l_linestatus FROM lineitem
WHERE l_commitdate > CASE WHEN NULL IS NOT NULL
THEN NULL
ELSE CAST(l_shipmode AS char(10)) END ;
return 0 row