Bug #120782 Derived merge: temporal expr ignores NULL-extended LEFT JOIN row
Submitted: 25 Jun 20:24 Modified: 26 Jun 4:47
Reporter: Yakir Gibraltar (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:9.7.0 OS:Linux
Assigned to: CPU Architecture:Any

[25 Jun 20:24] Yakir Gibraltar
Description:
After derived_merge (default ON), a merged derived table on the inner side of a LEFT JOIN that produces no rows is correctly displayed as NULL, but temporal expressions that read that column via Item_ref::val_date_temporal() bypass Item_view_ref::has_null_row() and instead evaluate the underlying derived constant. Temporal comparisons, BETWEEN, IN, LEAST()/GREATEST(), and MIN()/MAX() then treat a hidden TIMESTAMP literal as a real non-NULL value. With derived_merge=off (materialized) the same queries correctly return NULL.

This is a residual case of Bug#38191248 (public Bug #118658), which fixed the ordinary temporal accessors on Item_view_ref (e.g. COALESCE via val_datetime()) but missed val_date_temporal(). COALESCE on the same column is now fixed on 9.7.0; the comparison / IN / BETWEEN / LEAST / GREATEST / MIN / MAX paths are not.

Impact: changes projected values and admits/filters rows for ordinary LEFT JOIN queries -- e.g. "WHERE ttt.b <> t.d", "NOT(ttt.b = t.d)", "ttt.b BETWEEN ...", "ttt.b IN (...)", "WHERE LEAST(ttt.b,t.d) = ...". Temporal MIN()/MAX() return a value while COUNT() over the same column is 0.

Source: Item_view_ref does not override val_date_temporal(); the inherited Item_ref::val_date_temporal() (sql/item.cc) calls ref_item()->val_date_temporal() and copies null_value, skipping has_null_row(). Consumers: get_datetime_value() (sql/item_cmpfunc.cc) for datetime comparison, datetime-mode BETWEEN, and datetime IN; Item_func_min_max::cmp_datetimes() (sql/item_func.cc) for LEAST()/GREATEST(); temporal MIN()/MAX() (sql/item_sum.cc).

How to repeat:
SELECT VERSION();   -- 9.7.0

SET optimizer_switch='derived_merge=on';
SELECT ttt.b,
       LEAST(ttt.b, t.d)    AS least_v,
       GREATEST(ttt.b, t.d) AS greatest_v
FROM (SELECT DATE'2025-01-01' AS d) AS t
LEFT JOIN (SELECT TIMESTAMP'2024-01-03 23:59:59' AS b
           FROM (SELECT 1) AS x WHERE FALSE) AS ttt ON TRUE;
-- 9.7.0 (derived_merge=on): b=NULL, least_v='2024-01-03 23:59:59', greatest_v='2025-01-01 00:00:00'  [WRONG]

SET optimizer_switch='derived_merge=off';
-- same query: b=NULL, least_v=NULL, greatest_v=NULL  [correct]

-- Literal control (correct on both): LEAST(NULL, DATE'2025-01-01') and GREATEST(NULL, DATE'2025-01-01') are NULL.

-- Predicate / row-set impact (derived_merge=on):
SELECT ttt.b = t.d  AS eq,
       ttt.b <> t.d AS ne,
       ttt.b BETWEEN TIMESTAMP'2024-01-01 00:00:00' AND TIMESTAMP'2024-12-31 23:59:59' AS betw,
       ttt.b IN (TIMESTAMP'2024-01-03 23:59:59') AS in_b
FROM (SELECT DATE'2025-01-01' AS d) AS t
LEFT JOIN (SELECT TIMESTAMP'2024-01-03 23:59:59' AS b
           FROM (SELECT 1) AS x WHERE FALSE) AS ttt ON TRUE;
-- 9.7.0 -> 0, 1, 1, 1      correct (and derived_merge=off) -> NULL, NULL, NULL, NULL

-- Aggregate impact:
SELECT MIN(ttt.b) AS min_b, MAX(ttt.b) AS max_b, COUNT(ttt.b) AS cnt
FROM (SELECT 1 AS id, DATE'2025-01-01' AS d) AS t
LEFT JOIN (SELECT TIMESTAMP'2024-01-03 23:59:59' AS b
           FROM (SELECT 1) AS x WHERE FALSE) AS ttt ON TRUE;
-- 9.7.0 -> min_b=max_b='2024-01-03 23:59:59' while cnt=0   correct -> NULL, NULL, 0

Suggested fix:
Override val_date_temporal() (and any other packed-temporal fast-path accessors) on Item_view_ref so it honors has_null_row() and returns NULL for a NULL-extended outer-join row, mirroring the existing Item_view_ref::val_datetime()/val_date()/val_time() fix from Bug#38191248. Alternatively, route get_datetime_value() and the temporal MIN/MAX paths through val_datetime() rather than val_date_temporal() for Item_view_ref.
[26 Jun 4:47] Chaithra Marsur Gopala Reddy
Hi Yakir Gibraltar,

Thank you for the test case. Verified as described. And thank you for the suggestion as well.