Description:
When a query is rewritten using ROW_NUMBER() window functions to align two tables into a derived table, the same WHERE condition LEAST(0.6155687134269684, 0.6080817381701805) = c1 (where c1 is BIGINT) that correctly matches a row in the base table fails to match in the rewritten query, causing the aggregate result to change from 1 to NULL.
In the original query, the optimizer converts the floating‑point constant to BIGINT (0.608... → 1) to leverage an index, successfully matching c1=1. In the derived table version, because the column comes from a materialized CTE and cannot use an index, the optimizer may instead convert the BIGINT column to DOUBLE (1 → 1.0) for comparison, which does not equal 0.608....
This demonstrates an inconsistency in the optimizer’s choice of implicit conversion direction depending on the execution path, which is a correctness bug.
Results
Query Actual Result (ref0) Status
Base table (FROM s) 1 ✓
ROW_NUMBER derived table NULL ✗
EXPLAIN Analysis (Derived Table Version – Key Part)
Aggregate: sum(rq.c1)
-> Nested loop inner join
-> Filter: (<cache>(least(0.6155687134269684,0.6080817381701805)) = rq.c1)
-> Index lookup on lq using <auto_key0> (rn = rq.rn)
The original query plan likely includes an Index lookup or Range scan using the index u0(c1, c2) to perform an equality lookup. The optimizer converts the constant 0.6080817381701805 to BIGINT (yielding 1) to match c1=1.
In the rewritten query, after materialization, the c1 column comes from rq. The Filter compares LEAST(...) = rq.c1. Here, the optimizer may convert rq.c1 (BIGINT) to DOUBLE for comparison, resulting in 1 → 1.0, which does not equal 0.608....
How to repeat:
DROP DATABASE IF EXISTS codex_db10;
CREATE DATABASE codex_db10;
USE codex_db10;
SET SESSION sql_mode =
'ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION';
SET SESSION optimizer_switch =
'index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on,use_invisible_indexes=off,skip_scan=on,hash_join=on,subquery_to_derived=off,prefer_ordering_index=on,hypergraph_optimizer=off,derived_condition_pushdown=on,hash_set_operations=on';
SET SESSION big_tables = 0;
SET SESSION internal_tmp_mem_storage_engine = 'TempTable';
CREATE TABLE s(
vp_rowid BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
c0 DECIMAL(65,30) NULL,
c1 BIGINT NULL,
c2 FLOAT NULL,
c3 BIGINT NULL,
UNIQUE KEY u0(c1, c2)
) ENGINE=InnoDB;
INSERT INTO s(c0, c1, c2, c3) VALUES
(NULL, NULL, NULL, NULL),
(0E-30, 1, 0.0, 0),
(1528489958.000000000000000000000000000000, 0, NULL, NULL);
CREATE TABLE l(
vp_rowid BIGINT NOT NULL PRIMARY KEY,
c2 FLOAT NULL
) ENGINE=InnoDB;
CREATE TABLE r(
vp_rowid BIGINT NOT NULL PRIMARY KEY,
c1 BIGINT NULL,
c3 BIGINT NULL,
c0 DECIMAL(65,30) NULL
) ENGINE=InnoDB;
INSERT INTO l(vp_rowid, c2)
SELECT vp_rowid, c2 FROM s;
INSERT INTO r(vp_rowid, c1, c3, c0)
SELECT vp_rowid, c1, c3, c0 FROM s;
-- Original query (correct): returns 1
SELECT SUM(c1) AS ref0
FROM s
WHERE LEAST(0.6155687134269684, 0.6080817381701805) = c1;
-- Rewritten query (incorrect): returns NULL
SELECT SUM(d.c1) AS ref0
FROM (
WITH lq AS (
SELECT ROW_NUMBER() OVER (ORDER BY vp_rowid) AS rn, vp_rowid, c2
FROM l
),
rq AS (
SELECT ROW_NUMBER() OVER (ORDER BY vp_rowid) AS rn, vp_rowid, c1, c3, c0
FROM r
)
SELECT
lq.vp_rowid AS vp_rowid,
rq.c0 AS c0,
rq.c1 AS c1,
lq.c2 AS c2,
rq.c3 AS c3
FROM lq
JOIN rq ON lq.rn = rq.rn
) d
WHERE LEAST(0.6155687134269684, 0.6080817381701805) = d.c1;
Description: When a query is rewritten using ROW_NUMBER() window functions to align two tables into a derived table, the same WHERE condition LEAST(0.6155687134269684, 0.6080817381701805) = c1 (where c1 is BIGINT) that correctly matches a row in the base table fails to match in the rewritten query, causing the aggregate result to change from 1 to NULL. In the original query, the optimizer converts the floating‑point constant to BIGINT (0.608... → 1) to leverage an index, successfully matching c1=1. In the derived table version, because the column comes from a materialized CTE and cannot use an index, the optimizer may instead convert the BIGINT column to DOUBLE (1 → 1.0) for comparison, which does not equal 0.608.... This demonstrates an inconsistency in the optimizer’s choice of implicit conversion direction depending on the execution path, which is a correctness bug. Results Query Actual Result (ref0) Status Base table (FROM s) 1 ✓ ROW_NUMBER derived table NULL ✗ EXPLAIN Analysis (Derived Table Version – Key Part) Aggregate: sum(rq.c1) -> Nested loop inner join -> Filter: (<cache>(least(0.6155687134269684,0.6080817381701805)) = rq.c1) -> Index lookup on lq using <auto_key0> (rn = rq.rn) The original query plan likely includes an Index lookup or Range scan using the index u0(c1, c2) to perform an equality lookup. The optimizer converts the constant 0.6080817381701805 to BIGINT (yielding 1) to match c1=1. In the rewritten query, after materialization, the c1 column comes from rq. The Filter compares LEAST(...) = rq.c1. Here, the optimizer may convert rq.c1 (BIGINT) to DOUBLE for comparison, resulting in 1 → 1.0, which does not equal 0.608.... How to repeat: DROP DATABASE IF EXISTS codex_db10; CREATE DATABASE codex_db10; USE codex_db10; SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION'; SET SESSION optimizer_switch = 'index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on,use_invisible_indexes=off,skip_scan=on,hash_join=on,subquery_to_derived=off,prefer_ordering_index=on,hypergraph_optimizer=off,derived_condition_pushdown=on,hash_set_operations=on'; SET SESSION big_tables = 0; SET SESSION internal_tmp_mem_storage_engine = 'TempTable'; CREATE TABLE s( vp_rowid BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY, c0 DECIMAL(65,30) NULL, c1 BIGINT NULL, c2 FLOAT NULL, c3 BIGINT NULL, UNIQUE KEY u0(c1, c2) ) ENGINE=InnoDB; INSERT INTO s(c0, c1, c2, c3) VALUES (NULL, NULL, NULL, NULL), (0E-30, 1, 0.0, 0), (1528489958.000000000000000000000000000000, 0, NULL, NULL); CREATE TABLE l( vp_rowid BIGINT NOT NULL PRIMARY KEY, c2 FLOAT NULL ) ENGINE=InnoDB; CREATE TABLE r( vp_rowid BIGINT NOT NULL PRIMARY KEY, c1 BIGINT NULL, c3 BIGINT NULL, c0 DECIMAL(65,30) NULL ) ENGINE=InnoDB; INSERT INTO l(vp_rowid, c2) SELECT vp_rowid, c2 FROM s; INSERT INTO r(vp_rowid, c1, c3, c0) SELECT vp_rowid, c1, c3, c0 FROM s; -- Original query (correct): returns 1 SELECT SUM(c1) AS ref0 FROM s WHERE LEAST(0.6155687134269684, 0.6080817381701805) = c1; -- Rewritten query (incorrect): returns NULL SELECT SUM(d.c1) AS ref0 FROM ( WITH lq AS ( SELECT ROW_NUMBER() OVER (ORDER BY vp_rowid) AS rn, vp_rowid, c2 FROM l ), rq AS ( SELECT ROW_NUMBER() OVER (ORDER BY vp_rowid) AS rn, vp_rowid, c1, c3, c0 FROM r ) SELECT lq.vp_rowid AS vp_rowid, rq.c0 AS c0, rq.c1 AS c1, lq.c2 AS c2, rq.c3 AS c3 FROM lq JOIN rq ON lq.rn = rq.rn ) d WHERE LEAST(0.6155687134269684, 0.6080817381701805) = d.c1;