Description:
When performing CAST(column AS DECIMAL(65,30)) on a TEXT column that contains a value exceeding the range of DECIMAL(65,30) (e.g., '1e500'), the behavior differs between a direct table scan and a view/join path.
In the direct table scan, the cast returns an abnormally large value that appears to include data from other rows (suggesting memory corruption), leading to incorrect aggregate results (e.g., COUNT, SUM(DISTINCT ...)).
In the view/join path, the same cast is handled correctly (returning NULL or saturation to the maximum decimal), producing correct aggregates.
Results
| 路径 | ref0 | ref1 | ref2 |
| :--------- | :--- | :--- | :-------------------------------------------------------------------- |
| **single** | 2 | 0 | `100000000000000000000000001930040182.999999999999999999999999999999` |
| **split** | 2 | 0 | `99999999999999999999999999999999999.999999999999999999999999999999` |
EXPLAIN Analysis
Single table plan (incorrect)
-> Group aggregate:
count(...),
sum(distinct cast(s.c0 as decimal(65,30)))
-> Sort: ref1
-> Table scan on s
Direct table scan; the CAST is executed during aggregation, possibly without proper safety checks, leading to memory corruption.
View/Join plan (correct)
-> Sort with duplicate removal: ref0, ref1, ref2
-> Stream results
-> Group aggregate:
count(tmp_field),
sum(distinct tmp_field)
-> Sort: group expression
-> Stream results
-> Inner hash join (r.vp_rowid = l.vp_rowid)
-> Table scan on r
-> Hash
-> Table scan on l
The join introduces temporary fields; data is serialized/materialized, triggering a more robust conversion path with overflow protection.
Root Cause Analysis
Undefined behavior on overflow: According to SQL standard, CAST(string AS DECIMAL(M,D)) should either error, return NULL, or saturate when the input is out of range (depending on sql_mode). However, in the direct‑scan path, the conversion function may not correctly detect overflow and instead returns garbage from memory, possibly reading adjacent data (e.g., the value from the second row '1930040183' appears corrupted into 1930040182).
How to repeat:
DROP DATABASE IF EXISTS repro620_min;
CREATE DATABASE repro620_min;
USE repro620_min;
CREATE TABLE s (
vp_rowid BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
c0 TEXT NULL,
c1 TEXT NULL
) ENGINE=InnoDB;
INSERT INTO s(c0, c1) VALUES
('1e500', NULL),
('1930040183', NULL);
CREATE TABLE l (
vp_rowid BIGINT NOT NULL PRIMARY KEY,
c0 TEXT NULL
) ENGINE=InnoDB;
CREATE TABLE r (
vp_rowid BIGINT NOT NULL PRIMARY KEY,
c1 TEXT NULL
) ENGINE=InnoDB;
INSERT INTO l SELECT vp_rowid, c0 FROM s;
INSERT INTO r SELECT vp_rowid, c1 FROM s;
ALTER TABLE r
ADD CONSTRAINT fk FOREIGN KEY (vp_rowid) REFERENCES l(vp_rowid);
CREATE VIEW v AS
SELECT l.vp_rowid AS vp_rowid, l.c0 AS c0, r.c1 AS c1
FROM l JOIN r ON l.vp_rowid = r.vp_rowid;
-- 单表查询(异常结果)
SELECT DISTINCT
'single' AS q,
COUNT((GREATEST(s.c0, '')) && (0.980680600517849)) AS ref0,
(+(((s.c1) IS FALSE) &&
((CASE 0.03346139528957415 WHEN s.c0 THEN s.vp_rowid ELSE 1446123744 END)))) AS ref1,
SUM(DISTINCT CAST(s.c0 AS DECIMAL(65,30))) AS ref2
FROM s
GROUP BY (+(((s.c1) IS FALSE) &&
((CASE 0.03346139528957415 WHEN s.c0 THEN s.vp_rowid ELSE 1446123744 END))));
-- 视图/Join 查询(正确结果)
SELECT DISTINCT
'split' AS q,
COUNT((GREATEST(v.c0, '')) && (0.980680600517849)) AS ref0,
(+(((v.c1) IS FALSE) &&
((CASE 0.03346139528957415 WHEN v.c0 THEN v.vp_rowid ELSE 1446123744 END)))) AS ref1,
SUM(DISTINCT CAST(v.c0 AS DECIMAL(65,30))) AS ref2
FROM v
GROUP BY (+(((v.c1) IS FALSE) &&
((CASE 0.03346139528957415 WHEN v.c0 THEN v.vp_rowid ELSE 1446123744 END))));
Description: When performing CAST(column AS DECIMAL(65,30)) on a TEXT column that contains a value exceeding the range of DECIMAL(65,30) (e.g., '1e500'), the behavior differs between a direct table scan and a view/join path. In the direct table scan, the cast returns an abnormally large value that appears to include data from other rows (suggesting memory corruption), leading to incorrect aggregate results (e.g., COUNT, SUM(DISTINCT ...)). In the view/join path, the same cast is handled correctly (returning NULL or saturation to the maximum decimal), producing correct aggregates. Results | 路径 | ref0 | ref1 | ref2 | | :--------- | :--- | :--- | :-------------------------------------------------------------------- | | **single** | 2 | 0 | `100000000000000000000000001930040182.999999999999999999999999999999` | | **split** | 2 | 0 | `99999999999999999999999999999999999.999999999999999999999999999999` | EXPLAIN Analysis Single table plan (incorrect) -> Group aggregate: count(...), sum(distinct cast(s.c0 as decimal(65,30))) -> Sort: ref1 -> Table scan on s Direct table scan; the CAST is executed during aggregation, possibly without proper safety checks, leading to memory corruption. View/Join plan (correct) -> Sort with duplicate removal: ref0, ref1, ref2 -> Stream results -> Group aggregate: count(tmp_field), sum(distinct tmp_field) -> Sort: group expression -> Stream results -> Inner hash join (r.vp_rowid = l.vp_rowid) -> Table scan on r -> Hash -> Table scan on l The join introduces temporary fields; data is serialized/materialized, triggering a more robust conversion path with overflow protection. Root Cause Analysis Undefined behavior on overflow: According to SQL standard, CAST(string AS DECIMAL(M,D)) should either error, return NULL, or saturate when the input is out of range (depending on sql_mode). However, in the direct‑scan path, the conversion function may not correctly detect overflow and instead returns garbage from memory, possibly reading adjacent data (e.g., the value from the second row '1930040183' appears corrupted into 1930040182). How to repeat: DROP DATABASE IF EXISTS repro620_min; CREATE DATABASE repro620_min; USE repro620_min; CREATE TABLE s ( vp_rowid BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY, c0 TEXT NULL, c1 TEXT NULL ) ENGINE=InnoDB; INSERT INTO s(c0, c1) VALUES ('1e500', NULL), ('1930040183', NULL); CREATE TABLE l ( vp_rowid BIGINT NOT NULL PRIMARY KEY, c0 TEXT NULL ) ENGINE=InnoDB; CREATE TABLE r ( vp_rowid BIGINT NOT NULL PRIMARY KEY, c1 TEXT NULL ) ENGINE=InnoDB; INSERT INTO l SELECT vp_rowid, c0 FROM s; INSERT INTO r SELECT vp_rowid, c1 FROM s; ALTER TABLE r ADD CONSTRAINT fk FOREIGN KEY (vp_rowid) REFERENCES l(vp_rowid); CREATE VIEW v AS SELECT l.vp_rowid AS vp_rowid, l.c0 AS c0, r.c1 AS c1 FROM l JOIN r ON l.vp_rowid = r.vp_rowid; -- 单表查询(异常结果) SELECT DISTINCT 'single' AS q, COUNT((GREATEST(s.c0, '')) && (0.980680600517849)) AS ref0, (+(((s.c1) IS FALSE) && ((CASE 0.03346139528957415 WHEN s.c0 THEN s.vp_rowid ELSE 1446123744 END)))) AS ref1, SUM(DISTINCT CAST(s.c0 AS DECIMAL(65,30))) AS ref2 FROM s GROUP BY (+(((s.c1) IS FALSE) && ((CASE 0.03346139528957415 WHEN s.c0 THEN s.vp_rowid ELSE 1446123744 END)))); -- 视图/Join 查询(正确结果) SELECT DISTINCT 'split' AS q, COUNT((GREATEST(v.c0, '')) && (0.980680600517849)) AS ref0, (+(((v.c1) IS FALSE) && ((CASE 0.03346139528957415 WHEN v.c0 THEN v.vp_rowid ELSE 1446123744 END)))) AS ref1, SUM(DISTINCT CAST(v.c0 AS DECIMAL(65,30))) AS ref2 FROM v GROUP BY (+(((v.c1) IS FALSE) && ((CASE 0.03346139528957415 WHEN v.c0 THEN v.vp_rowid ELSE 1446123744 END))));