Bug #120737 Inconsistent CAST Handling of Out-of-Range TEXT Values in Single Table vs. View/Join Paths
Submitted: 22 Jun 8:28 Modified: 22 Jun 10:31
Reporter: Annie liu Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S1 (Critical)
Version:8.0.45 OS:Any
Assigned to: CPU Architecture:Any

[22 Jun 8:28] Annie liu
Description:
When performing CAST(column AS DECIMAL(65,30)) on a TEXT column that contains a value exceeding the range of DECIMAL(65,30) (e.g., '1e500'), the behavior differs between a direct table scan and a view/join path.

In the direct table scan, the cast returns an abnormally large value that appears to include data from other rows (suggesting memory corruption), leading to incorrect aggregate results (e.g., COUNT, SUM(DISTINCT ...)).

In the view/join path, the same cast is handled correctly (returning NULL or saturation to the maximum decimal), producing correct aggregates.

Results

| 路径         | ref0 | ref1 | ref2                                                                  |
| :--------- | :--- | :--- | :-------------------------------------------------------------------- |
| **single** | 2    | 0    | `100000000000000000000000001930040182.999999999999999999999999999999` |
| **split**  | 2    | 0    | `99999999999999999999999999999999999.999999999999999999999999999999`  |

EXPLAIN Analysis

Single table plan (incorrect)

-> Group aggregate:
     count(...),
     sum(distinct cast(s.c0 as decimal(65,30)))
    -> Sort: ref1
        -> Table scan on s
Direct table scan; the CAST is executed during aggregation, possibly without proper safety checks, leading to memory corruption.

View/Join plan (correct)

-> Sort with duplicate removal: ref0, ref1, ref2
    -> Stream results
        -> Group aggregate:
             count(tmp_field),
             sum(distinct tmp_field)
            -> Sort: group expression
                -> Stream results
                    -> Inner hash join (r.vp_rowid = l.vp_rowid)
                        -> Table scan on r
                        -> Hash
                            -> Table scan on l
The join introduces temporary fields; data is serialized/materialized, triggering a more robust conversion path with overflow protection.

Root Cause Analysis

Undefined behavior on overflow: According to SQL standard, CAST(string AS DECIMAL(M,D)) should either error, return NULL, or saturate when the input is out of range (depending on sql_mode). However, in the direct‑scan path, the conversion function may not correctly detect overflow and instead returns garbage from memory, possibly reading adjacent data (e.g., the value from the second row '1930040183' appears corrupted into 1930040182).

How to repeat:
DROP DATABASE IF EXISTS repro620_min;
CREATE DATABASE repro620_min;
USE repro620_min;

CREATE TABLE s (
  vp_rowid BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  c0 TEXT NULL,
  c1 TEXT NULL
) ENGINE=InnoDB;

INSERT INTO s(c0, c1) VALUES
  ('1e500', NULL),
  ('1930040183', NULL);

CREATE TABLE l (
  vp_rowid BIGINT NOT NULL PRIMARY KEY,
  c0 TEXT NULL
) ENGINE=InnoDB;

CREATE TABLE r (
  vp_rowid BIGINT NOT NULL PRIMARY KEY,
  c1 TEXT NULL
) ENGINE=InnoDB;

INSERT INTO l SELECT vp_rowid, c0 FROM s;
INSERT INTO r SELECT vp_rowid, c1 FROM s;

ALTER TABLE r
  ADD CONSTRAINT fk FOREIGN KEY (vp_rowid) REFERENCES l(vp_rowid);

CREATE VIEW v AS
SELECT l.vp_rowid AS vp_rowid, l.c0 AS c0, r.c1 AS c1
FROM l JOIN r ON l.vp_rowid = r.vp_rowid;

-- 单表查询(异常结果)
SELECT DISTINCT
  'single' AS q,
  COUNT((GREATEST(s.c0, '')) && (0.980680600517849)) AS ref0,
  (+(((s.c1) IS FALSE) &&
     ((CASE 0.03346139528957415 WHEN s.c0 THEN s.vp_rowid ELSE 1446123744 END)))) AS ref1,
  SUM(DISTINCT CAST(s.c0 AS DECIMAL(65,30))) AS ref2
FROM s
GROUP BY (+(((s.c1) IS FALSE) &&
       ((CASE 0.03346139528957415 WHEN s.c0 THEN s.vp_rowid ELSE 1446123744 END))));

-- 视图/Join 查询(正确结果)
SELECT DISTINCT
  'split' AS q,
  COUNT((GREATEST(v.c0, '')) && (0.980680600517849)) AS ref0,
  (+(((v.c1) IS FALSE) &&
     ((CASE 0.03346139528957415 WHEN v.c0 THEN v.vp_rowid ELSE 1446123744 END)))) AS ref1,
  SUM(DISTINCT CAST(v.c0 AS DECIMAL(65,30))) AS ref2
FROM v
GROUP BY (+(((v.c1) IS FALSE) &&
       ((CASE 0.03346139528957415 WHEN v.c0 THEN v.vp_rowid ELSE 1446123744 END))));
[22 Jun 10:31] Roy Lyseng
Thank you for the bug report.
Verified as described.