Bug #120121 Phantom Drift: Wrong Result in INTERSECT with NULL caused by Prefix Primary Key and Cartesian Join
Submitted: 20 Mar 4:51 Modified: 20 Mar 9:41
Reporter: Seren Zhou Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S1 (Critical)
Version:9.5.0 OS:Any
Assigned to: CPU Architecture:Any

[20 Mar 4:51] Seren Zhou
Description:
A severe memory misalignment or deduplication hashing flaw occurs when evaluating an INTERSECT operator with a NULL literal. The query incorrectly returns a phantom string from the left operand instead of an empty set.

Crucially, the specific string returned (the phantom data) drifts and changes depending on the underlying physical join algorithm (Hash Join vs. Nested Loop Join).

Extensive differential testing isolates the trigger to a specific combination of schema and logical factors:

The target column must have a Prefix Primary Key (e.g., PRIMARY KEY (c2(3))). Removing this prefix constraint completely mitigates the bug.

The left operand of the INTERSECT must result from a cartesian product (join without condition) involving a derived table that uses GROUP BY.

How to repeat:
-- 1. Setup Schema and Data
DROP TABLE IF EXISTS t62, t63, ta1, temp;

CREATE TABLE t62 (
    c1 MEDIUMBLOB,
    c2 MEDIUMTEXT,
    PRIMARY KEY (c2(3)) -- The critical trigger constraint
);

CREATE TABLE t63 (
    c1 DECIMAL(20,0)
);

INSERT INTO t63 (c1) VALUES (-56850516), (-423045439110148), (243963924330621);
INSERT INTO t62 (c1,c2) VALUES
    ('gZRwb3Gff017Aty','Y9QdrTnb1qc9HWtgp4wBDq9'),
    ('r6m8ihHPU9xlDaBEdOeF0kPVxR','7DCLHupgThhGY'),
    ('MAfQNO1M','REF2');

-- 2. Trigger the Bug (Default: Hash Join)
SELECT ca1 FROM (
    SELECT ta1.ca2 AS ca1 FROM t63 
    JOIN (SELECT c2 AS ca2 FROM t62 WHERE !(c1 IS NULL) GROUP BY c2) AS ta1
) AS derived_out
INTERSECT 
SELECT NULL;

-- Expected: Empty set
-- Actual: Returns 1 row containing '7DCLHupgThhGY' (False Positive)

-- 3. Trigger the Phantom Drift (Nested Loop Join)
SET SESSION optimizer_switch='block_nested_loop=off';

SELECT ca1 FROM (
    SELECT ta1.ca2 AS ca1 FROM t63 
    JOIN (SELECT c2 AS ca2 FROM t62 WHERE !(c1 IS NULL) GROUP BY c2) AS ta1
) AS derived_out
INTERSECT 
SELECT NULL;

-- Actual: Returns 1 row, but the phantom data drifted to 'Y9QdrTnb1qc9HWtgp4wBDq9'
[20 Mar 9:41] Roy Lyseng
Thank you for the bug report.
This is most likely a duplicate of bug#117911.
A fix for this bug will be delivered in release 9.7.