Description:
Summary
MySQL returns different results for two logically equivalent queries. The original grouped query returns one row (NULL), while the transformed query—obtained by relocating the HAVING predicate into the projection list of a derived table and filtering with WHERE ref1—returns an empty set. This appears to be a wrong-result bug caused by incorrect evaluation or rewriting of the relocated aggregate predicate.
Actual Result
Query 1 returns one row: NULL
Query 2 returns: Empty Set
Expected Result
Both queries should return the same result (NULL). The second query is only a structural rewriting of the first one: the original HAVING predicate is projected as a derived boolean column ref1, and the outer query filters with WHERE ref1. Since HAVING is the standard way to filter grouped rows by aggregate conditions in MySQL, relocating that condition into a derived table should preserve semantics.
Why this looks wrong
MAX([DISTINCT] expr) is documented to produce the same result with or without DISTINCT, so the aggregate predicate itself is unchanged by this rewriting.
Also, MySQL comparison expressions yield 1, 0, or NULL, and XOR is a defined operator in MySQL expressions, so the predicate in both queries is a normal boolean-valued expression under MySQL semantics.
Therefore, the following two forms should be equivalent:
... GROUP BY g HAVING P
and
SELECT * FROM (
SELECT ..., P AS ref1
FROM ...
GROUP BY g
) AS s
WHERE ref1
However, MySQL returns different results for these two forms, which indicates a wrong-result bug.
Execution Plans
Query 1:
EXPLAIN SELECT DISTINCT CONCAT(t1.c0) FROM t1
WHERE (- (((t1.c1) XOR (0.5609188265100435)) >= ((0.25130445811584556) != (1342228223))))
GROUP BY t1.c0
HAVING (MAX(DISTINCT t1.c1)) != (-1599144071);
Plan:
-> Sort with duplicate removal: `CONCAT(t1.c0)`
-> Filter: (max(t1.c1) <> <cache>(-(1599144071)))
-> Table scan on <temporary>
-> Aggregate using temporary table
-> Filter: (0 <> -((((0 <> t1.c1) xor <cache>((0 <> 0.5609188265100435))) >= <cache>((0.25130445811584556 <> 1342228223))))) (cost=0.65 rows=4)
-> Covering index scan on t1 using i1 (cost=0.65 rows=4)
Query 2:
EXPLAIN SELECT DISTINCT ref0
FROM (
SELECT CONCAT(t1.c0) AS ref0,
((MAX(DISTINCT t1.c1)) != (-1599144071)) AS ref1
FROM t1
WHERE (- (((t1.c1) XOR (0.5609188265100435)) >= ((0.25130445811584556) != (1342228223))))
GROUP BY t1.c0
) AS s
WHERE ref1;
Plan:
-> Table scan on <temporary> (cost=5.46..6.72 rows=2)
-> Temporary table with deduplication (cost=4.2..4.2 rows=2)
-> Table scan on s (cost=2.47..3.74 rows=2)
-> Materialize (cost=1.21..1.21 rows=2)
-> Filter: (0 <> (max(t1.c1) <> <cache>(-(1599144071)))) (cost=0.75 rows=2)
-> Filter: (0 <> -((((0 <> t1.c1) xor <cache>((0 <> 0.5609188265100435))) >= <cache>((0.25130445811584556 <> 1342228223))))) (cost=0.75 rows=2)
-> Covering index skip scan for grouping on t1 using i1 (cost=0.75 rows=2)
How to repeat:
CREATE TABLE t1(c0 DECIMAL , c1 FLOAT PRIMARY KEY UNIQUE KEY) ;
INSERT INTO t1(c0, c1) VALUES(NULL, 0.41472381328161134);
REPLACE INTO t1(c0, c1) VALUES(-1.689413262E9, 1037036926);
INSERT IGNORE INTO t1(c0, c1) VALUES(-547152157, 0.8763334995483548);
UPDATE t1 SET c0=DEFAULT;
UPDATE t1 SET c1=NULL WHERE (- (((-482097596) IN (NULL, '', -1312523974)) AND (NULL)));
INSERT IGNORE INTO t1(c0) VALUES(NULL), (NULL), ('-1123800899');
CREATE INDEX i1 USING BTREE ON t1(c0, c1 DESC) ALGORITHM INPLACE;
-- cardinality: 1: NULL
SELECT DISTINCT CONCAT(t1.c0) FROM t1 WHERE (- (((t1.c1) XOR (0.5609188265100435)) >= ((0.25130445811584556) != (1342228223)))) GROUP BY t1.c0 HAVING (MAX(DISTINCT t1.c1)) != (-1599144071);
-- cardinality: 0
SELECT DISTINCT ref0 FROM (SELECT CONCAT(t1.c0) AS ref0, ((MAX(DISTINCT t1.c1)) != (-1599144071)) AS ref1 FROM t1 WHERE (- (((t1.c1) XOR (0.5609188265100435)) >= ((0.25130445811584556) != (1342228223)))) GROUP BY t1.c0) AS s WHERE ref1;
Suggested fix:
Possible Cause
The issue seems related to optimizer rewriting of the relocated aggregate predicate after derived-table materialization. In the original query, the aggregate predicate remains a post-group filter. In the rewritten query, WHERE ref1 is transformed into a truth test over the derived boolean expression, and this transformation appears to change the result incorrectly. MySQL has had optimizer issues around XOR-related rewrites before, so this may be another wrong-result case triggered by expression simplification plus grouping/index access.
Description: Summary MySQL returns different results for two logically equivalent queries. The original grouped query returns one row (NULL), while the transformed query—obtained by relocating the HAVING predicate into the projection list of a derived table and filtering with WHERE ref1—returns an empty set. This appears to be a wrong-result bug caused by incorrect evaluation or rewriting of the relocated aggregate predicate. Actual Result Query 1 returns one row: NULL Query 2 returns: Empty Set Expected Result Both queries should return the same result (NULL). The second query is only a structural rewriting of the first one: the original HAVING predicate is projected as a derived boolean column ref1, and the outer query filters with WHERE ref1. Since HAVING is the standard way to filter grouped rows by aggregate conditions in MySQL, relocating that condition into a derived table should preserve semantics. Why this looks wrong MAX([DISTINCT] expr) is documented to produce the same result with or without DISTINCT, so the aggregate predicate itself is unchanged by this rewriting. Also, MySQL comparison expressions yield 1, 0, or NULL, and XOR is a defined operator in MySQL expressions, so the predicate in both queries is a normal boolean-valued expression under MySQL semantics. Therefore, the following two forms should be equivalent: ... GROUP BY g HAVING P and SELECT * FROM ( SELECT ..., P AS ref1 FROM ... GROUP BY g ) AS s WHERE ref1 However, MySQL returns different results for these two forms, which indicates a wrong-result bug. Execution Plans Query 1: EXPLAIN SELECT DISTINCT CONCAT(t1.c0) FROM t1 WHERE (- (((t1.c1) XOR (0.5609188265100435)) >= ((0.25130445811584556) != (1342228223)))) GROUP BY t1.c0 HAVING (MAX(DISTINCT t1.c1)) != (-1599144071); Plan: -> Sort with duplicate removal: `CONCAT(t1.c0)` -> Filter: (max(t1.c1) <> <cache>(-(1599144071))) -> Table scan on <temporary> -> Aggregate using temporary table -> Filter: (0 <> -((((0 <> t1.c1) xor <cache>((0 <> 0.5609188265100435))) >= <cache>((0.25130445811584556 <> 1342228223))))) (cost=0.65 rows=4) -> Covering index scan on t1 using i1 (cost=0.65 rows=4) Query 2: EXPLAIN SELECT DISTINCT ref0 FROM ( SELECT CONCAT(t1.c0) AS ref0, ((MAX(DISTINCT t1.c1)) != (-1599144071)) AS ref1 FROM t1 WHERE (- (((t1.c1) XOR (0.5609188265100435)) >= ((0.25130445811584556) != (1342228223)))) GROUP BY t1.c0 ) AS s WHERE ref1; Plan: -> Table scan on <temporary> (cost=5.46..6.72 rows=2) -> Temporary table with deduplication (cost=4.2..4.2 rows=2) -> Table scan on s (cost=2.47..3.74 rows=2) -> Materialize (cost=1.21..1.21 rows=2) -> Filter: (0 <> (max(t1.c1) <> <cache>(-(1599144071)))) (cost=0.75 rows=2) -> Filter: (0 <> -((((0 <> t1.c1) xor <cache>((0 <> 0.5609188265100435))) >= <cache>((0.25130445811584556 <> 1342228223))))) (cost=0.75 rows=2) -> Covering index skip scan for grouping on t1 using i1 (cost=0.75 rows=2) How to repeat: CREATE TABLE t1(c0 DECIMAL , c1 FLOAT PRIMARY KEY UNIQUE KEY) ; INSERT INTO t1(c0, c1) VALUES(NULL, 0.41472381328161134); REPLACE INTO t1(c0, c1) VALUES(-1.689413262E9, 1037036926); INSERT IGNORE INTO t1(c0, c1) VALUES(-547152157, 0.8763334995483548); UPDATE t1 SET c0=DEFAULT; UPDATE t1 SET c1=NULL WHERE (- (((-482097596) IN (NULL, '', -1312523974)) AND (NULL))); INSERT IGNORE INTO t1(c0) VALUES(NULL), (NULL), ('-1123800899'); CREATE INDEX i1 USING BTREE ON t1(c0, c1 DESC) ALGORITHM INPLACE; -- cardinality: 1: NULL SELECT DISTINCT CONCAT(t1.c0) FROM t1 WHERE (- (((t1.c1) XOR (0.5609188265100435)) >= ((0.25130445811584556) != (1342228223)))) GROUP BY t1.c0 HAVING (MAX(DISTINCT t1.c1)) != (-1599144071); -- cardinality: 0 SELECT DISTINCT ref0 FROM (SELECT CONCAT(t1.c0) AS ref0, ((MAX(DISTINCT t1.c1)) != (-1599144071)) AS ref1 FROM t1 WHERE (- (((t1.c1) XOR (0.5609188265100435)) >= ((0.25130445811584556) != (1342228223)))) GROUP BY t1.c0) AS s WHERE ref1; Suggested fix: Possible Cause The issue seems related to optimizer rewriting of the relocated aggregate predicate after derived-table materialization. In the original query, the aggregate predicate remains a post-group filter. In the rewritten query, WHERE ref1 is transformed into a truth test over the derived boolean expression, and this transformation appears to change the result incorrectly. MySQL has had optimizer issues around XOR-related rewrites before, so this may be another wrong-result case triggered by expression simplification plus grouping/index access.