MySQL Bugs: #120107: DISTINCT eliminates non-duplicate rows for JSON

Bug #120107	DISTINCT eliminates non-duplicate rows for JSON_OBJECTAGG results
Submitted:	19 Mar 8:40	Modified:	19 Mar 13:33
Reporter:	Wang Ojiken	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: JSON	Severity:	S3 (Non-critical)
Version:	8.0.45	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
When using JSON_OBJECTAGG together with GROUP BY on a nullable column, adding DISTINCT to the outer SELECT incorrectly removes non-duplicate rows.

Specifically, the query:

SELECT DISTINCT JSON_OBJECTAGG('key', t1.c0)
FROM t1
GROUP BY t1.c0;

returns fewer rows than expected, even though the aggregated JSON results are semantically distinct.

This indicates that MySQL incorrectly treats distinct JSON objects as duplicates during DISTINCT processing, likely due to improper comparison or internal normalization of JSON values.

How to repeat:
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (
    c0 DECIMAL
);

INSERT INTO t1 (c0) VALUES (0);
INSERT INTO t1 (c0) VALUES (NULL);

-- Query 1 (baseline)
SELECT JSON_OBJECTAGG('key', t1.c0)
FROM t1
GROUP BY t1.c0;

-- Query 2 (with DISTINCT)
SELECT DISTINCT JSON_OBJECTAGG('key', t1.c0)
FROM t1
GROUP BY t1.c0;

Expected result:

Query 1:
{"key": null}
{"key": 0}

Query 2 (DISTINCT should not remove any rows):
{"key": null}
{"key": 0}

Actual result:
Query 1:
{"key": null}
{"key": 0}
Query 2:
{"key": null}

Suggested fix:
Ensure that DISTINCT comparison over JSON values correctly distinguishes semantically different JSON objects, especially when values differ (e.g., null vs 0).

The issue may stem from:
	•	Improper JSON comparison logic
	•	Incorrect hashing or canonicalization of JSON values
	•	Or unintended interaction between DISTINCT and GROUP BY during optimization

A fix should ensure that JSON values are compared based on their full semantic content rather than simplified or lossy internal representations.

Thank you for the bug report.
Verified as described.