Bug #119599 Under the same SQL, the results of hash join and nested loop join are inconsistent
Submitted: 27 Dec 2:30 Modified: 29 Dec 4:50
Reporter: Bob Wong Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.41 OS:Any
Assigned to: CPU Architecture:Any

[27 Dec 2:30] Bob Wong
Description:
Under the same SQL, the results of hash join and nested loop join are inconsistent

How to repeat:
CREATE TABLE `t0` (
`c0` varchar(452) DEFAULT NULL
) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

CREATE TABLE `t1` (
`c1` varchar(232) DEFAULT NULL
) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

INSERT INTO `t0` VALUES ('0'),('0');
INSERT INTO `t1` (`c1`) VALUES
(''),
('wUa{r2컀l'),
('-1e500'),
('S-5s'),
('TGw'),
(NULL),
('1735306726'),
('0.3795331728873188'),
('8[0]K%'),
('.GFI'),
('.]3Ltj'),
('/ed5'),
('‌*&JJ_W9'),
('oO'),
('0.607538602344961'),
('-310527442'),
('Zig6'),
('O&1xBW'),
(',#繈*N'),
(']Mf収mO}D'),
('.0#L'),
('O㫸'),
('㔏Cv\\Dm96?'),
('3\'-!㙟'),
('BTO'),
('75521938'),
(NULL),
('0.3230858311384408'),
(''),
('1e500 75521938'),
('75521938');
SELECT t0.c0,SUM(t1.c1) FROM  t0 STRAIGHT_JOIN t1 ON true GROUP BY t0.c0;
c0      SUM(t1.c1)
0       0

explain format=tree SELECT t0.c0,SUM(t1.c1) FROM  t0 STRAIGHT_JOIN t1 ON true GROUP BY t0.c0;
EXPLAIN
-> Table scan on <temporary>
    -> Aggregate using temporary table
        -> Inner hash join (no condition)  (cost=6.9 rows=62)
            -> Table scan on t1  (cost=1.68 rows=31)
            -> Hash
                -> Table scan on t0  (cost=0.45 rows=2)

set optimizer_switch='block_nested_loop=off';
SELECT t0.c0,SUM(t1.c1) FROM  t0 STRAIGHT_JOIN t1 ON true GROUP BY t0.c0;
c0      SUM(t1.c1)
0       75521938

explain format=tree SELECT t0.c0,SUM(t1.c1) FROM  t0 STRAIGHT_JOIN t1 ON true GROUP BY t0.c0;
EXPLAIN
-> Table scan on <temporary>
    -> Aggregate using temporary table
        -> Nested loop inner join  (cost=7.15 rows=62)
            -> Table scan on t0  (cost=0.45 rows=2)
            -> Table scan on t1  (cost=1.8 rows=31)

Suggested fix:
The insertion order and retrieval order of different values under the same key in HashJoinBuffer are reversed. It is recommended to optimize it so that the orders are consistent.
[29 Dec 4:50] Chaithra Marsur Gopala Reddy
Hi Bob Wong,

Thank you for the test case. Verified as described.