Bug #118541 Inconsistent Results Between NATURAL JOIN and its Equivalent Form Using EXCEPT ALL with Implicit Type Casting
Submitted: 27 Jun 8:31 Modified: 27 Jun 13:35
Reporter: 策 吕 Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:9.2.0 , 8.0.42 OS:Linux
Assigned to: CPU Architecture:Any

[27 Jun 8:31] 策 吕
Description:
We have identified a severe inconsistency in query results between a NATURAL JOIN and its logically equivalent representation using EXCEPT ALL when implicit type casting occurs during the join. This suggests a potential bug in the query execution engine, particularly in how set operators (EXCEPT ALL) handle derived values from joins.

Steps to Reproduce:
    1. Create tables with different data types for the join key:
-- Table t0 with a VARCHAR column
CREATE TABLE t0 (c0 VARCHAR(500) NOT NULL);
INSERT INTO t0 VALUES ('0.6952750358145634'), ('xgC1');

-- Table t1 with a DOUBLE column
CREATE TABLE t1 (c0 DOUBLE NOT NULL PRIMARY KEY);
INSERT INTO t1 VALUES (0);

    2. Execute Query 1 (Standard NATURAL JOIN):
SELECT DISTINCTROW t1.c0 AS ref0, t0.c0 AS ref1 
FROM t0 NATURAL JOIN t1 
GROUP BY t1.c0, t0.c0;

    3. Execute Query 2 (Logically equivalent form using EXCEPT ALL):
SELECT DISTINCTROW c0 AS ref0, c0 AS ref1 
FROM (
    SELECT * FROM (
        SELECT * FROM t0 NATURAL LEFT JOIN t1
    ) AS LEFT_ALL
    EXCEPT ALL
    SELECT * FROM (
        (SELECT * FROM t0 NATURAL LEFT JOIN t1) AS LEFT_ALL_INNER
        EXCEPT ALL
        (SELECT * FROM t0 NATURAL RIGHT JOIN t1) AS RIGHT_ALL_INNER
    ) AS LEFT_ONLY
) AS T_INNER
GROUP BY c0, c0;

Query 1 Output:
+------+------+
| ref0 | ref1 |
+------+------+
|    0 | xgC1 |
+------+------+
1 row in set

Query 2 Output:
Empty set

The two queries should be logically equivalent. The structure A - (A - B) is a standard relational algebra method to compute the intersection A ∩ B. While we acknowledge MySQL's special behavior in SELECT after a JOIN (as seen in Query 1), the EXCEPT ALL structure (Query 2) should also correctly simulate the INNER JOIN and produce a consistent result set.
Based on a step-by-step evaluation, Query 2 is expected to produce one row: (0.0, 0.0). The fact that it returns an empty set indicates a likely flaw in the execution of EXCEPT ALL when operating on derived, mixed-type data from preceding joins.
Regardless of the specific output, the two queries, which aim to compute the same logical join, should not produce such dramatically different results (one row vs. empty set).

How to repeat:
mysql> status;
--------------
mysql  Ver 9.2.0 for Linux on x86_64 (MySQL Community Server - GPL)

Connection id:          1191
Current database:       
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         9.2.0 MySQL Community Server - GPL
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    utf8mb4
Db     characterset:    utf8mb4
Client characterset:    latin1
Conn.  characterset:    latin1
UNIX socket:            /var/run/mysqld/mysqld.sock
Binary data as:         Hexadecimal
Uptime:                 6 hours 53 min 21 sec

SQLcode:
------------------------------------------------------------------------------
DROP DATABASE IF EXISTS database98;
CREATE DATABASE database98;
USE database98;
CREATE TABLE IF NOT EXISTS t0(c0 VARCHAR(500)  STORAGE DISK NOT NULL) ;
CREATE TABLE t1(c0 DOUBLE ZEROFILL  COLUMN_FORMAT DEFAULT COMMENT 'asdf'  UNIQUE KEY PRIMARY KEY STORAGE DISK) PACK_KEYS = DEFAULT, ENGINE = HEAP;
DELETE LOW_PRIORITY IGNORE FROM t1;
SET SESSION read_rnd_buffer_size = 1201856744;
SET SESSION preload_buffer_size = 604501183;
CHECKSUM TABLE t0;
INSERT IGNORE INTO t1(c0) VALUES(']M鯢b');
CREATE UNIQUE INDEX i0 ON t1(c0) VISIBLE;

REPLACE DELAYED INTO t0(c0) VALUES(0.6952750358145634);
INSERT DELAYED IGNORE INTO t0(c0) VALUES("xgC1");
SET SESSION join_buffer_size = 4734638815441487969;

select * from t0;
select * from t1;

SELECT DISTINCTROW t1.c0 AS ref0, t0.c0 AS ref1 FROM t0 NATURAL JOIN t1 GROUP BY t1.c0, t0.c0;
SELECT DISTINCTROW c0 AS ref0, c0 AS ref1 FROM ( SELECT * FROM (SELECT * FROM t0 NATURAL LEFT JOIN t1 ) AS LEFT_ALL EXCEPT ALL SELECT * FROM ( SELECT * FROM (SELECT * FROM t0 NATURAL LEFT JOIN t1 ) AS LEFT_ALL_INNER EXCEPT ALL SELECT * FROM (SELECT * FROM t0 NATURAL RIGHT JOIN t1 ) AS RIGHT_ALL_INNER ) AS LEFT_ONLY ) AS T_INNER GROUP BY c0, c0;
[27 Jun 13:35] MySQL Verification Team
Hello 策 吕,

Thank you for the report and feedback.
Verified as described.

regards,
Umesh