Bug #118540 Inconsistent results between NATURAL JOIN and INNER JOIN simulated via LEFT JOIN, RIGHT JOIN, and EXCEPT ALL with NULL v
Submitted: 27 Jun 7:27 Modified: 27 Jun 13:35
Reporter: 策 吕 Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:9.2.0 , 8.0.42 OS:Linux
Assigned to: CPU Architecture:Any

[27 Jun 7:27] 策 吕
Description:
We observed inconsistent results when using the following SQL equivalence transformation, likely caused by how the database handles NULL rows in EXCEPT ALL operations:
-- Original query using NATURAL JOIN (inner join)
SELECT DISTINCT t0.c0 AS ref0 FROM t1 NATURAL JOIN t0;

-- Equivalent query simulating inner join using LEFT JOIN, RIGHT JOIN and EXCEPT ALL
SELECT DISTINCT c0 AS ref0
FROM (
  SELECT *
  FROM (
    SELECT *
    FROM t1 NATURAL LEFT JOIN t0
  ) AS LEFT_ALL
  EXCEPT ALL
  SELECT *
  FROM (
    SELECT *
    FROM (
      SELECT *
      FROM t1 NATURAL LEFT JOIN t0
    ) AS LEFT_ALL_INNER
    EXCEPT ALL
    SELECT *
    FROM (
      SELECT *
      FROM t1 NATURAL RIGHT JOIN t0
    ) AS RIGHT_ALL_INNER
  ) AS LEFT_ONLY
) AS T_INNER;

The test data contains multiple NULLs in the join columns of tables t0 and t1. According to SQL standard, NULL is not equal to any value, including itself, so the inner join does not match NULL = NULL rows and returns an empty set in the first query.

However, due to special handling of NULL rows in EXCEPT ALL, the second query produces results containing NULL rows, causing inconsistent results between the two queries.

We expect the database to correctly handle NULL rows in EXCEPT ALL so that this set-based equivalence expression matches the behavior of the inner join.

How to repeat:
mysql> status;
--------------
mysql  Ver 9.2.0 for Linux on x86_64 (MySQL Community Server - GPL)

Connection id:          1191
Current database:       
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         9.2.0 MySQL Community Server - GPL
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    utf8mb4
Db     characterset:    utf8mb4
Client characterset:    latin1
Conn.  characterset:    latin1
UNIX socket:            /var/run/mysqld/mysqld.sock
Binary data as:         Hexadecimal
Uptime:                 6 hours 53 min 21 sec

SQL Code:
-----------------------------------------------------------------------------
DROP DATABASE IF EXISTS database24;
CREATE DATABASE database24;
USE database24;
CREATE TABLE IF NOT EXISTS t0(c0 DECIMAL ZEROFILL ) ;
CREATE TABLE IF NOT EXISTS t1(c0 MEDIUMTEXT  NULL COMMENT 'asdf'  COLUMN_FORMAT DYNAMIC) ;
DELETE LOW_PRIORITY QUICK IGNORE FROM t1 WHERE t1.c0;
INSERT INTO t1(c0) VALUES(-1919479106);
REPLACE LOW_PRIORITY INTO t1(c0) VALUES(NULL);
CHECK TABLE t1 FOR UPGRADE MEDIUM QUICK EXTENDED FAST CHANGED;
UPDATE t1 SET c0=-1919479106;
UPDATE t1 SET c0=DEFAULT;
CREATE UNIQUE INDEX i0 USING BTREE ON t0((IF(CAST(-729819143 AS SIGNED), t0.c0, (0.7178347989350793) LIKE (0.3538771914017127))));
ALTER TABLE t0 PACK_KEYS 1, COMPRESSION 'NONE', RENAME t0, DELAY_KEY_WRITE 1, CHECKSUM 1, STATS_PERSISTENT 0, ROW_FORMAT FIXED, STATS_AUTO_RECALC DEFAULT, FORCE, INSERT_METHOD LAST;

ALTER TABLE t1 STATS_PERSISTENT 0, STATS_AUTO_RECALC DEFAULT;
DELETE FROM t0;
select TABLE_NAME, ENGINE from information_schema.TABLES where table_schema = 'database24';
UPDATE t1 SET c0=CAST((t1.c0) && (t1.c0) AS SIGNED) WHERE t1.c0;
DELETE FROM t0 WHERE t0.c0;
UPDATE t1 SET c0=DEFAULT;

REPLACE DELAYED INTO t0(c0) VALUES(NULL);
SET SESSION rbr_exec_mode = IDEMPOTENT;
UPDATE t0 SET c0=(- ( EXISTS (SELECT 1))) WHERE ( EXISTS (SELECT 1)) IS FALSE;
ANALYZE  TABLE t0, t1;

REPLACE INTO t0(c0) VALUES("0.7178347989350793"), (NULL), (NULL);

# firstQuery
SELECT DISTINCT t0.c0 AS ref0 FROM t1 NATURAL JOIN t0;

#secondQuery
SELECT DISTINCT c0 AS ref0
FROM (
  SELECT *
    FROM (
        SELECT *
        FROM t1 NATURAL LEFT JOIN t0
    ) AS LEFT_ALL
  EXCEPT ALL
  SELECT *
    FROM (
        SELECT *
        FROM (
        SELECT *
        FROM t1 NATURAL LEFT JOIN t0
        ) AS LEFT_ALL_INNER
    EXCEPT ALL
    SELECT *
    FROM (
        SELECT *
        FROM t1 NATURAL RIGHT JOIN t0
        ) AS RIGHT_ALL_INNER
    ) AS LEFT_ONLY
) AS T_INNER
[27 Jun 7:46] 策 吕
In purely mathematical set theory, an element either belongs to a set or it does not.There are no "unknown" states.
In this system, A ∩ B and A - (A - B) are completely equivalent.
[27 Jun 13:35] MySQL Verification Team
Hello 策 吕,

Thank you for the report and feedback.
Verified as described.

regards,
Umesh