Bug #106717 Incorrect nested loop antijoin when using materialization strategy
Submitted: 13 Mar 2022 3:54 Modified: 14 Mar 2022 10:35
Reporter: Xiu Tang Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:8.0.28, 8.0.27 OS:Any
Assigned to: CPU Architecture:Any
Tags: materialization, nested loop antijoin, semijoin

[13 Mar 2022 3:54] Xiu Tang
Description:
Incorrect nested loop antijoin when using materialization strategy 

How to repeat:
CREATE TABLE `t0` (
  `c0` decimal(10,0) NOT NULL,
  PRIMARY KEY (`c0`),
  UNIQUE KEY `c0` (`c0`)
);
INSERT INTO `t0` VALUES (-538651192),(-207240225),(0),(1968),(1971),(1998),(347101871),(620575215);

MySQL > SELECT t0.c0 AS ref0 FROM t0 WHERE t0.c0 NOT IN (SELECT ALL t0.c0 AS ref1 FROM t0 WHERE (t0.c0 IN (SELECT ALL t0.c0 AS ref2 FROM t0 )) = (t0.c0) );
+------------+
| ref0       |
+------------+
| -538651192 |
| -207240225 |
|          0 |
|       1968 |
|       1971 |
|       1998 |
|  347101871 |
|  620575215 |
+------------+
8 rows in set (0.00 sec)

MySQL > SET optimizer_switch='firstmatch=off';
Query OK, 0 rows affected (0.00 sec)

MySQL > SELECT t0.c0 AS ref0 FROM t0 WHERE t0.c0 NOT IN (SELECT ALL t0.c0 AS ref1 FROM t0 WHERE (t0.c0 IN (SELECT ALL t0.c0 AS ref2 FROM t0 )) = (t0.c0) );
Empty set (0.00 sec)

MySQL > explain format=tree SELECT t0.c0 AS ref0 FROM t0 WHERE t0.c0 NOT IN (SELECT ALL t0.c0 AS ref1 FROM t0 WHERE (t0.c0 IN (SELECT ALL t0.c0 AS ref2 FROM t0 )) = (t0.c0) );
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                                                                                                                                               |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Nested loop antijoin  (cost=8.25 rows=64)
    -> Index scan on t0 using PRIMARY  (cost=1.05 rows=8)
    -> Single-row index lookup on <subquery2> using <auto_distinct_key> (ref1=t0.c0)
        -> Materialize with deduplication  (cost=1.85..1.85 rows=8)
            -> Filter: (t0.c0 is not null)  (cost=1.05 rows=8)
                -> Index scan on t0 using PRIMARY  (cost=1.05 rows=8)
 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

MySQL > SET optimizer_switch='default';
Query OK, 0 rows affected (0.00 sec)

MySQL > explain format=tree SELECT t0.c0 AS ref0 FROM t0 WHERE t0.c0 NOT IN (SELECT ALL t0.c0 AS ref1 FROM t0 WHERE (t0.c0 IN (SELECT ALL t0.c0 AS ref2 FROM t0 )) = (t0.c0) );
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Nested loop antijoin  (cost=3.85 rows=8)
    -> Index scan on t0 using PRIMARY  (cost=1.05 rows=8)
    -> Filter: (<in_optimizer>(t0.c0,<exists>(select #3)) = t0.c0)  (cost=0.26 rows=1)
        -> Single-row covering index lookup on t0 using PRIMARY (c0=t0.c0)  (cost=0.26 rows=1)
        -> Select #3 (subquery in condition; dependent)
            -> Limit: 1 row(s)
                -> Single-row covering index lookup on t0 using PRIMARY (c0=<cache>(t0.c0))
 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
[14 Mar 2022 10:35] MySQL Verification Team
Hello Xiu Tang,

Thank you for the report and test case.

regards,
Umesh