Bug #113059 Inconsistent results when using DUPSWEEDOUT
Submitted: 13 Nov 2023 8:29 Modified: 13 Nov 2023 9:49
Reporter: Wang Ke Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:8.2.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: Incorrect Results

[13 Nov 2023 8:29] Wang Ke
Description:
Hello, MySQL Verification Team, I found a test case which got inconsistent results when switching SIMIJOIN hints:

```
mysql> CREATE TABLE t0 ( c0 SMALLINT ) ;
Query OK, 0 rows affected (0.15 sec)

mysql> INSERT INTO t0 VALUES (NULL), ( 1 ) ;
Query OK, 2 rows affected (0.03 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM ( SELECT 'x' ca2 ) AS ra0 LEFT JOIN t0 ra1 ON ra0.ca2 IN ( SELECT 'x' ca1 FROM t0 AS ra2 ) ;
+-----+------+
| ca2 | c0   |
+-----+------+
| x   | NULL |
| x   |    1 |
+-----+------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM ( SELECT 'x' ca2 ) AS ra0 LEFT JOIN t0 ra1 ON ra0.ca2 IN ( SELECT /*+ SEMIJOIN(DUPSWEEDOUT) */ 'x' ca1 FROM t0 AS ra2 ) ;
+-----+------+
| ca2 | c0   |
+-----+------+
| x   | NULL |
| x   |    1 |
| x   | NULL |
| x   |    1 |
+-----+------+
4 rows in set (0.00 sec)

```

And the query plans are:

```
mysql> EXPLAIN SELECT * FROM ( SELECT 'x' ca2 ) AS ra0 LEFT JOIN t0 ra1 ON ra0.ca2 IN ( SELECT 'x' ca1 FROM t0 AS ra2 ) ;
+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+-----------------+
| id | select_type | table      | partitions | type   | possible_keys | key  | key_len | ref  | rows | filtered | Extra           |
+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+-----------------+
|  1 | PRIMARY     | <derived2> | NULL       | system | NULL          | NULL | NULL    | NULL |    1 |   100.00 | NULL            |
|  1 | PRIMARY     | ra1        | NULL       | ALL    | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where     |
|  1 | PRIMARY     | ra2        | NULL       | ALL    | NULL          | NULL | NULL    | NULL |    2 |   100.00 | FirstMatch(ra1) |
|  2 | DERIVED     | NULL       | NULL       | NULL   | NULL          | NULL | NULL    | NULL | NULL |     NULL | No tables used  |
+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+-----------------+
4 rows in set, 1 warning (0.00 sec)

mysql> EXPLAIN SELECT * FROM ( SELECT 'x' ca2 ) AS ra0 LEFT JOIN t0 ra1 ON ra0.ca2 IN ( SELECT /*+ SEMIJOIN(DUPSWEEDOUT) */ 'x' ca1 FROM t0 AS ra2 ) ;
+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+---------------------------------------------+
| id | select_type | table      | partitions | type   | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                       |
+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+---------------------------------------------+
|  1 | PRIMARY     | <derived2> | NULL       | system | NULL          | NULL | NULL    | NULL |    1 |   100.00 | NULL                                        |
|  1 | PRIMARY     | ra2        | NULL       | ALL    | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where; Start temporary; End temporary |
|  1 | PRIMARY     | ra1        | NULL       | ALL    | NULL          | NULL | NULL    | NULL |    2 |   100.00 | NULL                                        |
|  2 | DERIVED     | NULL       | NULL       | NULL   | NULL          | NULL | NULL    | NULL | NULL |     NULL | No tables used                              |
+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+---------------------------------------------+
4 rows in set, 1 warning (0.00 sec)

```

Hope to be handled properly, thanks!

How to repeat:
Test case:

```
CREATE TABLE t0 ( c0 SMALLINT ) ;
INSERT INTO t0 VALUES (NULL), ( 1 ) ;
SELECT * FROM ( SELECT 'x' ca2 ) AS ra0 LEFT JOIN t0 ra1 ON ra0.ca2 IN ( SELECT 'x' ca1 FROM t0 AS ra2 ) ; -- 2 rows
SELECT * FROM ( SELECT 'x' ca2 ) AS ra0 LEFT JOIN t0 ra1 ON ra0.ca2 IN ( SELECT /*+ SEMIJOIN(DUPSWEEDOUT) */ 'x' ca1 FROM t0 AS ra2 ) ; -- 4 rows
```
[13 Nov 2023 9:49] MySQL Verification Team
Hello Ke Wang,

Thank you for the report and test case.

regards,
Umesh