Bug #105975 When I set duplicateweedout=ON,I got an incorrect query result.
Submitted: 25 Dec 2021 3:13 Modified: 25 Dec 2021 7:09
Reporter: Felix Li Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:8.0.22,8.0.27 OS:Linux
Assigned to: CPU Architecture:Any
Tags: duplicateweedout, regression

[25 Dec 2021 3:13] Felix Li
Description:
When duplicateweedout=ON and OFF, the query results are inconsistent.

The content is too long, I write it in 1.txt.

How to repeat:
Import test.sql and query like 1.txt.
[25 Dec 2021 6:54] Felix Li
test data

Attachment: test.sql (application/octet-stream, text), 132.14 KiB.

[25 Dec 2021 6:55] Felix Li
test  steps

Attachment: 1.txt (text/plain), 9.36 KiB.

[25 Dec 2021 7:09] MySQL Verification Team
Hello Felix Li,

Thank you for the report and test case.
Verified as described.

regards,
Umesh
[28 Nov 2023 9:44] Shinya Sugiyama
Just Workaround for this case.

It seems something worng with blob data type.
There are two indexs on C_BLOB. 

If we use generated column instead of index on blob column, we can get same query result.
I picked up "KEY `ndpTest_table22_INDEX3` (`C_BLOB`(255))" because of out put of Explain.

- add generated column and add index on it.

```
mysql> alter table ndpTest_table22 add column D_BLOB varchar(255) generated always as (left(C_BLOB,255)) stored;
Query OK, 60 rows affected (0.21 sec)
Records: 60  Duplicates: 0  Warnings: 0

mysql> alter table ndpTest_table22 add index idx_d_blob(D_BLOB);
Query OK, 0 rows affected (0.08 sec)
Records: 0  Duplicates: 0  Warnings: 0

```

- result

```
mysql> select count(*) from ndpTest_table22 as t1 left join ndpTest_table22 as t2 on t1.C_BLOB=t2.C_BLOB and t1.C_BLOB in (select t2.C_BLOB from ndpTest_table22 as t2);
+----------+
| count(*) |
+----------+
|   106020 |   /*** 1)set optimizer_switch='duplicateweedout=on' ***/
+----------+
1 row in set (0.27 sec)

mysql> select count(*) from ndpTest_table22 as t1 left join ndpTest_table22 as t2 on t1.D_BLOB=t2.D_BLOB and t1.D_BLOB in (select t2.D_BLOB from ndpTest_table22 as t2);
+----------+
| count(*) |
+----------+
|     2378 |   /*** 2)set optimizer_switch='duplicateweedout=on' ***/
+----------+
1 row in set (0.00 sec)

mysql> set optimizer_switch='duplicateweedout=off';
Query OK, 0 rows affected (0.00 sec)

mysql> select count(*) from ndpTest_table22 as t1 left join ndpTest_table22 as t2 on t1.D_BLOB=t2.D_BLOB and t1.D_BLOB in (select t2.D_BLOB from ndpTest_table22 as t2);
+----------+
| count(*) |
+----------+
|     2378 |   /*** 3)set optimizer_switch='duplicateweedout=off' ***/
+----------+
1 row in set (0.01 sec)

mysql> select count(*) from ndpTest_table22 as t1 left join ndpTest_table22 as t2 on t1.C_BLOB=t2.C_BLOB and t1.C_BLOB in (select t2.C_BLOB from ndpTest_table22 as t2);
+----------+
| count(*) |
+----------+
|     2378 |   /*** 4)set optimizer_switch='duplicateweedout=off' ***/
+----------+
1 row in set (0.01 sec)

```