Bug #111681 EXISTS query results in duplicate rows
Submitted: 6 Jul 2023 13:47 Modified: 12 Jul 2023 12:11
Reporter: Xianghai Liu Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0 OS:Linux
Assigned to: CPU Architecture:Any
Tags: exists

[6 Jul 2023 13:47] Xianghai Liu
Description:
Using EXISTS as a filter condition may result in duplicate rows

How to repeat:
according to this SQL script:
```
-- auto-generated definition
create table group_info
(
    id         int primary key,
    group_id   bigint       not null,
    group_type varchar(10)  not null,
    source     varchar(255) null
);
INSERT INTO group_info (id, group_id, group_type, source) VALUES (1, 1, '1', 'aaaa');
INSERT INTO group_info (id, group_id, group_type, source) VALUES (2, 1, '1', 'bbbb');
INSERT INTO group_info (id, group_id, group_type, source) VALUES (3, 1, '1', 'cccc');

-- auto-generated definition
create table asset_info
(
    id         int primary key,
    group_id   bigint      not null,
    asset_name varchar(64) not null
);

INSERT INTO asset_info (id, group_id, asset_name) VALUES (1, 1, 'asset1');
INSERT INTO asset_info (id, group_id, asset_name) VALUES (2, 1, 'asset2');
INSERT INTO asset_info (id, group_id, asset_name) VALUES (3, 1, 'asset3');

-- auto-generated definition
create table key_word
(
    id       int primary key,
    group_id bigint      not null,
    key_word varchar(32) not null
);
INSERT INTO key_word (id, group_id, key_word) VALUES (2, 1, 'key1');
INSERT INTO key_word (id, group_id, key_word) VALUES (3, 1, 'key2');

-- auto-generated definition
create table dictionary
(
    id         int primary key,
    group_code varchar(10) null,
    vc_code    varchar(10) null,
    vc_value   varchar(64) null
);
INSERT INTO dictionary (id, group_code, vc_code, vc_value) VALUES (1, 'union_kind', '1', 'type1');
INSERT INTO dictionary (id, group_code, vc_code, vc_value) VALUES (2, 'union_kind', '2', 'type2');
INSERT INTO dictionary (id, group_code, vc_code, vc_value) VALUES (3, 'union_kind', '3', 'type3');
```

and execute the query:
```
SELECT gigt.*
FROM (SELECT DISTINCT group_id, group_type FROM group_info) AS gigt
         LEFT JOIN dictionary d ON gigt.group_type = d.vc_code
    AND d.group_code = 'union_kind'
WHERE EXISTS (SELECT 1
              FROM asset_info ai
              WHERE ai.group_id = gigt.group_id
                AND ai.asset_name LIKE concat('%', '', '%'))
  AND EXISTS (SELECT 1
              FROM group_info gi
              WHERE gi.group_id = gigt.group_id
                AND gi.source LIKE concat('%', '', '%'))
  AND EXISTS (SELECT 1
              FROM key_word kw
              WHERE kw.group_id = gigt.group_id
                AND kw.key_word LIKE concat('%', '', '%'));
```
I should look the result:
```
| group_id | group_type |
| -------- | ---------- |
| 1        | 1          |
```
In fact, I see the following results:
```
| group_id | group_type |
| -------- | ---------- |
| 1        | 1          |
| 1        | 1          |
| 1        | 1          |
| 1        | 1          |
| 1        | 1          |
| 1        | 1          |
```
[6 Jul 2023 14:17] MySQL Verification Team
HI Mr. Liu,

Thank you for your bug report.

However, this is not a bug.

There is nothing in the SQL standard that enforces distinct results.

Hence, you have to do it explicitly, like:

SELECT DISTINCT gigt.* .....................

Not a bug.
[12 Jul 2023 12:11] MySQL Verification Team
Hi Mr. Liu,

We can now confirm that this is a bug.

However, it is a bit different from what you reported.

The bug lies in the fact that adding filters increases the result set.

This is now a verified bug report.
[20 Oct 2023 15:08] Sean Ross
Any update on this?

Experienced today when using NOT EXISTS and EXISTS in the same query. Not sure if it has anything to do with both foreign keys matching exactly the main table key? Our tables were built decades ago so don't follow conventional rules therefore our clients table has the key clientID, I am using NOT EXISTS on a removals table with the foreign key clientID and then checking EXISTS on a policies table which also has a foreign key clientID. All have the table name before the key name however it seems to be causing confusion.

By removing either of the NOT EXISTS or EXISTS filter the result set is as expected without duplicates
[23 Oct 2023 10:29] MySQL Verification Team
Hi Mr. Liu,

No, there are no news regarding this bug.

Every team in Development has it's own scheduling of bug fixing and those change very frequently. Hence, nobody knows when will this be addressed.

Regarding EXISTS and NOT EXISTS we have added that comment to our internal bugs DB.