Description:
I've noticed incorrect results being returned when using dependent subqueries in MySQL 8.0.39. This may be applicable to 8.0.x, but I have not gone to that extent in testing. I couldn't reproduce the bug in MySQL 8.4. It may be possible to simplify the repeat process, but I haven't had the time to tinker with it. If you do, please share your findings.
How to repeat:
In your shell
```
docker run --name bug_mysql -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -d mysql:8.0.39
sleep 5
docker exec -it bug_mysql mysql -uroot
```
create the database and some tables
```
CREATE DATABASE bug_db_2;
USE bug_db_2;
CREATE TABLE bug_table_0 (id INT PRIMARY KEY);
CREATE TABLE bug_table_1 (
id INT PRIMARY KEY,
bug_table_0_id INT,
FOREIGN KEY (bug_table_0_id) REFERENCES bug_table_0(id)
);
CREATE TABLE bug_table_2 (
id INT PRIMARY KEY,
bug_table_1_id_1 INT,
bug_table_1_id_2 INT,
FOREIGN KEY (bug_table_1_id_1) REFERENCES bug_table_1(id),
FOREIGN KEY (bug_table_1_id_2) REFERENCES bug_table_1(id)
);
CREATE TABLE bug_table_3 (id INT PRIMARY KEY);
CREATE TABLE bug_table_4 (
id INT PRIMARY KEY,
bug_table_3_id INT,
bug_table_2_id INT,
value VARCHAR(255),
FOREIGN KEY (bug_table_2_id) REFERENCES bug_table_2(id),
FOREIGN KEY (bug_table_3_id) REFERENCES bug_table_3(id)
);
CREATE TABLE bug_table_5 (
id INT PRIMARY KEY,
bug_table_3_id INT,
bug_table_2_id INT,
value_id INT,
FOREIGN KEY (bug_table_2_id) REFERENCES bug_table_2(id),
FOREIGN KEY (bug_table_3_id) REFERENCES bug_table_3(id)
);
```
insert some data
```
INSERT INTO bug_table_0 (id)
VALUES (1);
INSERT INTO bug_table_1 (id, bug_table_0_id)
VALUES (1, 1),
(2, 1);
INSERT INTO bug_table_2 (id, bug_table_1_id_1, bug_table_1_id_2)
VALUES (1, 1, 2),
(2, 2, 1);
INSERT INTO bug_table_3 (id)
VALUES (1),
(2);
INSERT INTO bug_table_4 (id, bug_table_3_id, bug_table_2_id, value)
VALUES (1, 1, 1, 'test'),
(2, 2, 1, 'test'),
(3, 1, 2, 'blue'),
(4, 2, 2, 'blue');
INSERT INTO bug_table_5 (id, bug_table_3_id, bug_table_2_id, value_id)
VALUES (1, 1, 1, 1),
(2, 2, 1, 2),
(3, 1, 2, 1),
(4, 2, 2, 2);
```
At this point exit the mysql client (COMMAND + C or exit;) and reconnect
```
docker exec -it bug_mysql mysql -uroot -D bug_db_2
```
Run this query which should output 2 but instead outputs 0
this query in particular can be fixed by disabling semijoins
```
-- The query will return correct results if you do `SET SESSION optimizer_switch='semijoin=off';`
SELECT count(res.id)
FROM (
SELECT DISTINCT tt2.bug_table_1_id_2 as `id`
from bug_table_2 tt2
WHERE (
SELECT (
SELECT 1
from bug_table_4 tt3
WHERE tt3.bug_table_2_id = tt2.id
AND tt3.value IN (
SELECT value
from bug_table_4 tt3
WHERE tt3.bug_table_2_id = tt2.id
AND tt3.bug_table_3_id = 2
)
AND tt3.bug_table_3_id = 1
) IS NOT NULL
)
AND tt2.bug_table_1_id_1 IN (
SELECT tt1.id
FROM bug_table_1 tt1
WHERE tt1.bug_table_0_id IN (1)
)
LIMIT 3
) as res;
```
Again, get a new session by exiting and reconnecting
```
docker exec -it bug_mysql mysql -uroot -D bug_db_2
```
The below query should also output 2 isn't fixable by disabling semijoins
```
-- This query won't give correct results even with the semijoin optimization turned off
-- resembles count of text attrs where text attr = another text attr, expected result 2
SELECT count(res.value)
FROM (
SELECT DISTINCT tt4.value as `value`
from bug_table_4 tt4
JOIN bug_table_2 tt2 ON tt4.bug_table_2_id = tt2.id
WHERE (
SELECT (
SELECT 1
from bug_table_4 tt4
WHERE tt4.bug_table_2_id = tt2.id
AND tt4.value IN (
SELECT value
from bug_table_4 tt4
WHERE tt4.bug_table_2_id = tt2.id
AND tt4.bug_table_3_id = 2
)
AND tt4.bug_table_3_id = 1
) IS NOT NULL
)
AND tt2.bug_table_1_id_1 IN (
SELECT tt1.id
FROM bug_table_1 tt1
WHERE tt1.bug_table_0_id IN (1)
)
AND tt4.bug_table_3_id = 1
LIMIT 3
) as res;
```
To clean up, exit mysql client and run
```
docker stop bug_mysql && docker rm bug_mysql
```
Description: I've noticed incorrect results being returned when using dependent subqueries in MySQL 8.0.39. This may be applicable to 8.0.x, but I have not gone to that extent in testing. I couldn't reproduce the bug in MySQL 8.4. It may be possible to simplify the repeat process, but I haven't had the time to tinker with it. If you do, please share your findings. How to repeat: In your shell ``` docker run --name bug_mysql -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -d mysql:8.0.39 sleep 5 docker exec -it bug_mysql mysql -uroot ``` create the database and some tables ``` CREATE DATABASE bug_db_2; USE bug_db_2; CREATE TABLE bug_table_0 (id INT PRIMARY KEY); CREATE TABLE bug_table_1 ( id INT PRIMARY KEY, bug_table_0_id INT, FOREIGN KEY (bug_table_0_id) REFERENCES bug_table_0(id) ); CREATE TABLE bug_table_2 ( id INT PRIMARY KEY, bug_table_1_id_1 INT, bug_table_1_id_2 INT, FOREIGN KEY (bug_table_1_id_1) REFERENCES bug_table_1(id), FOREIGN KEY (bug_table_1_id_2) REFERENCES bug_table_1(id) ); CREATE TABLE bug_table_3 (id INT PRIMARY KEY); CREATE TABLE bug_table_4 ( id INT PRIMARY KEY, bug_table_3_id INT, bug_table_2_id INT, value VARCHAR(255), FOREIGN KEY (bug_table_2_id) REFERENCES bug_table_2(id), FOREIGN KEY (bug_table_3_id) REFERENCES bug_table_3(id) ); CREATE TABLE bug_table_5 ( id INT PRIMARY KEY, bug_table_3_id INT, bug_table_2_id INT, value_id INT, FOREIGN KEY (bug_table_2_id) REFERENCES bug_table_2(id), FOREIGN KEY (bug_table_3_id) REFERENCES bug_table_3(id) ); ``` insert some data ``` INSERT INTO bug_table_0 (id) VALUES (1); INSERT INTO bug_table_1 (id, bug_table_0_id) VALUES (1, 1), (2, 1); INSERT INTO bug_table_2 (id, bug_table_1_id_1, bug_table_1_id_2) VALUES (1, 1, 2), (2, 2, 1); INSERT INTO bug_table_3 (id) VALUES (1), (2); INSERT INTO bug_table_4 (id, bug_table_3_id, bug_table_2_id, value) VALUES (1, 1, 1, 'test'), (2, 2, 1, 'test'), (3, 1, 2, 'blue'), (4, 2, 2, 'blue'); INSERT INTO bug_table_5 (id, bug_table_3_id, bug_table_2_id, value_id) VALUES (1, 1, 1, 1), (2, 2, 1, 2), (3, 1, 2, 1), (4, 2, 2, 2); ``` At this point exit the mysql client (COMMAND + C or exit;) and reconnect ``` docker exec -it bug_mysql mysql -uroot -D bug_db_2 ``` Run this query which should output 2 but instead outputs 0 this query in particular can be fixed by disabling semijoins ``` -- The query will return correct results if you do `SET SESSION optimizer_switch='semijoin=off';` SELECT count(res.id) FROM ( SELECT DISTINCT tt2.bug_table_1_id_2 as `id` from bug_table_2 tt2 WHERE ( SELECT ( SELECT 1 from bug_table_4 tt3 WHERE tt3.bug_table_2_id = tt2.id AND tt3.value IN ( SELECT value from bug_table_4 tt3 WHERE tt3.bug_table_2_id = tt2.id AND tt3.bug_table_3_id = 2 ) AND tt3.bug_table_3_id = 1 ) IS NOT NULL ) AND tt2.bug_table_1_id_1 IN ( SELECT tt1.id FROM bug_table_1 tt1 WHERE tt1.bug_table_0_id IN (1) ) LIMIT 3 ) as res; ``` Again, get a new session by exiting and reconnecting ``` docker exec -it bug_mysql mysql -uroot -D bug_db_2 ``` The below query should also output 2 isn't fixable by disabling semijoins ``` -- This query won't give correct results even with the semijoin optimization turned off -- resembles count of text attrs where text attr = another text attr, expected result 2 SELECT count(res.value) FROM ( SELECT DISTINCT tt4.value as `value` from bug_table_4 tt4 JOIN bug_table_2 tt2 ON tt4.bug_table_2_id = tt2.id WHERE ( SELECT ( SELECT 1 from bug_table_4 tt4 WHERE tt4.bug_table_2_id = tt2.id AND tt4.value IN ( SELECT value from bug_table_4 tt4 WHERE tt4.bug_table_2_id = tt2.id AND tt4.bug_table_3_id = 2 ) AND tt4.bug_table_3_id = 1 ) IS NOT NULL ) AND tt2.bug_table_1_id_1 IN ( SELECT tt1.id FROM bug_table_1 tt1 WHERE tt1.bug_table_0_id IN (1) ) AND tt4.bug_table_3_id = 1 LIMIT 3 ) as res; ``` To clean up, exit mysql client and run ``` docker stop bug_mysql && docker rm bug_mysql ```