Bug #110156 Incorrect query results caused by subqueries and JOIN
Submitted: 21 Feb 2023 14:24 Modified: 21 Feb 2023 17:42
Reporter: Zuming Jiang Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[21 Feb 2023 14:24] Zuming Jiang
Description:
My fuzzer found a bug in the MySQL server, which incorrectly fetch the rows.

How to repeat:
** Setup the environment **

/usr/local/mysql/bin/mysqld_safe &
/usr/local/mysql/bin/mysql -uroot
mysql> create database testdb;
mysql> \q
/usr/local/mysql/bin/mysql -uroot -Dtestdb < sim_bk.sql (sim_bk.sql is in the attached)

** Test case 1**
/usr/local/mysql/bin/mysql -u root -D testdb
mysq> 
select
  ref_0.wkey as c0,
  ref_0.pkey as c1,
  ref_0.c11093 as c2,
  ref_0.c11094 as c3
from
  t2260 as ref_0
where exists (
  select *
    from
      ((t2259 as ref_37
          cross join t2259 as ref_38)
        left outer join (select
              ref_39.wkey as c0,
              ref_0.wkey as c3
            from
              t2260 as ref_39
            where ref_39.wkey is NULL) as subq_0
        on (ref_38.c11086 = subq_0.c0 ))
    where subq_0.c3 is not NULL) is TRUE;

The MySQL server returns one row:
---
15300   83965000        t7      76.38
---

Then, I change the WHERE clause to "exist(...) is FALSE", as following:

** Test case 2**
/usr/local/mysql/bin/mysql -u root -D testdb
mysq> 
select
  ref_0.wkey as c0,
  ref_0.pkey as c1,
  ref_0.c11093 as c2,
  ref_0.c11094 as c3
from
  t2260 as ref_0
where exists (
  select *
    from
      ((t2259 as ref_37
          cross join t2259 as ref_38)
        left outer join (select
              ref_39.wkey as c0,
              ref_0.wkey as c3
            from
              t2260 as ref_39
            where ref_39.wkey is NULL) as subq_0
        on (ref_38.c11086 = subq_0.c0 ))
    where subq_0.c3 is not NULL) is FALSE;

Test case 2 should not return the row returned by test case 1. However, the MySQL server still returns one row:
---
15300   83965000        t7      76.38
---

Therefore, it triggers a logical bug in MySQL
[21 Feb 2023 14:25] Zuming Jiang
sim_bk.sql

Attachment: sim_bk.sql (application/octet-stream, text), 2.72 KiB.

[21 Feb 2023 15:59] MySQL Verification Team
Hi Mr. Jiang,

Thank you for your bug report.

However, this is not a bug.

Your query is written with very opaque logic, but this is irrelevant, since your query essentially looks like this: 

select   ref_0.wkey as c0,   ref_0.pkey as c1,   ref_0.c11093 as c2,   ref_0.c11094 as c3 from   t2260 as ref_0 where exists ( ... ) is TRUE (or FALSE;

There is no object defined that is tested for TRUE or FALSE. Everything that is written in the EXISTS clause is irrelevant, since that clause has no lifetime after the last closing parenthesis.

Not a bug.
[21 Feb 2023 16:03] MySQL Verification Team
Actually, as our Manual writes clearly, the entire EXISTS () i rewritten as an anti-join.
[21 Feb 2023 16:06] Zuming Jiang
You could try these two, they also return the same results:

select   ref_0.wkey as c0,   ref_0.pkey as c1,   ref_0.c11093 as c2,   ref_0.c11094 as c3 from   t2260 as ref_0 where exists ( ... );

select   ref_0.wkey as c0,   ref_0.pkey as c1,   ref_0.c11093 as c2,   ref_0.c11094 as c3 from   t2260 as ref_0 where not exists ( ... );

I use "is TRUE" or "is FALSE" just to check whether the EXISTS clause returns TRUE or FALSE.
[21 Feb 2023 16:09] MySQL Verification Team
Furthermore, read in our Manual how should EXISTS nested queries be written. Something like this:

SELECT col1, col2  FROM t1   WHERE EXISTS  (SELECT * FROM t1 WHERE t2.col1  = t1.col1);

Hence, it should contain the info that would make EXISTS behave like IN nested query.
[21 Feb 2023 16:21] Zuming Jiang
Thanks for your response!

Yes, we typically use exists clause that contains the necessary info. But I think we should properly handle the corner cases. 

These two queries:

select   ref_0.wkey as c0,   ref_0.pkey as c1,   ref_0.c11093 as c2,   ref_0.c11094 as c3 from   t2260 as ref_0 where exists ( ... );

select   ref_0.wkey as c0,   ref_0.pkey as c1,   ref_0.c11093 as c2,   ref_0.c11094 as c3 from   t2260 as ref_0 where not exists ( ... );

Both return the same one row. However, these two queries are contradictory ("exists" and "not exists"), so they should not return the same results. Therefore, I think it might trigger a bug in MySQL.
[21 Feb 2023 17:42] MySQL Verification Team
Hi Mr. Jiang,

Actually, you are correct.

Ending the query with  IS TRUE is implicit in a WHERE clause, so it can be omitted. IS FALSE effectively converts the EXISTS clause to a NOT EXISTS.

Hence, what you reported is a bug.

Verified as reported.
[22 Feb 2023 15:17] MySQL Verification Team
proper category ......

also, checked on three servers, so a detail for server is unnecessary ........