Bug #114435 | Incorrect query results caused by the subquery optimization. | ||
---|---|---|---|
Submitted: | 21 Mar 6:56 | Modified: | 24 Mar 7:48 |
Reporter: | John Jove | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S1 (Critical) |
Version: | 8.0.36 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[21 Mar 6:56]
John Jove
[21 Mar 7:34]
MySQL Verification Team
Hello John, Thank you for the report and test case. Verified as described. regards, Umesh
[22 Mar 7:17]
Roy Lyseng
Posted by developer: Not a bug. The optimizer sees that the predicate (1 IS UNKNOWN) is false, hence it does not have to evaluate the second part of the AND, and thus it will never see that the subquery evaluation should lead to an error.
[24 Mar 7:48]
John Jove
Thanks for the detailed explanation. However, such evaluation order seems not to be deterministic. It means that the error part of AND clause can be evaluated first sometimes despite the FALSE operand, and an error is retuned. I construct two interesting test cases as follows. DROP TABLE IF EXISTS t1; CREATE TABLE t1 (c1 INT); INSERT INTO t1 VALUES (1); -- only one row SELECT c1 FROM t1 WHERE ((SELECT c1 FROM t1) NOT BETWEEN 1 AND 2) NOT IN (SELECT c1 FROM t1); -- {1} DROP TABLE IF EXISTS t1; CREATE TABLE t1 (c1 INT); INSERT INTO t1 VALUES (1),(2); -- more than one row SELECT c1 FROM t1 WHERE ((SELECT c1 FROM t1) NOT BETWEEN 1 AND 2) NOT IN (SELECT c1 FROM t1); -- [21000][1242] Subquery returns more than 1 row From the above results, this kind of error seems to be checked in the runtime, not statically or grammar-level. Is it correct?
[3 Apr 6:18]
Roy Lyseng
Since the execution plan is laid out by the optimizer, it is quite true that whether to issue an error or not may seem non-deterministic. It is entirely up to the optimizer to determine whether a sub-plan needs to be executed or not, and in which order - thus the term "optimizer". And the execution is also obviously dependent on the data in the table - one (or zero) rows mean success, more than one row means failure. Notice also that reporting an error here may be taken as an evidence of bad database design, or unwashed data. Usually a normalized database will use a unique index or similar to ensure that max. one row qualifies to the scalar subquery. Or it will be a set or rows that is aggregated into one row.