Bug #112629 | Index is not used if there is some constant boolean expressions | ||
---|---|---|---|
Submitted: | 5 Oct 2023 8:26 | Modified: | 6 Oct 2023 13:20 |
Reporter: | Sunny Chung | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 8.0.28 | OS: | Any (AWS managed RDS) |
Assigned to: | CPU Architecture: | Any |
[5 Oct 2023 8:26]
Sunny Chung
[5 Oct 2023 12:31]
MySQL Verification Team
Hi Mr. Chung, We must be able to repeat the problem. Hence, we also need a DDL for the CREATE of that table, as well as the rows. You can send us only a fraction of the total rows, but so that the effect is repeatable. Can't repeat. .
[5 Oct 2023 13:11]
MySQL Verification Team
Hi Mr. Chung, We had discussions on your bug in our team and concluded that this is not a bug. The expression: false or rs1.merchant_id forces scanning of all rows, as no index can be used ...... Not a bug.
[6 Oct 2023 3:37]
Sunny Chung
It is obvious that the expression `false or X` could be simplified to `X`, according to Boolean algebra. Hence, the SQL could be simplified to below: ``` select rs1.* from rs_table rs1 where true and true and (false or (rs1.merchant_id in (...))); ``` and then: ``` select rs1.* from rs_table rs1 where rs1.merchant_id in (...); ``` Now, it is obvious that the index `idx__merchant_id__member_number__status__purpose` can be applied. Note that simplifying `true and X` to `X` is already supported by the current version of MySQL, but `false or X` is not. This inconsistency suggests that this is a bug. What is the reasoning that the expression `false or X` forces scanning all the rows?
[6 Oct 2023 10:01]
MySQL Verification Team
Hi, When the expression like: (false OR X) is used in the context like yours, then both have to be evaluated against the conditions. Hence, the full scan because of the false. This is one of the basic concepts of the query resolution. Not a bug.
[6 Oct 2023 13:20]
Sunny Chung
I think I have already made my reasoning very clear, but you rejected without a valid reason. "false" is a constant, and has no relationship with the row data. There is no reason to repeatedly re-evaluate things that do not change against each row. This is also one of the basic concepts when designing algorithms. And, you did not justify why MySQL does not trigger full table scan with (true AND X) but (false OR X). They share similar natures. Also, you did not explain why MySQL cannot simplify the query that human can easily figure out as I demonstrated. It can be done by repeatedly applying basic laws of Boolean algebra. I think implementation exists because they solve problem, not problem exists because the implementation is like that. I took my example to run in another relational database, PostgreSQL, to check if they are also bounded by the "basic concepts of the query resolution" you mentioned. The result is - they did not suffer this issue, and they are able to use the right index with the (false OR X) conditions. ``` Aggregate (cost=1921.53..1921.54 rows=1 width=8) -> Index Only Scan using idx_mid on table1 (cost=0.43..1844.50 rows=30812 width=0) Index Cond: (mid = ANY ('{66,13,45}'::integer[])) ``` I reported this bug because I want MySQL to be better. Now, I feel I took a wrong action.