Bug #17164 | LEFT JOIN with OR clause not optimized correctly | ||
---|---|---|---|
Submitted: | 6 Feb 2006 18:29 | Modified: | 14 Apr 2006 18:39 |
Reporter: | Johan Backlund | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
Version: | 5.0.19-BK, 5.0.18 | OS: | Linux (Linux) |
Assigned to: | Igor Babaev | CPU Architecture: | Any |
[6 Feb 2006 18:29]
Johan Backlund
[6 Feb 2006 18:37]
Johan Backlund
SQL file to run through mysql which creates schema and runs two queries.
Attachment: bug17164.sql (text/x-sql), 720 bytes.
[14 Feb 2006 15:32]
Valeriy Kravchuk
Thank you for a problem report. Verified with your test case on 5.0.19-BK: mysql> CREATE DATABASE bug17164; USE bug17164; Query OK, 1 row affected (0.03 sec) mysql> USE bug17164; Database changed mysql> DROP TABLE IF EXISTS t2; Query OK, 0 rows affected, 1 warning (0.01 sec) mysql> CREATE TABLE t2 ( -> id int(11) NOT NULL, -> name varchar(20), -> PRIMARY KEY (id), -> INDEX (name) -> ) ENGINE=InnoDB; Query OK, 0 rows affected (0.02 sec) mysql> INSERT INTO t2 VALUES (1,'A1'),(2,'A2'),(3,'B'); Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> mysql> DROP TABLE IF EXISTS t1; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql> CREATE TABLE t1 ( -> id int(11) NOT NULL, -> fkey int(11), -> PRIMARY KEY (id), -> FOREIGN KEY (fkey) REFERENCES t2(id) -> ) ENGINE=InnoDB; Query OK, 0 rows affected (0.01 sec) mysql> INSERT INTO t1 VALUES (1,1),(2,2),(3,2),(4,3),(5,3); Query OK, 5 rows affected (0.00 sec) Records: 5 Duplicates: 0 Warnings: 0 mysql> EXPLAIN -> SELECT -> COUNT(*) -> FROM -> t1 -> LEFT JOIN t2 ON t1.fkey = t2.id -> WHERE -> t2.name LIKE 'A%' \G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: t2 type: index possible_keys: PRIMARY,name key: name key_len: 23 ref: NULL rows: 3 Extra: Using where; Using index *************************** 2. row *************************** id: 1 select_type: SIMPLE table: t1 type: ref possible_keys: fkey key: fkey key_len: 5 ref: bug17164.t2.id rows: 1 Extra: Using where; Using index 2 rows in set (0.01 sec) mysql> EXPLAIN -> SELECT -> COUNT(*) -> FROM -> t1 -> LEFT JOIN t2 ON t1.fkey = t2.id -> WHERE -> (t2.name LIKE 'A%' OR FALSE) \G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: t1 type: index possible_keys: NULL key: fkey key_len: 5 ref: NULL rows: 5 Extra: Using index *************************** 2. row *************************** id: 1 select_type: SIMPLE table: t2 type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: bug17164.t1.fkey rows: 1 Extra: Using where 2 rows in set (0.01 sec) Optimizer should pre-evaluate such a simple cases as OR FALSE, at least. I'll mark this report as a verified feature request.
[27 Feb 2006 21:58]
Johan Backlund
Just thought I would try and do some debugging of mysql 5.0.18 and I am pretty sure I am seeing: (1) The OR FALSE part (traced as OR 0) actually being removed from the expression, leaving only the LIKE part. This is a good thing. (2) When tracing the statement without the OR part, it seems that the WHERE condition is combined with the JOIN condition, i.e. I am seeing output like this: WHERE:(original) ((`bug17164`.`t2`.`name` like _utf8'A%') and (`bug17164`.`t1`.`fkey` = `bug17164`.`t2`.`id`)) ...however, when tracing the statement with the OR part, I see the following in the trace: WHERE:(original) ((`bug17164`.`t2`.`name` like _utf8'A%') or 0) (which is later reduced to the like expression without the "or 0" part as I mentioned in point 1). Not sure what can be deduced from this, but it looks like the statement is being handled differently when the OR statement appears in the WHERE clauses - even though it is later optimized away. IMHO, this seems like a more accurate target for doing improvements.
[28 Feb 2006 0:16]
Johan Backlund
I am not really convinced that this is a FR and so I am changing severity back to S2. I have spent some more time debugging and I am currently looking at the Item_cond::fix_fields() implementation in item_cmpfunc.cc. The cause of the problem seems to be that the optimizer does not determine the same not_null_tables set when adding the OR FALSE. Since the OR statement uses the and_tables_cache attribute I have been trying to determine how and_tables_cache is calculated. Well, without having been able to understand all the details of the fix_fields() method I think the key line is the and_tables_cache &= tmp_table_map; Since I am not sure about the table_map I have to guess: maybe the tmp_table_map is all 0 for a numeric constant? This would then actually clear out the whole and_tables_cache which causes the optimizer to think there are NULL values and therefore it misses out using the index as it does without the OR FALSE. I will probably do some more debugging, but now I need to get some sleep. Hope this might help you to analyze and correct this problem.
[28 Feb 2006 7:33]
Johan Backlund
I just want to confirm that this is the case: not_null_tables() for the Item_int representing the FALSE (0) value is set to 0 as it does not refer to any tables. Since the loop in Item_cond::fix_values() always clears out all null_table flags (in and_tables_cache) that are not set in each item, the resulting and_tables_cache will be 0 if there is a constant value anywhere in the OR statement. This will lead to the optimizer not recognizing the not_null_table condition that allows it to execute the LEFT JOIN as an INNER JOIN and therefore it does not use the index. I am still not quite sure how to do this completely correctly, but my suggestion is to make the loop in fix_fields() to not update and_tables_cache with the not_null_tables() value of constant values. The code would be something like this: if(!item->fixed) { not_null_tables_cache|= tmp_table_map; and_tables_cache&= tmp_table_map; } ie a simple test whether the item is not a constant surrounding the original update of not_null_tables_cache and and_tables_cache. Does this look like a reasonable solution?
[28 Feb 2006 16:27]
Johan Backlund
Hello, it's me again. ;o) I have a small correction on the above code since the tests failed: *** item_cmpfunc.cc 2006-02-28 17:53:55.356044624 +0100 --- item_cmpfunc.cc.new 2006-02-28 18:09:31.547267566 +0100 *************** *** 2570,2573 **** tmp_table_map= item->not_null_tables(); ! not_null_tables_cache|= tmp_table_map; ! and_tables_cache&= tmp_table_map; const_item_cache&= item->const_item(); --- 2570,2575 ---- tmp_table_map= item->not_null_tables(); ! if(!item->const_item()) { ! not_null_tables_cache|= tmp_table_map; ! and_tables_cache&= tmp_table_map; ! } const_item_cache&= item->const_item(); Now all tests run fine. If you find this a bug, it might be wise to add a test case to the tests based on the OR FALSE construct.
[23 Mar 2006 0:46]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/4041
[24 Mar 2006 20:45]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/4147
[25 Mar 2006 2:54]
Igor Babaev
ChangeSet 1.2100 06/03/24 12:44:54 igor@rurik.mysql.com +3 -0 Fixed bug #17164. If the WHERE condition of a query contained an ORed FALSE term then the set of tables whose rows cannot serve for null complements in outer joins was determined incorrectly. It resulted in blocking possible conversions of outer joins into joins for such queries. The fix will appear in 5.0.20. Tthe patch has been merged into 5.1.
[11 Apr 2006 17:28]
Mike Hillyer
Merged into which 5.1 version?
[12 Apr 2006 4:36]
Igor Babaev
It was merged into 5.1.10
[14 Apr 2006 18:39]
Paul DuBois
Noted in 5.0.20, 5.1.10 changelogs.