Bug #51457 | Firstmatch semijoin strategy gives wrong results for certain query plans | ||
---|---|---|---|
Submitted: | 24 Feb 2010 14:24 | Modified: | 23 Nov 2010 3:20 |
Reporter: | Guilhem Bichot | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 6.0-codebase-bugfixing | OS: | Linux |
Assigned to: | Roy Lyseng | CPU Architecture: | Any |
Tags: | firstmatch, optimizer_switch, semijoin, subquery |
[24 Feb 2010 14:24]
Guilhem Bichot
[24 Feb 2010 14:25]
Guilhem Bichot
patch with debug-specific keywords, and a test file
Attachment: patch.txt (text/plain), 6.67 KiB.
[24 Feb 2010 17:07]
Guilhem Bichot
Here is a reasoning why it2 jumping to ot or nt, cannot work. When we are executing the join with plan ot,it1,nt,it2: - we read a record from ot - we read a record from it1 until we find a match for ta_ot.a=tb_it1.a - we read a record from nt until we find a match for ta_ot.a=tc_nt.a - we read a record from it2 until we find a match for ta_ot.a=tb_it2.a - if it2 has "firstmatch and jump to ot", we don't read more it2 records, and jump back to the next record of ot. So the record of ot has been joined with at most one record from nt; that is wrong, because there is a join, not semijoin, between ot and nt. It may miss records this way. - if instead it2 has "firstmatch and jump to it", we don't read more it2 records, and jump back to the next record of nt, then to it2, back to nt, etc, until we have exhausted nt, so far this is correct, then we get back to it1, which doesn't have any "firstmatch and jump back to ot" mark, so we read the next record of it1, so if we find a match in it1 again, we go to nt (from the start of nt again), then to it2 (from the start of it2 again), assuming we find matches, we will output N records if it1 has N records matching ot.a, which is wrong for a semijoin. - on the other hand, having it2 jump back to nt *and* it1 jump back to ot, would not have any of those problems.
[3 Mar 2010 6:06]
Sveta Smirnova
Thank you for the report. Verified as described.
[30 Jul 2010 11:55]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/114722 3222 Roy Lyseng 2010-07-30 Bug#51457: Firstmatch semijoin strategy gives wrong results for certain query plans When the optimizer selects a semijoin FirstMatch plan that has non-correlated tables interspersed with inner tables, the "jumps" are placed so that execution might become incorrect. Example: A query with an outer correlated table ot, a non-correlated table nt and two inner tables it1 and it2 are used in a query and the optimizer chooses the FirstMatch join order ot - it1 - nt - it2. The optimizer assigns a "jump" from table it2 to ot. This strategy will omit duplicates resulting from table nt. The fix assures that the jump from the last table in a consecutive range of inner table goes to the most immediately preceeding outer table. In this case, there are two consecutive ranges of inner tables: {it1} and {it2}. In the first range, the last inner table is it1, the jump should be to table ot. In the second range, the jump from it2 should be to nt. It is currently possible to generate such query plans, hence the bug fix includes some test cases. However, such plans are probably generated due to a problem with the join optimizer cost model. When this problem is fixed, it may be impossible to generate such FirstMatch plans, unless one can assign different costs to different tables, or another measure is used to force a specific join order. mysql-test/r/subselect_innodb.result Test results for bug#51457. mysql-test/t/subselect_innodb.test Test case for bug#51457. Notice the above comment for test validity. The test goes into subselect_innodb because it is required to use tables with one row that are not hit by const table optimization. sql/sql_select.cc In setup_semijoin_dups_elimination(), in the case for FIRST_MATCH, loop over the set of inner and non-correlated outer tables and assign jump target to the last table in each consecutive range of inner tables. The jump target should be the most immediately preceeding outer table.
[3 Aug 2010 7:42]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/114897 3222 Roy Lyseng 2010-08-03 Bug#51457: Firstmatch semijoin strategy gives wrong results for certain query plans When the optimizer selects a semijoin FirstMatch plan that has non-correlated tables interspersed with inner tables, the "jumps" are placed so that execution might become incorrect. Example: A query with an outer correlated table ot, a non-correlated table nt and two inner tables it1 and it2 are used in a query and the optimizer chooses the FirstMatch join order ot - it1 - nt - it2. The optimizer assigns a "jump" from table it2 to ot. This strategy will omit duplicates resulting from table nt. The fix assures that the jump from the last table in a consecutive range of inner table goes to the most immediately preceeding outer table (regardless of whether the outer table is correlated or not). In this case, there are two consecutive ranges of inner tables: {it1} and {it2}. In the first range, the last inner table is it1, the jump should be to table ot. In the second range, the jump from it2 should be to nt. It is currently possible to generate such query plans, hence the bug fix includes some test cases. However, such plans are probably generated due to a problem with the join optimizer cost model. When this problem is fixed, it may be impossible to generate such FirstMatch plans, unless one can assign different costs to different tables, or another measure is used to force a specific join order. mysql-test/r/subselect_innodb.result Test results for bug#51457. mysql-test/t/subselect_innodb.test Test case for bug#51457. Notice the above comment for test validity. The test goes into subselect_innodb because it is required to use tables with one row that are not hit by const table optimization. sql/sql_select.cc In setup_semijoin_dups_elimination(), in the case for FIRST_MATCH, loop over the set of inner and non-correlated outer tables and assign jump target to the last table in each consecutive range of inner tables. The jump target should be the most immediately preceeding outer table.
[4 Aug 2010 7:07]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/114976 3222 Roy Lyseng 2010-08-04 Bug#51457: Firstmatch semijoin strategy gives wrong results for certain query plans When the optimizer selects a semijoin FirstMatch plan that has non-correlated tables interspersed with inner tables, the "jumps" are placed so that execution might become incorrect. Example: A query with an outer correlated table ot, a non-correlated table nt and two inner tables it1 and it2 are used in a query and the optimizer chooses the FirstMatch join order ot - it1 - nt - it2. The optimizer assigns a "jump" from table it2 to ot. This strategy will omit duplicates resulting from table nt. The fix assures that the jump from the last table in a consecutive range of inner table goes to the most immediately preceeding outer table (regardless of whether the outer table is correlated or not). In this case, there are two consecutive ranges of inner tables: {it1} and {it2}. In the first range, the last inner table is it1, the jump should be to table ot. In the second range, the jump from it2 should be to nt. It is currently possible to generate such query plans, hence the bug fix includes some test cases. However, such plans are probably generated due to a problem with the join optimizer cost model. When this problem is fixed, it may be impossible to generate such FirstMatch plans, unless one can assign different costs to different tables, or another measure is used to force a specific join order. mysql-test/r/subselect_innodb.result Test results for bug#51457. mysql-test/t/subselect_innodb.test Test case for bug#51457. Notice the above comment for test validity. The test goes into subselect_innodb because it is required to use tables with one row that are not hit by const table optimization. sql/sql_select.cc In setup_semijoin_dups_elimination(), in the case for FIRST_MATCH, loop over the set of inner and non-correlated outer tables and assign jump target to the last table in each consecutive range of inner tables. The jump target should be the most immediately preceeding outer table.
[2 Oct 2010 18:15]
Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101002181053-6iotvl26uurcoryp) (version source revid:alexander.nozdrin@oracle.com-20101002180917-h0n62akupm3z20nt) (pib:21)
[13 Nov 2010 16:23]
Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[23 Nov 2010 3:20]
Paul DuBois
Bug does not appear in any released 5.6.x version. No 5.6.1 changelog entry needed.