Bug #50358 semijoin execution of subquery with outerjoin yields wrong result
Submitted: 15 Jan 2010 10:01 Modified: 23 Nov 2010 3:13
Reporter: Tor Didriksen Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:6.0-codebase OS:Any
Assigned to: Guilhem Bichot CPU Architecture:Any
Tags: join_cache_level, optimizer_switch, semijoin, subquery

[15 Jan 2010 10:01] Tor Didriksen
Description:
CREATE TABLE t1 (i INTEGER);
CREATE TABLE t2 (i INTEGER);
CREATE TABLE t3 (i INTEGER);
INSERT INTO t1 VALUES (1), (2), (3), (5);
INSERT INTO t3 VALUES (1), (2), (4), (5);
SELECT * FROM t1 WHERE (t1.i) IN 
(SELECT t3.i FROM t3 LEFT JOIN t2 ON t2.i=t3.i);
i
1
2
5
DROP TABLE t1, t2, t3;

The same query with
set join_cache_level=6;
yields result
i
5
5
5
5
i.e. wrong cardinality, and wrong values.

How to repeat:
See description.
[15 Jan 2010 10:35] Sveta Smirnova
Thank you for the report.

Verified as described:

=====mysql-6.0-codebase=====
=====bug50358=====
CREATE TABLE t1 (i INTEGER);
CREATE TABLE t2 (i INTEGER);
CREATE TABLE t3 (i INTEGER);
INSERT INTO t1 VALUES (1), (2), (3), (5);
INSERT INTO t3 VALUES (1), (2), (4), (5);
SELECT * FROM t1 WHERE (t1.i) IN 
(SELECT t3.i FROM t3 LEFT JOIN t2 ON t2.i=t3.i);
i
1
2
5
set join_cache_level=6;
SELECT * FROM t1 WHERE (t1.i) IN 
(SELECT t3.i FROM t3 LEFT JOIN t2 ON t2.i=t3.i);
i
5
5
5
5
DROP TABLE t1, t2, t3;
[3 Feb 2010 14:04] Guilhem Bichot
According to bzrfind, this bug appeared in

sergefp@mysql.com-20081222190325-dml9u9zq4kxi0p9j
WL#4688: Merge Batched Key Access and Subquery Optimizations, Step#1
[10 Feb 2010 15:17] Guilhem Bichot
query fails the same way with optimizer_join_cache_level in (4,6,8), and not with other values.
When value <=2 join buffering is not used for semijoin / outer join, which we have here, so it's clear why <=2 works.
Regarding why 3,5,7 work, it's likely due to "If the value of join_cache_level is odd then creation of a non-linked join cache is forced.", the doc also says that odd values use regular buffers whereas even values use incremental buffers. So it may have to do with incremental buffers.
According to the doc, BKA starts at 5, and here 4 is buggy too, so it may be more related to incremental buffers than with BKA.
[22 Apr 2010 14:26] Guilhem Bichot
not working on it since months
[21 May 2010 14:08] Guilhem Bichot
bug is not repeatable with guilhem@mysql.com-20100519135042-hok0n23rvzb0nba5 (neither with the next-mr-opt-backporting tree); now searching for the bugfix with bzrfind.
[21 May 2010 19:06] Guilhem Bichot
according to bzrfind, this was fixed by
oystein.grovlen@sun.com-20100417063402-gmwfvs9mmu2u714k
i.e. the fix for BUG#49952.
Looking at that fix, how it could relate to the present bug is not clear. Maybe it's because t2 is constant (has no row).
Still not convinced that the present bug is fixed, it may be hidden, but might be resurrected if we change the testcase a bit (add rows in the proper table?). That should be checked, so I'm setting it back to "verified".
[24 May 2010 8:54] Guilhem Bichot
Indeed, make t2 non-empty and we see the bug again
(guilhem@mysql.com-20100521115600-udgr0fwe1i8hf91e next-mr-opt-backporting):

CREATE TABLE t1 (i INTEGER);
CREATE TABLE t2 (i INTEGER);
CREATE TABLE t3 (i INTEGER);
INSERT INTO t1 VALUES (1), (2), (3), (5);
INSERT INTO t3 VALUES (1), (2), (4), (5);
INSERT INTO t2 VALUES (6);
SELECT * FROM t1 WHERE (t1.i) IN 
(SELECT t3.i FROM t3 LEFT JOIN t2 ON t2.i=t3.i);
set optimizer_join_cache_level=6;
SELECT * FROM t1 WHERE (t1.i) IN 
(SELECT t3.i FROM t3 LEFT JOIN t2 ON t2.i=t3.i);

gives result
SELECT * FROM t1 WHERE (t1.i) IN 
(SELECT t3.i FROM t3 LEFT JOIN t2 ON t2.i=t3.i);
i
1
2
5
set optimizer_join_cache_level=6;
SELECT * FROM t1 WHERE (t1.i) IN 
(SELECT t3.i FROM t3 LEFT JOIN t2 ON t2.i=t3.i);
i
5
5
5
5
^ this is incorrect
[24 May 2010 8:54] Guilhem Bichot
wrong revision id above, I am using epotemkin@mysql.com-20100520092826-xg2i2724iwjlzv8i
[24 May 2010 8:57] Guilhem Bichot
optimizer_join_cache_level=4 shows bug too. So it's not about BKA. Value 4 is described as
"The BNL algorithm uses incremental buffers for inner tables. In this case, the BNL algorithm can be used for nested outer joins and semi-joins (outer joins and semi-joins with several inner tables). Such an operation can be executed only if incremental join buffers are used to join all inner tables but the first one."
[30 May 2010 19:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109560

3183 Guilhem Bichot	2010-05-30
      Fix for BUG#50358 "semijoin execution of subquery with outerjoin yields wrong result"
      and BUG#51018 "Batched key access gives wrong results for SELECT with semijoin and NULL":
      semijoin materialization was not fully disabled when doing join buffering.
      This is the first possible fix, it has the advantage of changing little code,
      and the disadvantage of adding some loops which in total scale like 0.5*(N^2)
      where N is the number of tables in the join (exception: no added cost if
      there are no semijoins in the query). Another fix with a different approach
      will be submitted next.
     @ mysql-test/r/subselect3.result
        those plans accidentally had join buffering enabled for the non-first
        inner tables in a semijoin materialization nest
     @ mysql-test/r/subselect3_jcl6.result
        those plans accidentally had join buffering enabled for the non-first
        inner tables in a semijoin materialization nest.
        The test where one 256,67,NULL line disappears, is exactly the test for BUG#51018.
     @ mysql-test/r/subselect4.result
        those plans accidentally had join buffering enabled for the non-first
        inner tables in a semijoin materialization nest
     @ mysql-test/r/subselect_sj.result
        those plans accidentally had join buffering enabled for the non-first
        inner tables in a semijoin materialization nest
     @ mysql-test/r/subselect_sj2.result
        result for test
     @ mysql-test/r/subselect_sj2_jcl6.result
        before this fix, the result of SELECT would be 2 and 2 which is wrong
     @ mysql-test/r/subselect_sj_jcl6.result
        those plans accidentally had join buffering enabled for the non-first
        inner tables in a semijoin materialization nest
     @ mysql-test/t/subselect_sj2.test
        test for BUG#51018
     @ sql/sql_select.cc
        In check_join_cache_usage(), a TODO comment reflects that we don't support join buffering
        in semijoin inner tables handled with materialization. This was already enforced with test
        sj_is_materialize_strategy(join->best_positions[i].sj_strategy); but, as the strategy
        is SJ_OPT_NONE for all non-first inner tables, this test let non-first inner tables
        do materialization. That led to wrong results. The fix is to properly enforce
        the limitation: scan previous JOIN_TABs until finding a first inner table
        with materialization strategy. We do this search only if there are semijoin nests.
        Another existing limitation, for FirstMatch, is groupped under the same if(),
        as an optimization.
[30 May 2010 19:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109561

3183 Guilhem Bichot	2010-05-30
      Fix for BUG#50358 "semijoin execution of subquery with outerjoin yields wrong result"
      and BUG#51018 "Batched key access gives wrong results for SELECT with semijoin and NULL":
      semijoin materialization was not fully disabled when doing join buffering.
      This is the second possible fix, which stores more in order to compute less. Test
      results are identical to the ones in the first fix.
     @ sql/sql_select.cc
        Semijoin materializations now set JOIN_TAB::first_sj_inner_tab and
        JOIN_TAB::last_sj_inner_tab.
        In check_join_cache_usage(), we want to disable join buffering if the
        table is in semijoin materialization (see the TODO in the function's
        comment). The semijoin strategy is however stored only in the first
        semijoin inner table, other inner ones have SJ_OPT_NONE (see end of 
        fix_semijoin_strategies_for_picked_join_order()). So when 
        check_join_cache_usage() was looking at a non-first inner table,
        sj_is_materialize_strategy() said "no", in the end join buffering was
        not disabled for that non-first inner table, leading to wrong results.
        To find the strategy for the table, we need to access the first inner
        table: we can now do so with tab->first_sj_inner_tab (now filled),
        through get_sj_strategy().
        All existing code which implicitely applied only to firstmatch inner
        tables, still does so.
     @ sql/sql_select.h
        JOIN_TAB::first_sj_inner_tab and JOIN_TAB::last_sj_inner_tab were set
        only for firstmatch. As the same information is useful for
        check_join_cache_usage() to detect semijoin materialization, we now set
        it for this strategy too. Thus, existing code which tested
        first_sj_inner_tab (implicitely testing for firstmatch) should now test
        first_sj_inner_tab_with_firstmatch() to not change behaviour.
[4 Jun 2010 12:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/110234

3183 Guilhem Bichot	2010-06-04
      Fix for BUG#50358 "semijoin execution of subquery with outerjoin yields wrong result"
      and BUG#51018 "Batched key access gives wrong results for SELECT with semijoin and NULL":
      semijoin materialization was not fully disabled when doing join buffering.
      This is the second possible fix, which stores more in order to compute less. Test
      results are identical to the ones in the first fix.
      This contains more changes, suggested by first reviewer.
     @ sql/sql_join_cache.cc
        * JOIN_TAB::use_match_flag(), is_last_inner_table(), get_first_inner_table()
        coded in-line in JOIN_CACHE, as they are quite specific to JOIN_CACHE
        * cache value of check_only_first_match
     @ sql/sql_select.cc
        * "join->join_tab + i + pos->n_sj_tables - 1" was used in many places in
        setup_semijoin_dups_elimination(), computing it once now
        * All semijoin strategies now set JOIN_TAB::first_sj_inner_tab and
        JOIN_TAB::last_sj_inner_tab.
        In check_join_cache_usage(), we want to disable join buffering if the
        table is in semijoin materialization (see the TODO in the function's
        comment). The semijoin strategy is however stored only in the first
        semijoin inner table, other inner ones have SJ_OPT_NONE (see end of 
        fix_semijoin_strategies_for_picked_join_order()). So when 
        check_join_cache_usage() was looking at a non-first inner table,
        sj_is_materialize_strategy() said "no", in the end join buffering was
        not disabled for that non-first inner table, leading to wrong results.
        To find the strategy for the table, we need to access the first inner
        table: we can now do so with tab->first_sj_inner_tab (now filled),
        through get_sj_strategy().
        All existing code which implicitely applied only to firstmatch inner
        tables, still does so, because a test for
        get_sj_strategy()==SJ_OPT_FIRST_MATCH is added
     @ sql/sql_select.h
        * JOIN_TAB::first_sj_inner_tab and JOIN_TAB::last_sj_inner_tab were set
        only for firstmatch. As the same information is useful for
        check_join_cache_usage() to detect semijoin materialization, we now set
        it for all semijoin strategies. A member function JOIN_TAB::get_sj_strategy()
        gives the table's semijoin strategy. Thus, existing code which used
        first_sj_inner_tab (implicitely testing for firstmatch) should now additionally
        test get_sj_strategy()==SJ_OPT_FIRST_MATCH, to not change behaviour.
        * is_inner_table_of_semi_join_with_first_match() is removed as it's one-line,
        not very general and replacable by get_sj_strategy()==SJ_OPT_FIRST_MATCH.
        * is_single_inner_of_semi_join_with_first_match() is replaced by
        is_single_inner_of_semi_join() (more general).
        * certain member functions used only by JOIN_CACHE, and quite specific of it
        (use_match_flag(),check_only_first_match()) are either removed and inserted
        in-line in JOIN_CACHE's code, or moved to being JOIN_CACHE member functions:
        that's the case for check_only_first_match().
        Same for is_last_inner_table() and get_first_inner_table(), which had
        misleading names (they apply only to semijoin inner tables served by
        FirstMatch, not all semijoin inner tables); this behaviour being caused
        by JOIN_CACHE, it's better if those specifics are inside JOIN_CACHE code
        and not in JOIN_TAB (putting them in JOIN_TAB suggested that they
        were generic-purpose, reusable functions).
        * Additionally, we add JOIN_CACHE::check_only_first_match, a cached value
        to avoid computing its value for each record (see sql_join_cache.cc)
[7 Jun 2010 15:24] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/110369

3183 Guilhem Bichot	2010-06-07
      Fix for BUG#50358 "semijoin execution of subquery with outerjoin yields wrong result"
      and BUG#51018 "Batched key access gives wrong results for SELECT with semijoin and NULL":
      semijoin materialization was not fully disabled when doing join buffering.
      This is the second possible fix, which stores more in order to compute less. Test
      results are identical to the ones in the first fix.
      This contains more changes, suggested by first reviewer.
     @ sql/sql_join_cache.cc
        * JOIN_TAB::use_match_flag(), is_last_inner_table(), get_first_inner_table()
        coded in-line in JOIN_CACHE, as they are quite specific to JOIN_CACHE
        * cache value of check_only_first_match
     @ sql/sql_select.cc
        * "join->join_tab + i + pos->n_sj_tables - 1" was used in many places in
        setup_semijoin_dups_elimination(), computing it once now
        * All semijoin strategies now set JOIN_TAB::first_sj_inner_tab and
        JOIN_TAB::last_sj_inner_tab.
        In check_join_cache_usage(), we want to disable join buffering if the
        table is in semijoin materialization (see the TODO in the function's
        comment). The semijoin strategy is however stored only in the first
        semijoin inner table, other inner ones have SJ_OPT_NONE (see end of 
        fix_semijoin_strategies_for_picked_join_order()). So when 
        check_join_cache_usage() was looking at a non-first inner table,
        sj_is_materialize_strategy() said "no", in the end join buffering was
        not disabled for that non-first inner table, leading to wrong results.
        To find the strategy for the table, we need to access the first inner
        table: we can now do so with tab->first_sj_inner_tab (now filled),
        through get_sj_strategy().
        All existing code which implicitely applied only to firstmatch inner
        tables, still does so, because a test for
        get_sj_strategy()==SJ_OPT_FIRST_MATCH is added
     @ sql/sql_select.h
        * JOIN_TAB::first_sj_inner_tab and JOIN_TAB::last_sj_inner_tab were set
        only for firstmatch. As the same information is useful for
        check_join_cache_usage() to detect semijoin materialization, we now set
        it for all semijoin strategies. A member function JOIN_TAB::get_sj_strategy()
        gives the table's semijoin strategy. Thus, existing code which used
        first_sj_inner_tab (implicitely testing for firstmatch) should now additionally
        test get_sj_strategy()==SJ_OPT_FIRST_MATCH, to not change behaviour.
        * is_inner_table_of_semi_join_with_first_match() is removed as it's one-line,
        not very general and replacable by get_sj_strategy()==SJ_OPT_FIRST_MATCH.
        * is_single_inner_of_semi_join_with_first_match() is replaced by
        is_single_inner_of_semi_join() (more general).
        * certain member functions used only by JOIN_CACHE, and quite specific of it
        (use_match_flag(),check_only_first_match()) are either removed and inserted
        in-line in JOIN_CACHE's code, or moved to being JOIN_CACHE member functions:
        that's the case for check_only_first_match().
        Same for is_last_inner_table() and get_first_inner_table(), which had
        misleading names (they apply only to semijoin inner tables served by
        FirstMatch, not all semijoin inner tables); this behaviour being caused
        by JOIN_CACHE, it's better if those specifics are inside JOIN_CACHE code
        and not in JOIN_TAB (putting them in JOIN_TAB suggested that they
        were generic-purpose, reusable functions).
        * Additionally, we add JOIN_CACHE::check_only_first_match, a cached value
        to avoid computing its value for each record (see sql_join_cache.cc)
[11 Jun 2010 12:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/110822

3192 Guilhem Bichot	2010-06-11
      Fix for BUG#50358 "semijoin execution of subquery with outerjoin yields wrong result"
      and BUG#51018 "Batched key access gives wrong results for SELECT with semijoin and NULL":
      semijoin materialization was not fully disabled when doing join buffering.
     @ mysql-test/r/subselect3.result
        those plans accidentally had join buffering enabled for the non-first
        inner tables in a semijoin materialization nest
     @ mysql-test/r/subselect3_jcl6.result
        those plans accidentally had join buffering enabled for the non-first
        inner tables in a semijoin materialization nest.
        The test where one 256,67,NULL line disappears, is exactly the test for BUG#51018.
     @ mysql-test/r/subselect4.result
        those plans accidentally had join buffering enabled for the non-first
        inner tables in a semijoin materialization nest
     @ mysql-test/r/subselect_sj.result
        those plans accidentally had join buffering enabled for the non-first
        inner tables in a semijoin materialization nest
     @ mysql-test/r/subselect_sj2.result
        result for test
     @ mysql-test/r/subselect_sj2_jcl6.result
        before this fix, the result of SELECT would be 2 and 2 which is wrong
     @ mysql-test/r/subselect_sj2_jcl7.result
        result for test
     @ mysql-test/r/subselect_sj_jcl6.result
        those plans accidentally had join buffering enabled for the non-first
        inner tables in a semijoin materialization nest.
        The test where 1 changes to 2 is another symptom of BUG#50358.
     @ mysql-test/r/subselect_sj_jcl7.result
        those plans accidentally had join buffering enabled for the non-first
        inner tables in a semijoin materialization nest
     @ mysql-test/t/subselect_sj2.test
        test for BUG#51018
     @ sql/sql_join_cache.cc
        * JOIN_TAB::use_match_flag(), is_last_inner_table(), get_first_inner_table()
        coded in-line in JOIN_CACHE, as they are quite specific to JOIN_CACHE
        * cache value of check_only_first_match
     @ sql/sql_select.cc
        * "join->join_tab + i + pos->n_sj_tables - 1" was used in many places in
        setup_semijoin_dups_elimination(), computing it once now
        * All semijoin strategies now set JOIN_TAB::first_sj_inner_tab and
        JOIN_TAB::last_sj_inner_tab.
        In check_join_cache_usage(), we want to disable join buffering if the
        table is in semijoin materialization (see the TODO in the function's
        comment). The semijoin strategy is however stored only in the first
        semijoin inner table, other inner ones have SJ_OPT_NONE (see end of 
        fix_semijoin_strategies_for_picked_join_order()). So when 
        check_join_cache_usage() was looking at a non-first inner table,
        sj_is_materialize_strategy() said "no", in the end join buffering was
        not disabled for that non-first inner table, leading to wrong results.
        To find the strategy for the table, we need to access the first inner
        table: we can now do so with tab->first_sj_inner_tab (now filled),
        through get_sj_strategy().
        All existing code which implicitely applied only to firstmatch inner
        tables, still does so, because a test for
        get_sj_strategy()==SJ_OPT_FIRST_MATCH is added
     @ sql/sql_select.h
        * JOIN_TAB::first_sj_inner_tab and JOIN_TAB::last_sj_inner_tab were set
        only for firstmatch. As the same information is useful for
        check_join_cache_usage() to detect semijoin materialization, we now set
        it for all semijoin strategies. A member function JOIN_TAB::get_sj_strategy()
        gives the table's semijoin strategy. Thus, existing code which used
        first_sj_inner_tab (implicitely testing for firstmatch) should now additionally
        test get_sj_strategy()==SJ_OPT_FIRST_MATCH, to not change behaviour.
        * is_inner_table_of_semi_join_with_first_match() is removed as it's one-line,
        not very general and replacable by get_sj_strategy()==SJ_OPT_FIRST_MATCH.
        * is_single_inner_of_semi_join_with_first_match() is replaced by
        is_single_inner_of_semi_join() (more general).
        * certain member functions used only by JOIN_CACHE, and quite specific of it
        (use_match_flag(),check_only_first_match()) are either removed and inserted
        in-line in JOIN_CACHE's code, or moved to being JOIN_CACHE member functions:
        that's the case for check_only_first_match().
        Same for is_last_inner_table() and get_first_inner_table(), which had
        misleading names (they apply only to semijoin inner tables served by
        FirstMatch, not all semijoin inner tables); this behaviour being caused
        by JOIN_CACHE, it's better if those specifics are inside JOIN_CACHE code
        and not in JOIN_TAB (putting them in JOIN_TAB suggested that they
        were generic-purpose, reusable functions).
        * Additionally, we add JOIN_CACHE::check_only_first_match, a cached value
        to avoid computing its value for each record (see sql_join_cache.cc)
[11 Jun 2010 12:28] Guilhem Bichot
queued to next-mr-opt-backporting
[11 Jun 2010 12:29] Guilhem Bichot
Evgeny's approval is recorded in BUG#51018 which is the same bug.
[14 Jun 2010 6:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/110966

3193 Tor Didriksen	2010-06-14
      Post push fix for BUG#50358, dont inline JOIN_TAB::get_sj_strategy
[16 Aug 2010 6:36] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:12] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[23 Nov 2010 3:13] Paul DuBois
Bug does not appear in any released 5.6.x version. No 5.6.1 changelog entry needed.