Bug #53298 wrong result with semijoin (no semijoin strategy chosen)
Submitted: 29 Apr 2010 21:15 Modified: 23 Nov 2010 3:28
Reporter: Guilhem Bichot Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:6.0-codebase-bugfixing OS:Any
Assigned to: Guilhem Bichot CPU Architecture:Any
Tags: optimizer_switch, semijoin
Triage: Triaged: D2 (Serious)

[29 Apr 2010 21:15] Guilhem Bichot
Description:
I have
alfranio.correia@sun.com-20100426092226-bnkzpjs8uqkl7l16

create table t1 (
  uid int, fid int, index(uid)
) engine=myisam;
insert into t1 values
  (1,1), (1,2), (1,3), (1,4),
  (2,5), (2,6), (2,7), (2,8),
  (3,1), (3,2), (3,9);

create table t2 (
  uid int primary key, name varchar(128), index(name)
) engine=myisam;
insert into t2 values 
  (1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"),
  (6, "F"), (7, "G"), (8, "H"), (9, "I");

create table t3 (
  uid int, fid int, index(uid)
) engine=myisam;
insert into t3 values
  (1,1), (1,2), (1,3), (1,3),(1,4),
  (2,5), (2,6), (2,7), (2,8),
  (3,1), (3,2), (3,9);

create table t4 (
  uid int primary key, name varchar(128), index(name)
) engine=myisam;
insert into t4 values 
  (1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"),
  (6, "F"), (7, "G"), (8, "H"), (9, "I");

# this returns 1,2,3,3,4
select t4.uid from t4, t3 where t3.uid=1 and t4.uid=t3.fid;

explain select name from t2, t1 
  where t1.uid in (select t4.uid from t4, t3 where t3.uid=1 and t4.uid=t3.fid)
        and t2.uid=t1.fid;

--sorted_result
select name from t2, t1 
  where t1.uid in (select t4.uid from t4, t3 where t3.uid=1 and t4.uid=t3.fid)
        and t2.uid=t1.fid; # buggy

# should be same
--sorted_result
select name from t2, t1 
  where t1.uid in (1,3,2,3,4)
        and t2.uid=t1.fid; # correct

drop table t1,t2,t3,t4;

I observe that the "buggy" SELECT returns too many rows; for example A is returned 3 times though it should only be returned two times as in the "correct" SELECT. B and I are also returned too many times.
Goes away with optimimizer_switch=semijoin=off.
Unaffected by optimimizer_switch=X=off where X is materialization, firstmatch, loosescan.

I don't know if it's related to BUG#31480.

How to repeat:
see description

Suggested fix:
Interestingly, EXPLAIN shows:
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY t3      ref     uid     uid     5       const   5       Using where
1       PRIMARY t1      ALL     uid     NULL    NULL    NULL    11      Using where; Using join buffer
1       PRIMARY t4      eq_ref  PRIMARY PRIMARY 4       test.t3.fid     1       Using index
1       PRIMARY t2      eq_ref  PRIMARY PRIMARY 4       test.t1.fid     1       

we see no semijoin strategy. Logically, this means everything has been pulled out.
[29 Apr 2010 23:29] Godofredo Miguel Solorzano
Thank you for the bug report. Verified as described.
[30 Apr 2010 9:04] Guilhem Bichot
maybe not related to BUG#31480, at least the patch submitted in that bug report doesn't help here.
[4 May 2010 20:15] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107392

3863 Guilhem Bichot	2010-05-04
      Fix for BUG#53298 "wrong result with semijoin (no semijoin strategy chosen)".
      Reviewers: please see questions marked with "QQ"
     @ mysql-test/r/subselect3.result
        fix causes slight plan change
     @ mysql-test/r/subselect4.result
        before the code fix, the final select would return two rows (1 and 1).
     @ mysql-test/r/subselect_sj2.result
        Before the code fix, EXPLAIN showed a wrong plan: adding two Australian towns to table t1, each of them
        having a population greater than 5MB, led the SELECT to return Australia twice (duplicate);
        after this fix, Australia would be returned once.
     @ mysql-test/t/subselect4.test
        test for bug
     @ sql/sql_select.cc
        In the bug's scenario, t3 and t4 are inner tables of the semijoin. t4 is pulled out
        (functionally dependent), but t3 is not. As we thus have one inner table left,
        we need a semijoin strategy; without semijoin strategy, the semijoin just becomes
        a plain join, leading to duplicate rows. Why didn't we have a semijoin strategy:
        - plan search would first choose this order: t4,t3,t1,t2, with duplicate weedout
        as semijoin strategy. As this handles the semijoin fully, join->cur_dups_producing_tables
        is 0 (as a result of choosing duplicates weedout in advance_sj_state() for plan's last
        table t2).
        - then plan search evaluates plan t4,t3,t2,t1: it first computes its cost, then in
        avdance_sj_state(), as join->cur_dups_producing_tables is still 0 (the bug),
        it believes that semijoin inner tables have been handled already, so doesn't feel
        obliged to choose a semijoin strategy, so this plan has no strategy, and as its
        cost is lower than that of t4,t3,t1,t2, it is chosen. A plan without a strategy,
        thus duplicates.
        The fix: as in advance_sj_state() we modify certain JOIN members, we need to
        restore them when we back-track to other plans. backout_nj_sj_state() already
        restored cur_sj_inner_tables, but forgot to restore cur_dups_producing_tables.
        We here make a class which contains all that backout_nj_sj_state() should
        restore.
[14 May 2010 11:17] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/108313

3863 Guilhem Bichot	2010-05-14
      Fix for BUG#53298 "wrong result with semijoin (no semijoin strategy chosen)".
     @ mysql-test/r/subselect3.result
        Without the fix, we were comparing the cost of first-considered plan
        "t11,t12,t21,t22" (with materialization scan for (t11,t12)) with the
        cost of partial plan "t11,t12,t22" (inner joins). Yes, inner joins,
        because when we considered that partial plan, cur_dups_producing_tables
        was 0 (had not been reset, which is the bug): this said that no
        semijoin strategy was needed so inner join was used.
        "t11,t12,t22" was then pruned by the heuristic of
        optimizer_prune_level=1, because it had a higher record count.
        With the fix, when considering "t11,t12,t22" we correctly realize that
        this needs a semijoin strategy, choose materialization scan, and go
        down to "t11,t12,t22,t21", which is less costly than
        "t11,t12,t21,t22" (11.986 vs 12). Hence the plan change.
     @ mysql-test/r/subselect4.result
        before the code fix, the final select would return two rows (1 and 1).
     @ mysql-test/r/subselect_sj2.result
        Before the code fix, EXPLAIN showed a wrong plan: adding two Australian towns to table t1, each of them
        having a population greater than 5MB, led the SELECT to return Australia twice (duplicate);
        after this fix, Australia would be returned once.
     @ mysql-test/t/subselect4.test
        test for bug
     @ sql/sql_select.cc
        In the bug's scenario, t3 and t4 are inner tables of the semijoin. t4 is pulled out
        (functionally dependent), but t3 is not. As we thus have one inner table left,
        we need a semijoin strategy; without semijoin strategy, the semijoin just becomes
        a plain join, leading to duplicate rows. Why didn't we have a semijoin strategy:
        - plan search would first choose this order: t4,t3,t1,t2, with duplicate weedout
        as semijoin strategy. As this handles the semijoin fully, join->cur_dups_producing_tables
        is 0 (as a result of choosing duplicates weedout in advance_sj_state() for plan's last
        table t2).
        - then plan search evaluates plan t4,t3,t2,t1: it first computes its cost, then in
        avdance_sj_state(), as join->cur_dups_producing_tables is still 0 (the bug),
        it believes that semijoin inner tables have been handled already, so doesn't feel
        obliged to choose a semijoin strategy, so this plan has no strategy, and as its
        cost is lower than that of t4,t3,t1,t2, it is chosen. A plan without a strategy,
        thus duplicates.
        The fix: move JOIN::cur_dups_producing_tables to POSITION (which tracks the
        optimization state at the partial plan's stage); as the function
        comment of advance_sj_state() says, "no undo is necessary" then. When
        we are about to look at the first table, we don't have duplicate-producing
        tables yet (=0); when we go to table N+1, we inherit the same duplicate-
        producing tables (=pos[-1].etc), and we add more tables if we hit a
        semijoin nest, and we remove tables when we pick a semijoin strategy.
     @ sql/sql_select.h
        move JOIN::cur_dups_producing_tables to POSITION
[20 May 2010 10:04] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100520100225-oe4iuu5kuzsx0knq) (version source revid:alik@sun.com-20100520100057-rmn5y3o3ij726bm7) (merge vers: 6.0.14-alpha) (pib:16)
[21 May 2010 14:18] Guilhem Bichot
this has reached 6.0-codebase as well as the next-mr-opt-backporting tree.
[21 May 2010 14:20] Guilhem Bichot
"For some queries having subqueries in the WHERE clause (semijoin), the semijoin was wrongly transformed into an inner join leading to excess rows in the result"
[22 May 2010 14:20] Paul Dubois
Noted in 6.0.14 changelog.
[16 Aug 2010 6:39] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:14] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[23 Nov 2010 3:28] Paul Dubois
Bug does not appear in any released 5.6.x version. No 5.6.1 changelog entry needed.