MySQL Bugs: #53298: wrong result with semijoin (no semijoin strategy chosen)

Bug #53298	wrong result with semijoin (no semijoin strategy chosen)
Submitted:	29 Apr 2010 21:15	Modified:	23 Nov 2010 3:28
Reporter:	Guilhem Bichot	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S3 (Non-critical)
Version:	6.0-codebase-bugfixing	OS:	Any
Assigned to:	Guilhem Bichot	CPU Architecture:	Any
Tags:	optimizer_switch, semijoin

Description:
I have
alfranio.correia@sun.com-20100426092226-bnkzpjs8uqkl7l16

create table t1 (
  uid int, fid int, index(uid)
) engine=myisam;
insert into t1 values
  (1,1), (1,2), (1,3), (1,4),
  (2,5), (2,6), (2,7), (2,8),
  (3,1), (3,2), (3,9);

create table t2 (
  uid int primary key, name varchar(128), index(name)
) engine=myisam;
insert into t2 values 
  (1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"),
  (6, "F"), (7, "G"), (8, "H"), (9, "I");

create table t3 (
  uid int, fid int, index(uid)
) engine=myisam;
insert into t3 values
  (1,1), (1,2), (1,3), (1,3),(1,4),
  (2,5), (2,6), (2,7), (2,8),
  (3,1), (3,2), (3,9);

create table t4 (
  uid int primary key, name varchar(128), index(name)
) engine=myisam;
insert into t4 values 
  (1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"),
  (6, "F"), (7, "G"), (8, "H"), (9, "I");

# this returns 1,2,3,3,4
select t4.uid from t4, t3 where t3.uid=1 and t4.uid=t3.fid;

explain select name from t2, t1 
  where t1.uid in (select t4.uid from t4, t3 where t3.uid=1 and t4.uid=t3.fid)
        and t2.uid=t1.fid;

--sorted_result
select name from t2, t1 
  where t1.uid in (select t4.uid from t4, t3 where t3.uid=1 and t4.uid=t3.fid)
        and t2.uid=t1.fid; # buggy

# should be same
--sorted_result
select name from t2, t1 
  where t1.uid in (1,3,2,3,4)
        and t2.uid=t1.fid; # correct

drop table t1,t2,t3,t4;

I observe that the "buggy" SELECT returns too many rows; for example A is returned 3 times though it should only be returned two times as in the "correct" SELECT. B and I are also returned too many times.
Goes away with optimimizer_switch=semijoin=off.
Unaffected by optimimizer_switch=X=off where X is materialization, firstmatch, loosescan.

I don't know if it's related to BUG#31480.

How to repeat:
see description

Suggested fix:
Interestingly, EXPLAIN shows:
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY t3      ref     uid     uid     5       const   5       Using where
1       PRIMARY t1      ALL     uid     NULL    NULL    NULL    11      Using where; Using join buffer
1       PRIMARY t4      eq_ref  PRIMARY PRIMARY 4       test.t3.fid     1       Using index
1       PRIMARY t2      eq_ref  PRIMARY PRIMARY 4       test.t1.fid     1       

we see no semijoin strategy. Logically, this means everything has been pulled out.

Thank you for the bug report. Verified as described.

maybe not related to BUG#31480, at least the patch submitted in that bug report doesn't help here.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107392

3863 Guilhem Bichot	2010-05-04
      Fix for BUG#53298 "wrong result with semijoin (no semijoin strategy chosen)".
      Reviewers: please see questions marked with "QQ"
     @ mysql-test/r/subselect3.result
        fix causes slight plan change
     @ mysql-test/r/subselect4.result
        before the code fix, the final select would return two rows (1 and 1).
     @ mysql-test/r/subselect_sj2.result
        Before the code fix, EXPLAIN showed a wrong plan: adding two Australian towns to table t1, each of them
        having a population greater than 5MB, led the SELECT to return Australia twice (duplicate);
        after this fix, Australia would be returned once.
     @ mysql-test/t/subselect4.test
        test for bug
     @ sql/sql_select.cc
        In the bug's scenario, t3 and t4 are inner tables of the semijoin. t4 is pulled out
        (functionally dependent), but t3 is not. As we thus have one inner table left,
        we need a semijoin strategy; without semijoin strategy, the semijoin just becomes
        a plain join, leading to duplicate rows. Why didn't we have a semijoin strategy:
        - plan search would first choose this order: t4,t3,t1,t2, with duplicate weedout
        as semijoin strategy. As this handles the semijoin fully, join->cur_dups_producing_tables
        is 0 (as a result of choosing duplicates weedout in advance_sj_state() for plan's last
        table t2).
        - then plan search evaluates plan t4,t3,t2,t1: it first computes its cost, then in
        avdance_sj_state(), as join->cur_dups_producing_tables is still 0 (the bug),
        it believes that semijoin inner tables have been handled already, so doesn't feel
        obliged to choose a semijoin strategy, so this plan has no strategy, and as its
        cost is lower than that of t4,t3,t1,t2, it is chosen. A plan without a strategy,
        thus duplicates.
        The fix: as in advance_sj_state() we modify certain JOIN members, we need to
        restore them when we back-track to other plans. backout_nj_sj_state() already
        restored cur_sj_inner_tables, but forgot to restore cur_dups_producing_tables.
        We here make a class which contains all that backout_nj_sj_state() should
        restore.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

http://lists.mysql.com/commits/108313

3863 Guilhem Bichot 2010-05-14
Fix for BUG#53298 "wrong result with semijoin (no semijoin strategy chosen)".
@ mysql-test/r/subselect3.result
Without the fix, we were comparing the cost of first-considered plan
"t11,t12,t21,t22" (with materialization scan for (t11,t12)) with the
cost of partial plan "t11,t12,t22" (inner joins). Yes, inner joins,
because when we considered that partial plan, cur_dups_producing_tables
was 0 (had not been reset, which is the bug): this said that no
semijoin strategy was needed so inner join was used.
"t11,t12,t22" was then pruned by the heuristic of
optimizer_prune_level=1, because it had a higher record count.
With the fix, when considering "t11,t12,t22" we correctly realize that
this needs a semijoin strategy, choose materialization scan, and go
down to "t11,t12,t22,t21", which is less costly than
"t11,t12,t21,t22" (11.986 vs 12). Hence the plan change.
@ mysql-test/r/subselect4.result
before the code fix, the final select would return two rows (1 and 1).
@ mysql-test/r/subselect_sj2.result
Before the code fix, EXPLAIN showed a wrong plan: adding two Australian towns to table t1, each of them
having a population greater than 5MB, led the SELECT to return Australia twice (duplicate);
after this fix, Australia would be returned once.
@ mysql-test/t/subselect4.test
test for bug
@ sql/sql_select.cc
In the bug's scenario, t3 and t4 are inner tables of the semijoin. t4 is pulled out
(functionally dependent), but t3 is not. As we thus have one inner table left,
we need a semijoin strategy; without semijoin strategy, the semijoin just becomes
a plain join, leading to duplicate rows. Why didn't we have a semijoin strategy:
- plan search would first choose this order: t4,t3,t1,t2, with duplicate weedout
as semijoin strategy. As this handles the semijoin fully, join->cur_dups_producing_tables
is 0 (as a result of choosing duplicates weedout in advance_sj_state() for plan's last
table t2).
- then plan search evaluates plan t4,t3,t2,t1: it first computes its cost, then in
avdance_sj_state(), as join->cur_dups_producing_tables is still 0 (the bug),
it believes that semijoin inner tables have been handled already, so doesn't feel
obliged to choose a semijoin strategy, so this plan has no strategy, and as its
cost is lower than that of t4,t3,t1,t2, it is chosen. A plan without a strategy,
thus duplicates.
The fix: move JOIN::cur_dups_producing_tables to POSITION (which tracks the
optimization state at the partial plan's stage); as the function
comment of advance_sj_state() says, "no undo is necessary" then. When
we are about to look at the first table, we don't have duplicate-producing
tables yet (=0); when we go to table N+1, we inherit the same duplicate-
producing tables (=pos[-1].etc), and we add more tables if we hit a
semijoin nest, and we remove tables when we pick a semijoin strategy.
@ sql/sql_select.h
move JOIN::cur_dups_producing_tables to POSITION

Pushed into 6.0.14-alpha (revid:alik@sun.com-20100520100225-oe4iuu5kuzsx0knq) (version source revid:alik@sun.com-20100520100057-rmn5y3o3ij726bm7) (merge vers: 6.0.14-alpha) (pib:16)

this has reached 6.0-codebase as well as the next-mr-opt-backporting tree.

"For some queries having subqueries in the WHERE clause (semijoin), the semijoin was wrongly transformed into an inner join leading to excess rows in the result"

Noted in 6.0.14 changelog.

Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)

Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)

Bug does not appear in any released 5.6.x version. No 5.6.1 changelog entry needed.