Bug #53060 LooseScan semijoin strategy does not return all rows
Submitted: 22 Apr 2010 13:28 Modified: 23 Nov 2010 3:24
Reporter: Øystein Grøvlen Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:6.0 OS:Any
Assigned to: Øystein Grøvlen CPU Architecture:Any
Tags: LooseScan, optimizer_switch, semijoin, subquery

[22 Apr 2010 13:28] Øystein Grøvlen
Description:
With a non-unique index on t2(i,j), the following query does not return
correct result when LooseScan strategy is used:

SELECT * FROM t1 WHERE (i) IN (SELECT i FROM t2 where j > 0);
i
1

If LooseScan is disabled, the result is correct:

SELECT * FROM t1 WHERE (i) IN (SELECT i FROM t2 where j > 0);
i
1
2

How to repeat:
CREATE TABLE t1 (i INTEGER);
INSERT INTO t1 VALUES (1), (2), (3), (4), (5);
CREATE TABLE t2 (i INTEGER, j INTEGER, KEY k(i, j));
INSERT INTO t2 VALUES (1, 0), (1, 1), (2, 0), (2, 1);

EXPLAIN
SELECT * FROM t1 WHERE (i) IN (SELECT i FROM t2 where j > 0);
SELECT * FROM t1 WHERE (i) IN (SELECT i FROM t2 where j > 0);

SET SESSION optimizer_switch='loosescan=off';

EXPLAIN
SELECT * FROM t1 WHERE (i) IN (SELECT i FROM t2 where j > 0);
SELECT * FROM t1 WHERE (i) IN (SELECT i FROM t2 where j > 0);

DROP TABLE t1, t2;
[22 Apr 2010 13:47] Øystein Grøvlen
The problem is in sub_select() where loosescan_match_tab->found_match is not reset when moving to the next key value. This means that a key value for which a match has not yest been found, may be copied into loosescan_buf.  And all later records with the same key value will then be skipped.  

With the following change, LooseScan gives a correct result:

=== modified file 'sql/sql_select.cc'
--- sql/sql_select.cc	2010-04-17 06:34:02 +0000
+++ sql/sql_select.cc	2010-04-22 13:44:01 +0000
@@ -16735,6 +16735,9 @@ sub_select(JOIN *join,JOIN_TAB *join_tab
       continue;
     }
 
+    if (join_tab->loosescan_match_tab) 
+      join_tab->loosescan_match_tab->found_match= false;
+
     if (join_tab->keep_current_rowid)
       join_tab->table->file->position(join_tab->table->record[0]);
[23 Apr 2010 11:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/106425

3834 oystein.grovlen@sun.com	2010-04-23
      BUG#53060: LooseScan semijoin strategy does not return all rows
      
      Problem: In sub_select(), loosescan_match_tab->found_match is not reset when
      moving to the next key value. This means that a key value for which a match 
      has not yet been found, may be copied into loosescan_buf.  All later records
      with the same key value will then be skipped.
      
      Solution: Set loosescan_match_tab->found_match to FALSE after a match has
      been handled.  This way, a new match is needed before a new value is copied 
      into loosescan_buf.
      
      Addtional change:  Reset skip_true when moving to a new key value.  This way,
      new key comparisons are not made until there actually has been a new match.
     @ mysql-test/r/subselect_sj.result
        Test case for Bug#53060
     @ mysql-test/r/subselect_sj_jcl6.result
        Test case for Bug#53060
     @ mysql-test/t/subselect_sj.test
        Test case for Bug#53060
     @ sql/sql_select.cc
        Reset loosescan_match_tab->found_match to FALSE after a match has
        been handled.  This ensures that a new match has been found before a new value is copied into loosescan_buf.
        
        Reset skip_true when moving to a new key value.  This way,
        new key comparisons are not made until there actually has been a new match.
        
        Moved declaration of skip_over since it is only used inside the while-loop?
[21 May 2010 10:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/108863

3888 oystein.grovlen@sun.com	2010-05-21
      BUG#53060: LooseScan semijoin strategy does not return all rows
      
      Problem: In sub_select(), loosescan_match_tab->found_match is not reset when
      moving to the next key value. This means that a key value for which a match 
      has not yet been found, may be copied into loosescan_buf.  All later records
      with the same key value will then be skipped.
      
      Solution: Reset loosescan_match_tab->found_match to FALSE when the current
      key does not match the previous key.  Now a new value is not copied and used
      for comparison until a new match has been found for the duplicate-generating
      range.
      
      Additional change:  Moved the whole LooseScan logic to evaluate_join_record
      to colocate it with similar duplicate-avoiding logic for other semijoin
      strategies.
     @ mysql-test/r/subselect_sj.result
        Test case for Bug#53060
     @ mysql-test/r/subselect_sj_jcl6.result
        Test case for Bug#53060
     @ mysql-test/t/subselect_sj.test
        Test case for Bug#53060
     @ sql/sql_select.cc
        Moved the whole LooseScan logic to evaluate_join_record
        to colocate it with similar duplicate-avoiding logic for other semijoin
        strategies.
        
        Reset loosescan_match_tab->found_match to FALSE when a new key is
        encountered.  This ensures that a until new match has been found, no
        comparison is done, and that only key values for rows where a match has been found is copied into loosescan_buf.
[9 Jun 2010 7:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/110573

3188 oystein.grovlen@sun.com	2010-06-09
      BUG#53060: LooseScan semijoin strategy does not return all rows
      
      Problem: In sub_select(), loosescan_match_tab->found_match is not reset when
      moving to the next key value. This means that a key value for which a match 
      has not yet been found, may be copied into loosescan_buf.  All later records
      with the same key value will then be skipped.
      
      Solution: Reset loosescan_match_tab->found_match to FALSE when the current
      key does not match the previous key.  Now a new value is not copied and used
      for comparison until a new match has been found for the duplicate-generating
      range.
      
      Additional change:  Moved much of this LooseScan logic into two new JOIN_TAB 
      member functions. loosescan_duplicate() checks if current row is a duplicate,
      and loosescan_copy_key() copies key of current row if necessary.
      Also moved the use of these functions to  evaluate_join_record()
      to co-locate it with similar duplicate-avoiding logic for other semijoin
      strategies.
     @ mysql-test/r/subselect_sj.result
        Test case for Bug#53060
     @ mysql-test/r/subselect_sj_jcl6.result
        Test case for Bug#53060
     @ mysql-test/t/subselect_sj.test
        Test case for Bug#53060
     @ sql/sql_select.cc
        Moved the whole LooseScan logic into two new JOIN_TAB::member functions. These are now called from evaluate_join_record() in order to
        to co-locate it with similar duplicate-avoiding logic for other semijoin
        strategies.
        
        Reset loosescan_match_tab->found_match to FALSE when a new key is
        encountered.  This ensures that a until new match has been found, no
        comparison is done, and that only key values for rows where a match
        has been found is copied into loosescan_buf.
     @ sql/sql_select.h
        Added two new JOIN_TAB member functions to be used to determine when
        to skip duplicates for LooseScan semijoin strategy.
        loosescan_duplicate() checks if current row is a duplicate,
        and loosescan_copy_key() copies key of current row if necessary.
[9 Jun 2010 11:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/110602

3188 oystein.grovlen@sun.com	2010-06-09
      BUG#53060: LooseScan semijoin strategy does not return all rows
      
      Problem: In sub_select(), loosescan_match_tab->found_match is not reset when
      moving to the next key value. This means that a key value for which a match 
      has not yet been found, may be copied into loosescan_buf.  All later records
      with the same key value will then be skipped.
      
      Solution: Reset loosescan_match_tab->found_match to FALSE when the current
      key does not match the previous key.  Now a new value is not copied and used
      for comparison until a new match has been found for the duplicate-generating
      range.
      
      Additional change:  Moved the whole LooseScan logic to evaluate_join_record
      to colocate it with similar duplicate-avoiding logic for other semijoin
      strategies.
     @ mysql-test/r/subselect_sj.result
        Test case for Bug#53060
     @ mysql-test/r/subselect_sj_jcl6.result
        Test case for Bug#53060
     @ mysql-test/t/subselect_sj.test
        Test case for Bug#53060
     @ sql/sql_select.cc
        Moved the whole LooseScan logic to evaluate_join_record
        to colocate it with similar duplicate-avoiding logic for other semijoin
        strategies.
        
        Reset loosescan_match_tab->found_match to FALSE when a new key is
        encountered.  This ensures that a until new match has been found, no
        comparison is done, and that only key values for rows where a match
        has been found is copied into loosescan_buf.
[9 Jun 2010 11:05] Øystein Grøvlen
Patch pushed to mysql-next-mr-opt-backporting with revision id oystein.grovlen@sun.com-20100609110207-3ghzyo7roslb07yr
[16 Aug 2010 6:39] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:26] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[23 Nov 2010 3:24] Paul DuBois
Bug does not appear in any released 5.6.x version. No 5.6.1 changelog entry needed.