Bug #49845 Loosescan reports different result than other semijoin methods
Submitted: 21 Dec 2009 10:46 Modified: 22 Nov 2010 0:34
Reporter: Roy Lyseng Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:6.0 OS:Any
Assigned to: Øystein Grøvlen CPU Architecture:Any
Tags: LooseScan, optimizer_switch, semijoin, subquery

[21 Dec 2009 10:46] Roy Lyseng
Description:
A query from subselect3.test that was supposed to test LooseScan method only tested that plan was correct, but not that the result was correct.

There may be other tests that only run EXPLAIN as well, if results are not tested elsewhere, query results should be included.

How to repeat:
create table t0 (a int);
insert into t0 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table t1 (kp1 int, kp2 int, c int, filler char(100), key(kp1, kp2));
insert into t1 select A.a+10*(B.a+10*C.a), 0, 0, 'filler' from t0 A, t0 B, t0 C;
insert into t1 select * from t1 where kp1 < 20;
create table t3 (a int);
insert into t3 select A.a + 10*B.a from t0 A, t0 B;
create table t4 (pk int primary key);
insert into t4 select a from t3;

set optimizer_switch='default';
explain select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c);
select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c);

set optimizer_switch='loosescan=off';
explain select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c);
select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c);

set optimizer_switch='loosescan=off,materialization=off';
explain select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c);
select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c);

set optimizer_switch='loosescan=off,materialization=off,firstmatch=off';
explain select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c);
select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c);

drop table t1, t3, t4;
[21 Dec 2009 11:01] Valeriy Kravchuk
Verified just as described:

openxs@suse:/home2/openxs/dbs/6.0-code> bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 6.0.14-alpha-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create table t0 (a int);
Query OK, 0 rows affected (0.07 sec)

mysql> insert into t0 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
Query OK, 10 rows affected (0.01 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> create table t1 (kp1 int, kp2 int, c int, filler char(100), key(kp1, kp2));
Query OK, 0 rows affected (0.03 sec)

mysql> insert into t1 select A.a+10*(B.a+10*C.a), 0, 0, 'filler' from t0 A, t0 B, t0 C;
Query OK, 1000 rows affected (0.04 sec)
Records: 1000  Duplicates: 0  Warnings: 0

mysql> insert into t1 select * from t1 where kp1 < 20;
Query OK, 20 rows affected (0.03 sec)
Records: 20  Duplicates: 0  Warnings: 0

mysql> create table t3 (a int);
Query OK, 0 rows affected (0.01 sec)

mysql> insert into t3 select A.a + 10*B.a from t0 A, t0 B;
Query OK, 100 rows affected (0.00 sec)
Records: 100  Duplicates: 0  Warnings: 0

mysql> create table t4 (pk int primary key);
Query OK, 0 rows affected (0.02 sec)

mysql> insert into t4 select a from t3;
Query OK, 100 rows affected (0.00 sec)
Records: 100  Duplicates: 0  Warnings: 0

mysql>
mysql> set optimizer_switch='default';
Query OK, 0 rows affected (0.00 sec)

mysql> explain select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and
    -> t4.pk=t1.c);
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+---------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref       | rows | Extra                                       |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+---------------------------------------------+
|  1 | PRIMARY     | t1    | range  | kp1           | kp1     | 5       | NULL      |   48 | Using index condition; Using MRR; LooseScan |
|  1 | PRIMARY     | t4    | eq_ref | PRIMARY       | PRIMARY | 4       | test.t1.c |    1 | Using index; FirstMatch(t1)                 |
|  1 | PRIMARY     | t3    | ALL    | NULL          | NULL    | NULL    | NULL      |  100 | Using where; Using join buffer              |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+---------------------------------------------+
3 rows in set (0.00 sec)

mysql> select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c);
+------+
| a    |
+------+
|    0 |
|    0 |
|    1 |
|    1 |
|    2 |
|    2 |
|    3 |
|    3 |
|    4 |
|    4 |
|    5 |
|    5 |
|    6 |
|    6 |
|    7 |
|    7 |
|    8 |
|    8 |
|    9 |
|    9 |
|   10 |
|   10 |
|   11 |
|   11 |
|   12 |
|   12 |
|   13 |
|   13 |
|   14 |
|   14 |
|   15 |
|   15 |
|   16 |
|   16 |
|   17 |
|   17 |
|   18 |
|   18 |
|   19 |
|   19 |
+------+
40 rows in set (0.01 sec)

mysql>
mysql> set optimizer_switch='loosescan=off';
Query OK, 0 rows affected (0.00 sec)

mysql> explain select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and
    -> t4.pk=t1.c);
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref       | rows | Extra                                               |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------------------------------+
|  1 | PRIMARY     | t3    | ALL    | NULL          | NULL    | NULL    | NULL      |  100 | Using where                                         |
|  1 | PRIMARY     | t1    | range  | kp1           | kp1     | 5       | NULL      |   48 | Using index condition; Using MRR; Start materialize |
|  1 | PRIMARY     | t4    | eq_ref | PRIMARY       | PRIMARY | 4       | test.t1.c |    1 | Using index; End materialize                        |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------------------------------+
3 rows in set (0.00 sec)

mysql> select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c);
+------+
| a    |
+------+
|    0 |
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   16 |
|   17 |
|   18 |
|   19 |
+------+
20 rows in set (0.01 sec)

mysql>
mysql> set optimizer_switch='loosescan=off,materialization=off';
Query OK, 0 rows affected (0.00 sec)

mysql> explain select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and
    -> t4.pk=t1.c);
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref       | rows | Extra                       |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------+
|  1 | PRIMARY     | t3    | ALL    | NULL          | NULL    | NULL    | NULL      |  100 | Using where                 |
|  1 | PRIMARY     | t1    | ref    | kp1           | kp1     | 5       | test.t3.a |    1 |                             |
|  1 | PRIMARY     | t4    | eq_ref | PRIMARY       | PRIMARY | 4       | test.t1.c |    1 | Using index; FirstMatch(t3) |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------+
3 rows in set (0.00 sec)

mysql> select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c);
+------+
| a    |
+------+
|    0 |
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   16 |
|   17 |
|   18 |
|   19 |
+------+
20 rows in set (0.01 sec)

mysql>
mysql> set optimizer_switch='loosescan=off,materialization=off,firstmatch=off';
Query OK, 0 rows affected (0.00 sec)

mysql> explain select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and
    -> t4.pk=t1.c);
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+----------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref       | rows | Extra                      |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+----------------------------+
|  1 | PRIMARY     | t3    | ALL    | NULL          | NULL    | NULL    | NULL      |  100 | Using where                |
|  1 | PRIMARY     | t1    | ref    | kp1           | kp1     | 5       | test.t3.a |    1 | Start temporary            |
|  1 | PRIMARY     | t4    | eq_ref | PRIMARY       | PRIMARY | 4       | test.t1.c |    1 | Using index; End temporary |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+----------------------------+
3 rows in set (0.00 sec)

mysql> select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c);
+------+
| a    |
+------+
|    0 |
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   16 |
|   17 |
|   18 |
|   19 |
+------+
20 rows in set (0.00 sec)
[16 Mar 2010 9:45] Øystein Grøvlen
Minimal test case below. Note: Order of inserts into t1 is significant.

create table t1 (kp int, c int, key(kp));
insert into t1 values (0,0),(1,0),(0,0);
create table t3 (a int);
insert into t3 values (0),(1),(2);
create table t4 (pk int primary key);
insert into t4 values (0);

explain select * from t3 where a in (select t1.kp from t1,t4 where kp<2 and
t4.pk=t1.c);
select * from t3 where a in (select t1.kp from t1,t4 where kp<2 and t4.pk=t1.c);

drop table t1, t3, t4;
[16 Mar 2010 9:47] Øystein Grøvlen
Query execution plan for above test case:

id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY t4      system  PRIMARY NULL    NULL    NULL    1
1       PRIMARY t1      range   kp      kp      5       NULL    2       Using index condition; Using where; Using MRR; LooseScan
1       PRIMARY t3      ALL     NULL    NULL    NULL    NULL    3       Using where; Using join buffer
[16 Mar 2010 9:55] Øystein Grøvlen
Turns out that table t4 does not have to have a primary key for this to be
reproducable.
[16 Mar 2010 10:15] Øystein Grøvlen
Problem goes away with
  set optimizer_switch='mrr=off';
[24 Mar 2010 14:12] Øystein Grøvlen
The issue seems to be that when MRR is used, rows are not returned in sorted
order on index key.  A sorted order is needed for LooseScan to work since it
relies on that duplicates are grouped together.

If this hypothesis is correct, what is needed is a call to 
QUICK_RANGE_SELECT::need_sorted_output().
[24 Mar 2010 22:13] Øystein Grøvlen
Removing MRR tag since it is not a problem with MRR implementation, but how
MRR is used.
[25 Mar 2010 8:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/104283

3853 oystein.grovlen@sun.com	2010-03-25
      Bug#49845 Loosescan reports different result than other semijoin methods
      
      Duplicate elimination for LooseScan assumes that table is scanned in key order.
      However, MRR will by default return rows ordered by rowid (DiskSweep MRR).
      
      Fix: Tell range select to deliver a sorted output.
     @ mysql-test/r/subselect3.result
        Updated result file.
     @ mysql-test/r/subselect3_jcl6.result
        Updated result file.
     @ mysql-test/t/subselect3.test
        A test case for this bug already existed, but it was only explained, not
        executed.  Add execution of test case.
     @ sql/sql_select.cc
        For LooseScan, make sure output from index range scans are sorted by key.
[16 Apr 2010 6:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/105803

3873 oystein.grovlen@sun.com	2010-04-16
      Bug#49845 Loosescan reports different result than other semijoin methods
      
      Duplicate elimination for LooseScan assumes that table is scanned in key order.
      However, MRR will by default return rows ordered by rowid (DiskSweep MRR).
      
      Fix: Tell range select to deliver a sorted output.
     @ mysql-test/r/subselect3.result
        Updated result file.
     @ mysql-test/r/subselect3_jcl6.result
        Updated result file.
     @ mysql-test/t/subselect3.test
        A test case for this bug already existed, but it was only explained, not
        executed.  Add execution of test case.
     @ sql/sql_select.cc
        For LooseScan, make sure output from index range scans are sorted by key.
[21 Apr 2010 12:52] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/106254

3830 oystein.grovlen@sun.com	2010-04-21
      Bug#49845 Loosescan reports different result than other semijoin methods
      
      Duplicate elimination for LooseScan assumes that table is scanned in key order.
      However, MRR will by default return rows ordered by rowid (DiskSweep MRR).
      
      Fix: Tell range select to deliver a sorted output.
     @ mysql-test/r/subselect3.result
        Updated result file.
     @ mysql-test/r/subselect3_jcl6.result
        Updated result file.
     @ mysql-test/t/subselect3.test
        A test case for this bug already existed, but it was only explained, not
        executed.  Add execution of test case.
     @ sql/sql_select.cc
        For LooseScan, make sure output from index range scans are sorted by key.
[21 Apr 2010 13:08] Øystein Grøvlen
Pushed to mysql-6.0-codebase bugfixing with revision-id 
oystein.grovlen@sun.com-20100421125153-h0c2sh0ju3eanlnf
[27 Apr 2010 9:46] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100427094135-5s49ecp3ckson6e2) (version source revid:alik@sun.com-20100427093843-uekr85qkd7orx12t) (merge vers: 6.0.14-alpha) (pib:16)
[11 May 2010 16:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/108019

3166 oystein.grovlen@sun.com	2010-05-11
      Bug#49845 Loosescan reports different result than other semijoin methods
      
      (Backporting of oystein.grovlen@sun.com-20100421125153-h0c2sh0ju3eanlnf)
      
      Duplicate elimination for LooseScan assumes that table is scanned in key order.
      However, MRR will by default return rows ordered by rowid (DiskSweep MRR).
      
      Fix: Tell range select to deliver a sorted output.
     @ mysql-test/r/subselect3.result
        Updated result file.
     @ mysql-test/r/subselect3_jcl6.result
        Updated result file.
     @ mysql-test/t/subselect3.test
        A test case for this bug already existed, but it was only explained, not
        executed.  Add execution of test case.
     @ sql/sql_select.cc
        For LooseScan, make sure output from index range scans are sorted by key.
[12 May 2010 1:34] Paul DuBois
Noted in 6.0.14 changelog.

The LooseScan semijoin strategy could return incorrect results with
Multi-Range Read (MRR) enabled because MRR did not return rows in key
order as required for LooseScan.
[16 Aug 2010 6:38] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:18] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[22 Nov 2010 0:34] Paul DuBois
Noted in 5.6.1 changelog.
[23 Nov 2010 2:17] Paul DuBois
Correction: No 5.6.1 changelog entry. Bug does not appear in any released 5.6.x version.