Bug #49845 | Loosescan reports different result than other semijoin methods | ||
---|---|---|---|
Submitted: | 21 Dec 2009 10:46 | Modified: | 22 Nov 2010 0:34 |
Reporter: | Roy Lyseng | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
Version: | 6.0 | OS: | Any |
Assigned to: | Øystein Grøvlen | CPU Architecture: | Any |
Tags: | LooseScan, optimizer_switch, semijoin, subquery |
[21 Dec 2009 10:46]
Roy Lyseng
[21 Dec 2009 11:01]
Valeriy Kravchuk
Verified just as described: openxs@suse:/home2/openxs/dbs/6.0-code> bin/mysql -uroot test Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 6.0.14-alpha-debug Source distribution Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> create table t0 (a int); Query OK, 0 rows affected (0.07 sec) mysql> insert into t0 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); Query OK, 10 rows affected (0.01 sec) Records: 10 Duplicates: 0 Warnings: 0 mysql> create table t1 (kp1 int, kp2 int, c int, filler char(100), key(kp1, kp2)); Query OK, 0 rows affected (0.03 sec) mysql> insert into t1 select A.a+10*(B.a+10*C.a), 0, 0, 'filler' from t0 A, t0 B, t0 C; Query OK, 1000 rows affected (0.04 sec) Records: 1000 Duplicates: 0 Warnings: 0 mysql> insert into t1 select * from t1 where kp1 < 20; Query OK, 20 rows affected (0.03 sec) Records: 20 Duplicates: 0 Warnings: 0 mysql> create table t3 (a int); Query OK, 0 rows affected (0.01 sec) mysql> insert into t3 select A.a + 10*B.a from t0 A, t0 B; Query OK, 100 rows affected (0.00 sec) Records: 100 Duplicates: 0 Warnings: 0 mysql> create table t4 (pk int primary key); Query OK, 0 rows affected (0.02 sec) mysql> insert into t4 select a from t3; Query OK, 100 rows affected (0.00 sec) Records: 100 Duplicates: 0 Warnings: 0 mysql> mysql> set optimizer_switch='default'; Query OK, 0 rows affected (0.00 sec) mysql> explain select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and -> t4.pk=t1.c); +----+-------------+-------+--------+---------------+---------+---------+-----------+------+---------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+---------------+---------+---------+-----------+------+---------------------------------------------+ | 1 | PRIMARY | t1 | range | kp1 | kp1 | 5 | NULL | 48 | Using index condition; Using MRR; LooseScan | | 1 | PRIMARY | t4 | eq_ref | PRIMARY | PRIMARY | 4 | test.t1.c | 1 | Using index; FirstMatch(t1) | | 1 | PRIMARY | t3 | ALL | NULL | NULL | NULL | NULL | 100 | Using where; Using join buffer | +----+-------------+-------+--------+---------------+---------+---------+-----------+------+---------------------------------------------+ 3 rows in set (0.00 sec) mysql> select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c); +------+ | a | +------+ | 0 | | 0 | | 1 | | 1 | | 2 | | 2 | | 3 | | 3 | | 4 | | 4 | | 5 | | 5 | | 6 | | 6 | | 7 | | 7 | | 8 | | 8 | | 9 | | 9 | | 10 | | 10 | | 11 | | 11 | | 12 | | 12 | | 13 | | 13 | | 14 | | 14 | | 15 | | 15 | | 16 | | 16 | | 17 | | 17 | | 18 | | 18 | | 19 | | 19 | +------+ 40 rows in set (0.01 sec) mysql> mysql> set optimizer_switch='loosescan=off'; Query OK, 0 rows affected (0.00 sec) mysql> explain select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and -> t4.pk=t1.c); +----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------------------------------+ | 1 | PRIMARY | t3 | ALL | NULL | NULL | NULL | NULL | 100 | Using where | | 1 | PRIMARY | t1 | range | kp1 | kp1 | 5 | NULL | 48 | Using index condition; Using MRR; Start materialize | | 1 | PRIMARY | t4 | eq_ref | PRIMARY | PRIMARY | 4 | test.t1.c | 1 | Using index; End materialize | +----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------------------------------+ 3 rows in set (0.00 sec) mysql> select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c); +------+ | a | +------+ | 0 | | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | | 7 | | 8 | | 9 | | 10 | | 11 | | 12 | | 13 | | 14 | | 15 | | 16 | | 17 | | 18 | | 19 | +------+ 20 rows in set (0.01 sec) mysql> mysql> set optimizer_switch='loosescan=off,materialization=off'; Query OK, 0 rows affected (0.00 sec) mysql> explain select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and -> t4.pk=t1.c); +----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------+ | 1 | PRIMARY | t3 | ALL | NULL | NULL | NULL | NULL | 100 | Using where | | 1 | PRIMARY | t1 | ref | kp1 | kp1 | 5 | test.t3.a | 1 | | | 1 | PRIMARY | t4 | eq_ref | PRIMARY | PRIMARY | 4 | test.t1.c | 1 | Using index; FirstMatch(t3) | +----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------+ 3 rows in set (0.00 sec) mysql> select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c); +------+ | a | +------+ | 0 | | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | | 7 | | 8 | | 9 | | 10 | | 11 | | 12 | | 13 | | 14 | | 15 | | 16 | | 17 | | 18 | | 19 | +------+ 20 rows in set (0.01 sec) mysql> mysql> set optimizer_switch='loosescan=off,materialization=off,firstmatch=off'; Query OK, 0 rows affected (0.00 sec) mysql> explain select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and -> t4.pk=t1.c); +----+-------------+-------+--------+---------------+---------+---------+-----------+------+----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+---------------+---------+---------+-----------+------+----------------------------+ | 1 | PRIMARY | t3 | ALL | NULL | NULL | NULL | NULL | 100 | Using where | | 1 | PRIMARY | t1 | ref | kp1 | kp1 | 5 | test.t3.a | 1 | Start temporary | | 1 | PRIMARY | t4 | eq_ref | PRIMARY | PRIMARY | 4 | test.t1.c | 1 | Using index; End temporary | +----+-------------+-------+--------+---------------+---------+---------+-----------+------+----------------------------+ 3 rows in set (0.00 sec) mysql> select * from t3 where a in (select t1.kp1 from t1,t4 where kp1<20 and t4.pk=t1.c); +------+ | a | +------+ | 0 | | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | | 7 | | 8 | | 9 | | 10 | | 11 | | 12 | | 13 | | 14 | | 15 | | 16 | | 17 | | 18 | | 19 | +------+ 20 rows in set (0.00 sec)
[16 Mar 2010 9:45]
Øystein Grøvlen
Minimal test case below. Note: Order of inserts into t1 is significant. create table t1 (kp int, c int, key(kp)); insert into t1 values (0,0),(1,0),(0,0); create table t3 (a int); insert into t3 values (0),(1),(2); create table t4 (pk int primary key); insert into t4 values (0); explain select * from t3 where a in (select t1.kp from t1,t4 where kp<2 and t4.pk=t1.c); select * from t3 where a in (select t1.kp from t1,t4 where kp<2 and t4.pk=t1.c); drop table t1, t3, t4;
[16 Mar 2010 9:47]
Øystein Grøvlen
Query execution plan for above test case: id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY t4 system PRIMARY NULL NULL NULL 1 1 PRIMARY t1 range kp kp 5 NULL 2 Using index condition; Using where; Using MRR; LooseScan 1 PRIMARY t3 ALL NULL NULL NULL NULL 3 Using where; Using join buffer
[16 Mar 2010 9:55]
Øystein Grøvlen
Turns out that table t4 does not have to have a primary key for this to be reproducable.
[16 Mar 2010 10:15]
Øystein Grøvlen
Problem goes away with set optimizer_switch='mrr=off';
[24 Mar 2010 14:12]
Øystein Grøvlen
The issue seems to be that when MRR is used, rows are not returned in sorted order on index key. A sorted order is needed for LooseScan to work since it relies on that duplicates are grouped together. If this hypothesis is correct, what is needed is a call to QUICK_RANGE_SELECT::need_sorted_output().
[24 Mar 2010 22:13]
Øystein Grøvlen
Removing MRR tag since it is not a problem with MRR implementation, but how MRR is used.
[25 Mar 2010 8:21]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/104283 3853 oystein.grovlen@sun.com 2010-03-25 Bug#49845 Loosescan reports different result than other semijoin methods Duplicate elimination for LooseScan assumes that table is scanned in key order. However, MRR will by default return rows ordered by rowid (DiskSweep MRR). Fix: Tell range select to deliver a sorted output. @ mysql-test/r/subselect3.result Updated result file. @ mysql-test/r/subselect3_jcl6.result Updated result file. @ mysql-test/t/subselect3.test A test case for this bug already existed, but it was only explained, not executed. Add execution of test case. @ sql/sql_select.cc For LooseScan, make sure output from index range scans are sorted by key.
[16 Apr 2010 6:57]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/105803 3873 oystein.grovlen@sun.com 2010-04-16 Bug#49845 Loosescan reports different result than other semijoin methods Duplicate elimination for LooseScan assumes that table is scanned in key order. However, MRR will by default return rows ordered by rowid (DiskSweep MRR). Fix: Tell range select to deliver a sorted output. @ mysql-test/r/subselect3.result Updated result file. @ mysql-test/r/subselect3_jcl6.result Updated result file. @ mysql-test/t/subselect3.test A test case for this bug already existed, but it was only explained, not executed. Add execution of test case. @ sql/sql_select.cc For LooseScan, make sure output from index range scans are sorted by key.
[21 Apr 2010 12:52]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/106254 3830 oystein.grovlen@sun.com 2010-04-21 Bug#49845 Loosescan reports different result than other semijoin methods Duplicate elimination for LooseScan assumes that table is scanned in key order. However, MRR will by default return rows ordered by rowid (DiskSweep MRR). Fix: Tell range select to deliver a sorted output. @ mysql-test/r/subselect3.result Updated result file. @ mysql-test/r/subselect3_jcl6.result Updated result file. @ mysql-test/t/subselect3.test A test case for this bug already existed, but it was only explained, not executed. Add execution of test case. @ sql/sql_select.cc For LooseScan, make sure output from index range scans are sorted by key.
[21 Apr 2010 13:08]
Øystein Grøvlen
Pushed to mysql-6.0-codebase bugfixing with revision-id oystein.grovlen@sun.com-20100421125153-h0c2sh0ju3eanlnf
[27 Apr 2010 9:46]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100427094135-5s49ecp3ckson6e2) (version source revid:alik@sun.com-20100427093843-uekr85qkd7orx12t) (merge vers: 6.0.14-alpha) (pib:16)
[11 May 2010 16:57]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/108019 3166 oystein.grovlen@sun.com 2010-05-11 Bug#49845 Loosescan reports different result than other semijoin methods (Backporting of oystein.grovlen@sun.com-20100421125153-h0c2sh0ju3eanlnf) Duplicate elimination for LooseScan assumes that table is scanned in key order. However, MRR will by default return rows ordered by rowid (DiskSweep MRR). Fix: Tell range select to deliver a sorted output. @ mysql-test/r/subselect3.result Updated result file. @ mysql-test/r/subselect3_jcl6.result Updated result file. @ mysql-test/t/subselect3.test A test case for this bug already existed, but it was only explained, not executed. Add execution of test case. @ sql/sql_select.cc For LooseScan, make sure output from index range scans are sorted by key.
[12 May 2010 1:34]
Paul DuBois
Noted in 6.0.14 changelog. The LooseScan semijoin strategy could return incorrect results with Multi-Range Read (MRR) enabled because MRR did not return rows in key order as required for LooseScan.
[16 Aug 2010 6:38]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:18]
Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[22 Nov 2010 0:34]
Paul DuBois
Noted in 5.6.1 changelog.
[23 Nov 2010 2:17]
Paul DuBois
Correction: No 5.6.1 changelog entry. Bug does not appear in any released 5.6.x version.