Bug #52092 Bad result in 6.0-codebase after setting correct block_size in bug#48229
Submitted: 16 Mar 2010 13:10 Modified: 22 Nov 2010 1:20
Reporter: Mattias Jonsson Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S1 (Critical)
Version:mysql-6.0-codebase OS:Any
Assigned to: Øystein Grøvlen CPU Architecture:Any
Tags: LooseScan, optimizer_switch, semijoin, subquery

[16 Mar 2010 13:10] Mattias Jonsson
Description:
When merging bug#48229 into a mysql-6.0-codebase tree main.partition will fail with a bad result.

I will remove the fix for bug#48229 when merging to mysql-pe and rely on this bug to get it fixed in 6.0-codebase (since it works in mysql-5.1 and mysql-trunk).

How to repeat:
patch sql/ha_partition.cc:
=== modified file 'sql/ha_partition.cc'
--- sql/ha_partition.cc	2010-03-03 15:09:11 +0000
+++ sql/ha_partition.cc	2010-03-16 08:47:41 +0000
@@ -5260,6 +5260,7 @@
     file= m_file[handler_instance];
     file->info(HA_STATUS_CONST);
     stats.create_time= file->stats.create_time;
+    stats.block_size= file->stats.block_size;
     ref_length= m_ref_length;
   }
   if (flag & HA_STATUS_ERRKEY)

and run this test:
--source include/have_partition.inc

--disable_warnings
drop table if exists t1, t2, t4;
--enable_warnings

CREATE TABLE t2 (
  taken datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  id int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (id,taken),
  KEY taken (taken)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO t2 VALUES 
('2006-09-27 21:50:01',16421),
('2006-10-02 21:50:01',16421),
('2006-09-27 21:50:01',19092),
('2006-09-28 21:50:01',19092),
('2006-09-29 21:50:01',19092),
('2006-09-30 21:50:01',19092),
('2006-10-01 21:50:01',19092),
('2006-10-02 21:50:01',19092),
('2006-09-27 21:50:01',22589),
('2006-09-29 21:50:01',22589);

CREATE TABLE t1 (
  id int(8) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO t1 VALUES 
(16421),
(19092),
(22589);

CREATE TABLE t4 (
  taken datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  id int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (id,taken),
  KEY taken (taken)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 
PARTITION BY RANGE (to_days(taken)) 
(
PARTITION p01 VALUES LESS THAN (732920) , 
PARTITION p02 VALUES LESS THAN (732950) , 
PARTITION p03 VALUES LESS THAN MAXVALUE ) ;

INSERT INTO t4 select * from t2;

set @f_date='2006-09-28';
set @t_date='2006-10-02';

--echo # This will produce wrong result!!!
SELECT t1.id AS MyISAM_part
FROM t1
WHERE t1.id IN (
    SELECT distinct id
    FROM t4
    WHERE taken BETWEEN @f_date AND date_add(@t_date, INTERVAL 1 DAY))
ORDER BY t1.id
;
EXPLAIN SELECT t1.id AS MyISAM_part
FROM t1
WHERE t1.id IN (
    SELECT distinct id
    FROM t4
    WHERE taken BETWEEN @f_date AND date_add(@t_date, INTERVAL 1 DAY))
ORDER BY t1.id
;
SELECT distinct id FROM t4 WHERE taken BETWEEN @f_date AND date_add(@t_date, INTERVAL 1 DAY);
--echo # This is the correct (expected) result.
SELECT t1.id AS MyISAM_part FROM t1 WHERE t1.id IN (16421, 19092, 22589)
ORDER BY t1.id;

drop table t1, t2, t4;

Suggested fix:
Since it changes the index_only_read_time cost I guess the bug is in the optimizer.
[16 Mar 2010 17:36] Valeriy Kravchuk
Thank you for the bug report. Verified just as described on Mac OS X.
[24 Mar 2010 8:18] Valeriy Kravchuk
For those wondering what kind of wrong results we get with that patch, this is what mtr shows:

openxs@ubuntu:/home2/openxs/bzr/mysql-6.0-codebase/mysql-test$ ./mtr partition
Logging: ./mtr  partition
100324 10:16:08 [Note] Buffered information: Performance schema disabled (reason: start parameters).

100324 10:16:09 [Note] Plugin 'FEDERATED' is disabled.
100324 10:16:09 [Note] Plugin 'ndbcluster' is disabled.
MySQL Version 6.0.14
Checking supported features...
 - using ndbcluster when necessary, mysqld supports it
 - SSL connections supported
 - binaries are debug compiled
Collecting tests...
vardir: /home2/openxs/bzr/mysql-6.0-codebase/mysql-test/var
Removing old var directory...
Creating var directory '/home2/openxs/bzr/mysql-6.0-codebase/mysql-test/var'...
Installing system database...
Using server port 43156
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009

==============================================================================

TEST                                      RESULT   TIME (ms)
------------------------------------------------------------

main.partition                           [ fail ]
        Test ended at 2010-03-24 10:16:19

CURRENT_TEST: main.partition
--- /home2/openxs/bzr/mysql-6.0-codebase/mysql-test/r/partition.result	2010-02-21 23:32:01.000000000 +0300
+++ /home2/openxs/bzr/mysql-6.0-codebase/mysql-test/r/partition.reject	2010-03-24 11:16:19.000000000 +0300
@@ -1695,7 +1695,7 @@
 MyISAM_part
 16421
 19092
-22589
+19092
 drop table t1, t2, t4;
 CREATE TABLE t1 (
 taken datetime NOT NULL DEFAULT '0000-00-00 00:00:00',

mysqltest: Result content mismatch
[24 Mar 2010 9:38] Valeriy Kravchuk
Now, let me show it step by step:

openxs@ubuntu:/home2/openxs/dbs/6.0-codebase$ bin/mysql --no-defaults -uroot test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 6.0.14-alpha-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> drop table if exists t1, t2, t4;
Query OK, 0 rows affected, 3 warnings (0.00 sec)

mysql> CREATE TABLE t2 (
    ->   taken datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
    ->   id int(11) NOT NULL DEFAULT '0',
    ->   PRIMARY KEY (id,taken),
    ->   KEY taken (taken)
    -> ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.03 sec)

mysql> 
mysql> INSERT INTO t2 VALUES 
    -> ('2006-09-27 21:50:01',16421),
    -> ('2006-10-02 21:50:01',16421),
    -> ('2006-09-27 21:50:01',19092),
    -> ('2006-09-28 21:50:01',19092),
    -> ('2006-09-29 21:50:01',19092),
    -> ('2006-09-30 21:50:01',19092),
    -> ('2006-10-01 21:50:01',19092),
    -> ('2006-10-02 21:50:01',19092),
    -> ('2006-09-27 21:50:01',22589),
    -> ('2006-09-29 21:50:01',22589);
Query OK, 10 rows affected (0.00 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> 
mysql> CREATE TABLE t1 (
    ->   id int(8) NOT NULL,
    ->   PRIMARY KEY (id)
    -> ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.00 sec)

mysql> 
mysql> INSERT INTO t1 VALUES 
    -> (16421),
    -> (19092),
    -> (22589);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> 
mysql> CREATE TABLE t4 (
    ->   taken datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
    ->   id int(11) NOT NULL DEFAULT '0',
    ->   PRIMARY KEY (id,taken),
    ->   KEY taken (taken)
    -> ) ENGINE=MyISAM DEFAULT CHARSET=latin1 
    -> PARTITION BY RANGE (to_days(taken)) 
    -> (
    -> PARTITION p01 VALUES LESS THAN (732920) , 
    -> PARTITION p02 VALUES LESS THAN (732950) , 
    -> PARTITION p03 VALUES LESS THAN MAXVALUE ) ;
Query OK, 0 rows affected (0.00 sec)

mysql> 
mysql> INSERT INTO t4 select * from t2;
Query OK, 10 rows affected (0.00 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> set @f_date='2006-09-28';
Query OK, 0 rows affected (0.00 sec)

mysql> set @t_date='2006-10-02';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT t1.id AS MyISAM_part
    -> FROM t1
    -> WHERE t1.id IN (
    ->     SELECT distinct id
    ->     FROM t4
    ->     WHERE taken BETWEEN @f_date AND date_add(@t_date, INTERVAL 1 DAY))
    -> ORDER BY t1.id
    -> ;
+-------------+
| MyISAM_part |
+-------------+
|       16421 |
|       19092 |
|       19092 |
+-------------+
3 rows in set (0.00 sec)

The results above (third row with value 19092 again instead of 22589) is wrong and represents regression that is introduced by the patch. This is how optimizer processes the query:

mysql> EXPLAIN SELECT t1.id AS MyISAM_part
    -> FROM t1
    -> WHERE t1.id IN (
    ->     SELECT distinct id
    ->     FROM t4
    ->     WHERE taken BETWEEN @f_date AND date_add(@t_date, INTERVAL 1 DAY))
    -> ORDER BY t1.id
    -> ;
+----+-------------+-------+--------+---------------+---------+---------+------------+------+----------------------------------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref        | rows | Extra                                                                |
+----+-------------+-------+--------+---------------+---------+---------+------------+------+----------------------------------------------------------------------+
|  1 | PRIMARY     | t4    | index  | PRIMARY       | PRIMARY | 12      | NULL       |   10 | Using where; Using index; Using temporary; Using filesort; LooseScan |
|  1 | PRIMARY     | t1    | eq_ref | PRIMARY       | PRIMARY | 4       | test.t4.id |    1 | Using index                                                          |
+----+-------------+-------+--------+---------------+---------+---------+------------+------+----------------------------------------------------------------------+
2 rows in set (0.00 sec)

Now, let's prove that result above is wrong (that is, it is not a wrong test only):

mysql> SELECT distinct id FROM t4 WHERE taken BETWEEN @f_date AND date_add(@t_date, INTERVAL 1
    -> DAY);
+-------+
| id    |
+-------+
| 16421 |
| 19092 |
| 22589 |
+-------+
3 rows in set (0.00 sec)

mysql> SELECT t1.id AS MyISAM_part FROM t1 WHERE t1.id IN (16421, 19092, 22589)
    -> ORDER BY t1.id;
+-------------+
| MyISAM_part |
+-------------+
|       16421 |
|       19092 |
|       22589 |
+-------------+
3 rows in set (0.01 sec)

This is what we can get from the problematic query with 5.1.43 or any mysql-6.0-codebase without patch.
[25 Mar 2010 8:03] Øystein Grøvlen
The patch results in a change in plan.  Before the change, FirstMatch strategy is used:

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	PRIMARY	t1	index	PRIMARY	PRIMARY	4	NULL	3	Using index
1	PRIMARY	t4	ref	PRIMARY,taken	PRIMARY	4	test.t1.id	1	Using where; Using index; FirstMatch(t1)

With the change, LooseScan strategy is used:
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	PRIMARY	t4	index	PRIMARY,taken	PRIMARY	12	NULL	10	Using where; Using index; Using temporary; Using filesort; LooseScan
1	PRIMARY	t1	eq_ref	PRIMARY	PRIMARY	4	test.t4.id	1	Using index
[25 Mar 2010 8:32] Øystein Grøvlen
If LooseScan strategy is turned off, the query will give correct result.
[26 Mar 2010 8:29] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100326081944-qja07qklw1p2w7jb) (version source revid:alik@sun.com-20100325073410-4t4i9gu2u1pge7xb) (merge vers: 6.0.14-alpha) (pib:16)
[22 Apr 2010 7:01] Øystein Grøvlen
The reason LooseScan fails, is that the partition engine does not deliver the rows in sorted index order.  Either, it must be possible to tell the partition engine maintain sorted order, or this must be detected so that LooseScan can not be used in this scenario.
[22 Apr 2010 12:40] Øystein Grøvlen
There are two issues present here:

1. Since a temporary table will be used, the requirement for ordered index access
   is turned off.  (Since sorting will have to be done later, anyway).  However,
   LooseScan strategy needs ordered index access in order to work correctly. 
   Hence, sorted requirement needs to be kept when LooseScan is to be used.

2. There is a bug in sub_select() so that loosescan_match_tab->found_match is not
   reset when moving to the next key.

The following changes, makes our test case return a correct result (However, this gives some plan changes for partition tests that needs to be further inspected.):

=== modified file 'sql/sql_select.cc'
--- sql/sql_select.cc	2010-04-21 12:51:53 +0000
+++ sql/sql_select.cc	2010-04-22 12:37:03 +0000
@@ -2621,7 +2621,10 @@ JOIN::exec()
     thd_proc_info(thd, "Copying to tmp table");
     DBUG_PRINT("info", ("%s", thd->proc_info));
     if (!curr_join->sort_and_group &&
-        curr_join->const_tables != curr_join->tables)
+        curr_join->const_tables != curr_join->tables && 
+        curr_join->best_positions[curr_join->const_tables].sj_strategy 
+          != SJ_OPT_LOOSE_SCAN
+        )
       curr_join->join_tab[curr_join->const_tables].sorted= 0;
     if ((tmp_error= do_select(curr_join, (List<Item> *) 0, curr_tmp_table, 0)))
     {
@@ -16747,6 +16750,9 @@ sub_select(JOIN *join,JOIN_TAB *join_tab
       continue;
     }
 
+    if (join_tab->loosescan_match_tab) 
+      join_tab->loosescan_match_tab->found_match= false;
+
     if (join_tab->keep_current_rowid)
       join_tab->table->file->position(join_tab->table->record[0]);
[22 Apr 2010 13:45] Øystein Grøvlen
Issue 2 above has been reported as a separate bug and will be handled in Bug#53060.
[18 May 2010 12:51] Guilhem Bichot
now this bug exists in next-mr-opt-backporting, we see the failure of partition.test when using the test from 6.0-codebase.
[18 May 2010 19:15] Guilhem Bichot
The fix for BUG#48229 (== the patch posted by Mattias when filing this bug report) also causes a change in the access method and "rows" columns of EXPLAIN when executing test partition_pruning.test. So this is happening now in next-mr-opt-backporting. When this fix is undone from next-mr-opt-backporting, one gets the same result as 6.0-codebase.
The change looks bad as it leads to an increased "rows" count (more rows examined => slower).
[18 May 2010 19:16] Guilhem Bichot
example: first diff for partition_pruning.test:
@@ -270,7 +270,7 @@
 8
 EXPLAIN PARTITIONS SELECT * FROM t1 WHERE a >= 7;
 id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	Extra
-1	SIMPLE	t1	max	range	PRIMARY	PRIMARY	4	NULL	2	Using where; Using index
+1	SIMPLE	t1	max	index	PRIMARY	PRIMARY	4	NULL	10	Using where; Using index
 SELECT * FROM t1 WHERE a > 1;
 a
 2
[2 Jun 2010 14:31] Øystein Grøvlen
Bug#53637 is a duplicate of this
[11 Jun 2010 9:56] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/110800

3191 oystein.grovlen@sun.com	2010-06-11
      Bug#52092 Bad result in 6.0-codebase after setting correct block_size ...
      
      Problem: LooseScan strategy fails because the partition engine does not
      deliver the rows in sorted index order. This happens because when a 
      a temporary table is used, the requirement for ordered index access
      is turned off.  (Since sorting will have to be done later, anyway).  
      However, LooseScan strategy needs ordered index access in order to work
      correctly. 
      
      Solution: Keep requirement for sorted access when LooseScan is to be used.
     @ mysql-test/r/partition.result
        Updated result file.
     @ mysql-test/t/partition.test
        With this fix, semijoin does not have to be turned off for this test case
        to give correct results.  The test case is the same as reported in the 
        bug report for this issue,
     @ sql/sql_select.cc
        Keep requirement for sorted access to base table when LooseScan strategy is to be used.
[17 Jun 2010 7:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/111356

3198 oystein.grovlen@sun.com	2010-06-17
      Bug#52092 Bad result in 6.0-codebase after setting correct block_size ...
      
      Problem: LooseScan strategy fails because the partition engine does not
      deliver the rows in sorted index order. This happens because when a 
      a temporary table is used, the requirement for ordered index access
      is turned off.  (Since sorting will have to be done later, anyway).  
      However, LooseScan strategy needs ordered index access in order to work
      correctly. 
      
      Solution: Keep requirement for sorted access when LooseScan is to be used.
     @ mysql-test/r/partition.result
        Updated result file.
     @ mysql-test/t/partition.test
        With this fix, semijoin does not have to be turned off for this test case
        to give correct results.  The test case is the same as reported in the 
        bug report for this issue,
     @ sql/sql_select.cc
        Keep requirement for sorted access to base table when LooseScan strategy is to be used.
[17 Jun 2010 7:06] Øystein Grøvlen
Patch pushed into mysql-next-mr-opt-backporting with revision-id oystein.grovlen@sun.com-20100617070120-ivl7prwzvcaicwip
[16 Aug 2010 6:35] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:05] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[22 Nov 2010 1:20] Paul DuBois
Bug is not in any released 5.6.x version. No changelog entry needed.