Bug #30573 Ordered range scan over partitioned tables returns some rows twice
Submitted: 22 Aug 2007 17:49 Modified: 14 Mar 2008 9:43
Reporter: Ben Li Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Partitions Severity:S2 (Serious)
Version:5.1.22-beta-log Source distribution OS:Linux (maybe other os)
Assigned to: Sergey Petrunya CPU Architecture:Any
Tags: unscheduled

[22 Aug 2007 17:49] Ben Li
Description:
A little complicated to descripted it,see below
The two sqls should got same result for defid = 52, but its not true in newest version

Server version: 5.1.22-beta-log Source distribution

select sum(count) from ca_hitsdata ch where ch.defid in(52) 
and ch.counttype = 0 and ch.day between 20070320 and 20070401;
|      86854 |

select sum(count) from ca_hitsdata ch where ch.defid in(50,52) and ch.counttype = 0 and ch.day between 20070320 and 20070401 group by defid;
|      87220 |    52 |

CREATE TABLE `ca_hitsdata` (
  `defid` int(10) unsigned NOT NULL,
  `count` int(10) unsigned NOT NULL,
  `counttype` tinyint(4) NOT NULL DEFAULT '0',
  `day` int(10) unsigned NOT NULL,
  `hour` tinyint(3) unsigned NOT NULL,
  `domid` smallint(5) unsigned NOT NULL,
  PRIMARY KEY (`defid`,`day`,`counttype`,`hour`),
  KEY `i_dom_day` (`domid`,`day`)
) ENGINE=MyISAM DEFAULT CHARSET=gbk /*!50100 PARTITION BY RANGE (day) (PARTITION p0 VALUES LESS THAN (20060901) ENGINE = MyISAM, PARTITION p1 VALUES LESS THAN (20061001) ENGINE = MyISAM, PARTITION p2 VALUES LESS THAN (20061101) ENGINE = MyISAM, PARTITION p3 VALUES LESS THAN (20061201) ENGINE = MyISAM, PARTITION p4 VALUES LESS THAN (20070101) ENGINE = MyISAM, PARTITION p5 VALUES LESS THAN (20070201) ENGINE = MyISAM, PARTITION p6 VALUES LESS THAN (20070301) ENGINE = MyISAM, PARTITION p7 VALUES LESS THAN (20070401) ENGINE = MyISAM, PARTITION p8 VALUES LESS THAN (20070501) ENGINE = MyISAM, PARTITION p9 VALUES LESS THAN (20070601) ENGINE = MyISAM, PARTITION p10 VALUES LESS THAN (20070701) ENGINE = MyISAM, PARTITION p12 VALUES LESS THAN (20070801) ENGINE = MyISAM, PARTITION p13 VALUES LESS THAN (20070901) ENGINE = MyISAM, PARTITION p14 VALUES LESS THAN (20071001) ENGINE = MyISAM, PARTITION p15 VALUES LESS THAN (20071101) ENGINE = MyISAM, PARTITION p16 VALUES LESS THAN (20071201) ENGINE = MyISAM, PARTITION p17 VALUES LESS THAN (20080101) ENGINE = MyISAM, PARTITION p18 VALUES LESS THAN (20080201) ENGINE = MyISAM, PARTITION p19 VALUES LESS THAN (20080301) ENGINE = MyISAM, PARTITION p20 VALUES LESS THAN (20080401) ENGINE = MyISAM, PARTITION p21 VALUES LESS THAN (20080501) ENGINE = MyISAM, PARTITION p22 VALUES LESS THAN (20080601) ENGINE = MyISAM, PARTITION p23 VALUES LESS THAN (20080701) ENGINE = MyISAM, PARTITION p24 VALUES LESS THAN MAXVALUE ENGINE = MyISAM) */ |

How to repeat:
I'm sorry, it's need to download a file(size is 2M) to repeat this bug, but I've try my best to reduce the datasize. with fewer data, even with fewer PARTITIONs, this bug is dispear.

wget http://210.51.190.238/web/testbug.sql.bz2
and 
bzcat http://210.51.190.238/web/testbug.sql.bz2 | mysql test
; if you trust me, or you can look at testbug.sql.bz2, I've just do a 
mysqldump test ca_hitsdata
[22 Aug 2007 17:55] Ben Li
I've also upload a datafile to ftp, filename is bug-data-30573.zip.
[22 Aug 2007 18:15] MySQL Verification Team
Thank you for the bug report.

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.22-beta-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> source /home/miguel/dbs/5.1/testbug.sql
Query OK, 0 rows affected (0.40 sec)

<cut>

Query OK, 0 rows affected (0.00 sec)

mysql> select sum(count) from ca_hitsdata ch where ch.defid in(52) 
    -> and ch.counttype = 0 and ch.day between 20070320 and 20070401;
+------------+
| sum(count) |
+------------+
|      86854 | 
+------------+
1 row in set (0.46 sec)

mysql> select sum(count) from ca_hitsdata ch where ch.defid in(50,52) and ch.counttype = 0 and
    -> ch.day between 20070320 and 20070401 group by defid;
+------------+
| sum(count) |
+------------+
|      87220 | 
+------------+
1 row in set (0.00 sec)

mysql> create table ca_hitsdata2 as select * from ca_hitsdata;
Query OK, 728745 rows affected (3.05 sec)
Records: 728745  Duplicates: 0  Warnings: 0

mysql> select sum(count) from ca_hitsdata2 ch where ch.defid in(50,52) and ch.counttype = 0 and ch.day between 20070320 and 20070401 group by defid;
+------------+
| sum(count) |
+------------+
|      86854 | 
+------------+
1 row in set (0.76 sec)

mysql> select sum(count) from ca_hitsdata2 ch where ch.defid in(52)  and ch.counttype = 0 and ch.day between 20070320 and 20070401;
+------------+
| sum(count) |
+------------+
|      86854 | 
+------------+
1 row in set (0.72 sec)
[20 Nov 2007 1:55] Sergey Petrunya
The bug actually is not related to GROUP BY. It can be observed for every QEP which does an ordered range scan (i.e. scan that produces records in key order) over a partitioned table.
[20 Nov 2007 2:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/38107

ChangeSet@1.2633, 2007-11-20 05:02:49+03:00, sergefp@mysql.com +3 -0
  BUG#30573: Ordered range scan over partitioned tables returns some rows twice
  The problem: ha_partition::read_range_first() could return a record that is 
  outside of the scanned range. If that record happened to be in the next 
  subsequent range, it would satisfy the WHERE and appear in the output twice.
  (we would get it the second time when scanning the next subsequent range)
  
  Fix: 
  Made ha_partition::read_range_first() check if the returned recod is within
  the scanned range, like other read_range_first() implementations do.
[20 Nov 2007 12:04] Gleb Shchepa
Ok to push.
[13 Dec 2007 16:10] Sergei Glukhov
This fix causes 'partition_flacon' test failure:

 # No result if bug exists
 SELECT * FROM t1 WHERE c1 LIKE 'A%';
 c1
-A
 # But this was not found (if bug exists)
 SELECT * FROM t1 WHERE c1 = 'A';
 c1

see also Bug#30480 Falcon: searches fail if LIKE and key partition
[13 Dec 2007 17:52] Sergei Glukhov
The fix is rmoved because of failures in 'parts' test suite.
[14 Dec 2007 8:17] Bugs System
Pushed into 5.1.23-rc
[14 Dec 2007 8:20] Bugs System
Pushed into 6.0.5-alpha
[22 Dec 2007 8:12] Sergey Petrunya
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40373

ChangeSet@1.2766, 2007-12-22 10:59:24+03:00, sergefp@mysql.com +4 -0
  BUG#33257 "range access on partitioned falcon table actually scans entire indexes"
  - Make ha_partition to pass read_range_XXX() calls to partition handlers, without
converting 
    them to index_read/index_next-family calls. 
  
    This is needed for Falcon (which is terribly inefficient when it doesn't get the other

     endpoint) and possibly other storage engines with similar characteristics.
[25 Dec 2007 11:41] Ingo Strüwing
Ok to push by me.
[8 Feb 2008 13:57] Susanne Ebrecht
Bug #33555 is a duplicate of this bug here.
[13 Mar 2008 19:27] Bugs System
Pushed into 6.0.5-alpha
[14 Mar 2008 9:43] Jon Stephens
Documented in the 6.0.5 changelog as follows:

        When the range access method was used on a
        partitioned Falcon table, the entire index was
        scanned. For partitioned tables using other storage engines, the same
        issue caused an ordered range scan to return some rows twice.
[14 Mar 2008 9:44] Jon Stephens
(s/the same issue/a related issue/ in changelog entry)
[18 Sep 2008 12:18] Mattias Jonsson
This bug still exists in 5.1 (at least in mysql-5.1-bugteam)
[18 Sep 2008 14:00] Sergey Petrunya
Mattias, which testcase did you run? The original testcase isn't available anymore (I suppose that's not the testcase from the patch because that uses Falcon while you're speaking about 5.1)

Please post the instructions to re-create the failure.
[18 Sep 2008 14:57] Mattias Jonsson
Is this bug a duplicate of bug#33257 ? (the last proposed patch indicates that)

I used the original test case and the data in the bug-data-30573.zip (from ftp upload)

(which affects partitioned myisam, and exists in 5.1, I have tested with a backport of your patch, http://lists.mysql.com/commits/40373 and it fixes the bug. I will probably commit and push it as bug#33555)
[19 Sep 2008 4:52] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/54328

2688 Mattias Jonsson	2008-09-18
      Bug#30573: Ordered range scan over partitioned tables returns some rows twice
      and
      Bug#33555: Group By Query does not correctly aggregate partitions
      
      Backport of bug-33257 which is the same bug.
      
      read_range_*() calls was not passed to the partition handlers,
      but was translated to index_read/next family calls.
      Resulting in duplicates rows and wrong aggregations.
[9 Oct 2008 18:02] Bugs System
Pushed into 5.1.30  (revid:mattias.jonsson@sun.com-20080918194934-o50s3e3ohrzifira) (version source revid:kgeorge@mysql.com-20081001094725-vf4mqjkmajlm22qy) (pib:4)
[17 Oct 2008 16:45] Bugs System
Pushed into 6.0.8-alpha  (revid:mattias.jonsson@sun.com-20080918194934-o50s3e3ohrzifira) (version source revid:kgeorge@mysql.com-20081001100520-exs1tupnfanm1mij) (pib:5)
[28 Oct 2008 21:05] Bugs System
Pushed into 5.1.29-ndb-6.2.17  (revid:mattias.jonsson@sun.com-20080918194934-o50s3e3ohrzifira) (version source revid:tomas.ulin@sun.com-20081028140209-u4emkk1xphi5tkfb) (pib:5)
[28 Oct 2008 22:23] Bugs System
Pushed into 5.1.29-ndb-6.3.19  (revid:mattias.jonsson@sun.com-20080918194934-o50s3e3ohrzifira) (version source revid:tomas.ulin@sun.com-20081028194045-0353yg8cvd2c7dd1) (pib:5)
[1 Nov 2008 9:49] Bugs System
Pushed into 5.1.29-ndb-6.4.0  (revid:mattias.jonsson@sun.com-20080918194934-o50s3e3ohrzifira) (version source revid:jonas@mysql.com-20081101082305-qx5a1bj0z7i8ueys) (pib:5)