Bug #43134 select with distinct or group by yields incorrect records
Submitted: 24 Feb 2009 10:10 Modified: 11 Nov 2009 9:14
Reporter: Mike Jenkins Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: Partitions Severity:S2 (Serious)
Version:5.1.24 and 5.1.31 OS:Linux (Centos 4)
Assigned to: CPU Architecture:Any

[24 Feb 2009 10:10] Mike Jenkins
Description:
Table has some 130m entries across 9 partitions hashed on year(rundate), see structure below.

Record content....Rundate ranges from 20030203 and increments by 1 for each South African working day to 20090220.

For each RunDate there are some 250 Filename entries and for each Filename entry there are 400 records with Date ranging from 400 working days ago to RunDate.

See "How to Repeat" for samples with output.

When selecting with order by, no problem.

When selecting distinct RunDate, incorrect RunDate extracted.

When selecting with group by RunDate, incorrect RunDate extracted.

Info below self-explanatory ...

CREATE TABLE `TTprices` (
  `FileName` varchar(9) NOT NULL DEFAULT '',
  `RunDate` varchar(8) NOT NULL DEFAULT '',
  `Date` varchar(8) NOT NULL DEFAULT '',
  `ClPr` int(11) unsigned NOT NULL DEFAULT '0',
  `HiPr` int(11) unsigned NOT NULL DEFAULT '0',
  `LoPr` int(11) unsigned NOT NULL DEFAULT '0',
  `OpPr` int(11) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`FileName`,`RunDate`,`Date`),
  KEY `Rundate_key` (`RunDate`,`FileName`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
/*!50100 PARTITION BY HASH (year(RunDate))
PARTITIONS 9 */; 

mysql> select count(*) from TTprices;
+-----------+
| count(*)  |
+-----------+
| 130540853 |
+-----------+
1 row in set (0.00 sec)

+----------+----------+----------+-------+-------+-------+-------+
| FileName | RunDate  | Date     | ClPr  | HiPr  | LoPr  | OpPr  |
+----------+----------+----------+-------+-------+-------+-------+
| ABIL     | 20070102 | 20050530 |  1660 |  1685 |  1655 |  1685 |
| ABIL     | 20070102 | 20050531 |  1650 |  1670 |  1640 |  1670 |
| ABIL     | 20070102 | 20050601 |  1650 |  1668 |  1635 |  1635 |
| ABIL     | 20070102 | 20050602 |  1730 |  1730 |  1645 |  1665 |
| ABIL     | 20070102 | 20050603 |  1739 |  1739 |  1685 |  1730 |
| ABIL     | 20070102 | 20050606 |  1780 |  1790 |  1720 |  1720 |
| ABIL     | 20070102 | 20050607 |  1799 |  1800 |  1770 |  1798 |
| ABIL     | 20070102 | 20050608 |  1800 |  1821 |  1792 |  1799 |
| ABIL     | 20070102 | 20050609 |  1840 |  1860 |  1799 |  1800 |
| ABIL     | 20070102 | 20050610 |  1790 |  1813 |  1775 |  1813 |
| ABIL     | 20070102 | 20050613 |  1820 |  1820 |  1790 |  1800 |
+----------+----------+----------+-------+-------+-------+-------+

[root@honeypot opt]# ls -l TTprices*
-rw-rw----  1 mysql mysql       8754 Feb 16 20:28 TTprices.frm
-rw-rw----  1 mysql mysql         56 Feb 16 20:28 TTprices.par
-rw-rw----  1 mysql mysql 1074014400 Feb 22 08:12 TTprices#P#p0.MYD
-rw-rw----  1 mysql mysql  482980864 Feb 24 09:33 TTprices#P#p0.MYI
-rw-rw----  1 mysql mysql 1100946332 Feb 23 13:34 TTprices#P#p1.MYD
-rw-rw----  1 mysql mysql  496284672 Feb 24 09:33 TTprices#P#p1.MYI
-rw-rw----  1 mysql mysql  162638400 Feb 23 18:07 TTprices#P#p2.MYD
-rw-rw----  1 mysql mysql   72889344 Feb 24 09:33 TTprices#P#p2.MYI
-rw-rw----  1 mysql mysql          0 Feb 16 20:28 TTprices#P#p3.MYD
-rw-rw----  1 mysql mysql       1024 Feb 24 09:33 TTprices#P#p3.MYI
-rw-rw----  1 mysql mysql          0 Feb 16 20:28 TTprices#P#p4.MYD
-rw-rw----  1 mysql mysql       1024 Feb 24 09:33 TTprices#P#p4.MYI
-rw-rw----  1 mysql mysql  910136000 Feb 17 16:31 TTprices#P#p5.MYD
-rw-rw----  1 mysql mysql  417310720 Feb 24 09:33 TTprices#P#p5.MYI
-rw-rw----  1 mysql mysql  948072000 Feb 19 06:51 TTprices#P#p6.MYD
-rw-rw----  1 mysql mysql  433101824 Feb 24 09:33 TTprices#P#p6.MYI
-rw-rw----  1 mysql mysql  921056000 Feb 20 02:32 TTprices#P#p7.MYD
-rw-rw----  1 mysql mysql  418001920 Feb 24 09:33 TTprices#P#p7.MYI
-rw-rw----  1 mysql mysql 1012289600 Feb 21 07:01 TTprices#P#p8.MYD
-rw-rw----  1 mysql mysql  458118144 Feb 24 09:33 TTprices#P#p8.MYI

How to repeat:
mysql> select RunDate from TTprices where RunDate>='20040204' group by RunDate limit 5;
+----------+
| RunDate  |
+----------+
| 20070102 |
| 20070103 |<<<<<<<<<< ALL WRONG - group by
| 20070104 |
| 20070105 |
| 20070108 |
+----------+
5 rows in set (0.02 sec)

mysql> select RunDate from TTprices where RunDate>='20040204' order by RunDate limit 5;
+----------+
| RunDate  |
+----------+
| 20040204 |
| 20040204 |<<<<<<<<<< ALL RIGHT - order by
| 20040204 |
| 20040204 |
| 20040204 |
+----------+
5 rows in set (0.00 sec)

mysql> select distinct RunDate from TTprices where RunDate>='20040204' order by RunDate limit 5;
+----------+
| RunDate  |
+----------+
| 20070102 |
| 20070103 |<<<<<<<<<< ALL WRONG - distinct
| 20070104 |
| 20070105 |
| 20070108 |
+----------+
5 rows in set (0.00 sec)

mysql> select RunDate from TTprices where RunDate>='20070204' group by RunDate limit 5;
+----------+
| RunDate  |
+----------+
| 20070205 |
| 20070206 |<<<<<<<<<< ALL RIGHT - group by
| 20070207 |
| 20070208 |
| 20070209 |
+----------+
5 rows in set (0.07 sec)

mysql> select RunDate from TTprices where RunDate>='20070204' order by RunDate limit 5;
+----------+
| RunDate  |
+----------+
| 20070205 |
| 20070205 |<<<<<<<<<< ALL RIGHT - order by
| 20070205 |
| 20070205 |
| 20070205 |
+----------+
5 rows in set (0.00 sec)

mysql> select distinct RunDate from TTprices where RunDate>='20070204' order by RunDate limit 5;
+----------+
| RunDate  |
+----------+
| 20070205 |
| 20070206 |<<<<<<<<<< ALL RIGHT - distinct
| 20070207 |
| 20070208 |
| 20070209 |
+----------+
5 rows in set (0.00 sec)
[24 Feb 2009 10:46] Sveta Smirnova
Thank you for the report.

You wrote:

mysql> select RunDate from TTprices where RunDate>='20040204' group by RunDate limit 5;
+----------+
| RunDate  |
+----------+
| 20070102 |
| 20070103 |<<<<<<<<<< ALL WRONG - group by
| 20070104 |
| 20070105 |
| 20070108 |
+----------+
5 rows in set (0.02 sec)

Returns 5 different RunDate each of them >='20040204' What is wrong here?

You wrote:

mysql> select distinct RunDate from TTprices where RunDate>='20040204' order by RunDate
limit 5;
+----------+
| RunDate  |
+----------+
| 20070102 |
| 20070103 |<<<<<<<<<< ALL WRONG - distinct
| 20070104 |
| 20070105 |
| 20070108 |
+----------+
5 rows in set (0.00 sec)

Returns 5 different RunDate each of them >='20040204' What is wrong here?
[24 Feb 2009 11:37] Mike Jenkins
Hi,

It should return data for 20040204 onwards (see below) but it is actually returning data for 2007 onwards and from the 1st record in 2007.

+----------+
| RunDate  |
+----------+
| 20040204 |
| 20040205 |
| 20040206 |
| 20040209 |
| 20040210 |
+----------+

Mike
[24 Feb 2009 17:12] Sveta Smirnova
Thank you for the feedback.

Please provide output of EXPLIAN PARTITIONS for SELECT statements provided.
[24 Feb 2009 19:40] Mike Jenkins
Contains EXPLAIN PARTITION of each of the 6 select statements.

Attachment: explainpart.txt (text/plain), 5.47 KiB.

[25 Feb 2009 14:30] Mike Jenkins
Have overcome problem by renaming table and creating a new one and then doing an...insert into TTprices select * from TTpricesOld where RunDate<'20040101'; so I only get 2003 records.  I then do all batch processing and truncate TTprices and then repeat for 2004 etc.

Workaround fine for me but when I need this online ...
[7 Sep 2009 6:21] Susanne Ebrecht
Sorry but I am confused about the version numbers.

On which version do you have this problem?
[7 Oct 2009 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[11 Oct 2009 8:42] Sveta Smirnova
Version numbers corrected.
[11 Oct 2009 9:14] Valeriy Kravchuk
Please, try to repeat with a newer version, 5.1.39, and inform about the results.
[12 Nov 2009 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".