Bug #33555 Group By Query does not correctly aggregate partitions
Submitted: 28 Dec 2007 15:00 Modified: 10 Oct 2008 13:09
Reporter: Wayne Van Den Handel Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Partitions Severity:S2 (Serious)
Version:5.1.26RC OS:Linux (CENTOS5.2)
Assigned to: Mattias Jonsson CPU Architecture:Any
Tags: GROUP BY, partitioning, Sub Partitions

[28 Dec 2007 15:00] Wayne Van Den Handel
Description:
Group by queries against partitioned tables do not correctly aggregate data based upon the query syntax. Each partition is returned individually as a duplicate against the Group by syntax logic. The group by is against a PK column that is NOT part of partitioning.

This might be related or a duplicate of bug #30573, but since this is such a serious problem I wanted to be sure it was being resolved.

The only current workaround is to place all results into a temp table and then execute another group by query against the temp table in order to correctly aggregate the individual partitions.

How to repeat:
Table DDL:

CREATE TABLE f_num_daily(
	data_element_id int unsigned NOT NULL,
	point_id int unsigned NOT NULL,
	start_time_int int unsigned NOT NULL,
	end_time_int int unsigned NOT NULL,
	create_time timestamp NOT NULL default now(),
	create_transaction_id int unsigned NOT NULL default 0,
	value double NULL,
	value_tag char(3) NULL,
	primary key (data_element_id, point_id, start_time_int),
 	key f_num_daily_nk1 USING BTREE (create_transaction_id)
 ) ENGINE = InnoDB CHARACTER SET = 'utf8'
 PARTITION BY RANGE( start_time_int )
    SUBPARTITION BY HASH(point_id)
    SUBPARTITIONS 10 (
    PARTITION f_num_daily_min VALUES LESS THAN (1167627600),
    PARTITION f_num_daily_2007_01 VALUES LESS THAN (1170306000),
    PARTITION f_num_daily_2007_02 VALUES LESS THAN (1172725200),
    PARTITION f_num_daily_2007_03 VALUES LESS THAN (1175403600),
    PARTITION f_num_daily_2007_04 VALUES LESS THAN (1177995600),
    PARTITION f_num_daily_2007_05 VALUES LESS THAN (1180674000),
    PARTITION f_num_daily_2007_06 VALUES LESS THAN (1183266000),
    PARTITION f_num_daily_2007_07 VALUES LESS THAN (1185944400),
    PARTITION f_num_daily_2007_08 VALUES LESS THAN (1188622800),
    PARTITION f_num_daily_2007_09 VALUES LESS THAN (1191214800),
    PARTITION f_num_daily_2007_10 VALUES LESS THAN (1193893200),
    PARTITION f_num_daily_2007_11 VALUES LESS THAN (1196485200),
    PARTITION f_num_daily_2007_12 VALUES LESS THAN (1199163600),
    PARTITION f_num_daily_2008_01 VALUES LESS THAN (1201842000),
    PARTITION f_num_daily_2008_02 VALUES LESS THAN (1204347600),
    PARTITION f_num_daily_2008_03 VALUES LESS THAN (1207026000),
    PARTITION f_num_daily_2008_04 VALUES LESS THAN (1209618000),
    PARTITION f_num_daily_2008_05 VALUES LESS THAN (1212296400),
    PARTITION f_num_daily_2008_06 VALUES LESS THAN (1214888400),
    PARTITION f_num_daily_2008_07 VALUES LESS THAN (1217566800),
    PARTITION f_num_daily_2008_08 VALUES LESS THAN (1220245200),
    PARTITION f_num_daily_2008_09 VALUES LESS THAN (1222837200),
    PARTITION f_num_daily_2008_10 VALUES LESS THAN (1225515600),
    PARTITION f_num_daily_2008_11 VALUES LESS THAN (1228107600),
    PARTITION f_num_daily_2008_12 VALUES LESS THAN (1230786000),
    PARTITION f_num_daily_max VALUES LESS THAN MAXVALUE  );

Data:
Insert many records across the various main and sub partitions for create_transaction_id 82.

Query:

select	f.data_element_id,
	count(*)
from	f_num_daily f
where	create_transaction_id = 82
group by f.data_element_id

Suggested fix:
Correctly aggregate all partitions and sub partitions in order to correctly return data based upon the group by logic.
[4 Jan 2008 11:19] Susanne Ebrecht
Many thanks for writing a bug report.

I would agree with you, that these bug here is a duplicate of bug #30573.

Please try the patch from bug #30573 and let us now, if your issue still occurs with the patch.
[5 Feb 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[8 Feb 2008 13:58] Susanne Ebrecht
This is a duplicate of bug #30573
[21 Jul 2008 14:32] Wayne Van Den Handel
I finally was able to test this bug on the latest release, and the problem still occurs. As stated before, each partition is reported separately in the aggregated results.
[21 Jul 2008 14:45] Wayne Van Den Handel
As a correction to the previous posting, each sub-partition is reported separately in the group by query.
[5 Aug 2008 13:01] Susanne Ebrecht
The patch for bug #30573 is only in MySQL 6.0 not in MySQL 5.1.

Please try MySQL 6.0 or take the patch from bug #30573 for patching it into your 5.1 version.
[16 Sep 2008 13:28] Wayne Van Den Handel
It is just my opinion, but when I query returns the wrong data this type of bug can not be pushed back to a version not likely to be releases for several years. The most unacceptable bug other than a server crash for a database is to return the wrong data in a query. This tells me sub partitions are not ready in 5.1 and never will be until 6.x. Closing this bug and setting it as a dupe as before (incorrectly) only helps you to finally release 5.1, and not ensure 5.1 is as stable as it needs to be.
[18 Sep 2008 14:34] Mattias Jonsson
Hi, Thanks for reopening this bug, there has been a mistake when only include the fix for bug#30573 in 6.0, I will back port that fix and queue it for 5.1.
[19 Sep 2008 4:52] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/54328

2688 Mattias Jonsson	2008-09-18
      Bug#30573: Ordered range scan over partitioned tables returns some rows twice
      and
      Bug#33555: Group By Query does not correctly aggregate partitions
      
      Backport of bug-33257 which is the same bug.
      
      read_range_*() calls was not passed to the partition handlers,
      but was translated to index_read/next family calls.
      Resulting in duplicates rows and wrong aggregations.
[19 Sep 2008 9:27] Alexey Botchkov
Looks ok to push
[9 Oct 2008 18:02] Bugs System
Pushed into 5.1.30  (revid:mattias.jonsson@sun.com-20080918194934-o50s3e3ohrzifira) (version source revid:kgeorge@mysql.com-20081001094725-vf4mqjkmajlm22qy) (pib:4)
[10 Oct 2008 13:09] Jon Stephens
Documented bugfix in the 5.1.29 changelog as follows:

        Ordered range scans on partitioned tables were not always handled
        correctly. In some cases this caused some rows to be returned twice. The
        same issue also caused GROUP BY query results to be aggregated 
        incorrectly.
[17 Oct 2008 16:44] Bugs System
Pushed into 6.0.8-alpha  (revid:mattias.jonsson@sun.com-20080918194934-o50s3e3ohrzifira) (version source revid:kgeorge@mysql.com-20081001100520-exs1tupnfanm1mij) (pib:5)
[28 Oct 2008 21:04] Bugs System
Pushed into 5.1.29-ndb-6.2.17  (revid:mattias.jonsson@sun.com-20080918194934-o50s3e3ohrzifira) (version source revid:tomas.ulin@sun.com-20081028140209-u4emkk1xphi5tkfb) (pib:5)
[28 Oct 2008 22:23] Bugs System
Pushed into 5.1.29-ndb-6.3.19  (revid:mattias.jonsson@sun.com-20080918194934-o50s3e3ohrzifira) (version source revid:tomas.ulin@sun.com-20081028194045-0353yg8cvd2c7dd1) (pib:5)
[1 Nov 2008 9:48] Bugs System
Pushed into 5.1.29-ndb-6.4.0  (revid:mattias.jonsson@sun.com-20080918194934-o50s3e3ohrzifira) (version source revid:jonas@mysql.com-20081101082305-qx5a1bj0z7i8ueys) (pib:5)