Bug #32798 DISTINCT in GROUP_CONCAT clause fails when ordering by a column with null values
Submitted: 27 Nov 2007 23:09 Modified: 28 Jan 2008 17:41
Reporter: Tim Harper Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.45/5.1/4.1 OS:Windows (Vista)
Assigned to: Martin Hansson CPU Architecture:Any
Tags: distinct, group_concat

[27 Nov 2007 23:09] Tim Harper
Description:
If I want a distinct list of values via group_concat, ordered by a column that happens to have a null value in it, group_concat falls apart.

How to repeat:
Execute the following to create test data:

CREATE TABLE `test` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `value` int(10) unsigned default NULL,
  `order` int(10) unsigned default NULL,
  PRIMARY KEY  (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1;

INSERT INTO `test` VALUES (1,1,NULL),(2,2,NULL),(3,3,NULL),(4,3,1);

----

This fails:
SELECT GROUP_CONCAT(DISTINCT value order by `order`) FROM test t;

(returns 3,2,1,3)

This succeeds
SELECT GROUP_CONCAT(DISTINCT value order by if(`order`, 0, `order`)) FROM test t;

(returns 3,2,1)
[27 Nov 2007 23:12] Tim Harper
I've verified this on linux as well, MySQL 5.0.27
[27 Nov 2007 23:26] Tim Harper
Another note:

This works:
SELECT GROUP_CONCAT(DISTINCT value order by if(`order`, 0, `order`)) FROM test t;

But this does not:
SELECT GROUP_CONCAT(DISTINCT value order by if(`order`, `order`, 0)) FROM test t;

You'd expect it to be the other way around, as the prior if statement produces mixed null / non-null values to order by, and the latter produces all non-null values, which under normal circumstances (not using a function) would not manifest the bug this report describes.

Additionally, this does not work either:

SELECT GROUP_CONCAT(DISTINCT value order by COALESCE(`order`, 0)) FROM test t;

The only feasable work-around I've been able to find is make sure any column I'm sorting by in a GROUP_CONCAT is "DEFAULT 0 NOT NULL", unfortunately.
[28 Nov 2007 0:26] MySQL Verification Team
Thank you for the bug report.

mysql> SELECT GROUP_CONCAT(DISTINCT value order by `order`) FROM test t;
+-----------------------------------------------+
| GROUP_CONCAT(DISTINCT value order by `order`) |
+-----------------------------------------------+
| 3,2,1,3                                       |
+-----------------------------------------------+
1 row in set (0.00 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.0.52-nt |
+-----------+
1 row in set (0.00 sec)

mysql>

mysql> SELECT GROUP_CONCAT(DISTINCT value order by `order`) FROM test t;
+-----------------------------------------------+
| GROUP_CONCAT(DISTINCT value order by `order`) |
+-----------------------------------------------+
| 3,2,1,3                                       |
+-----------------------------------------------+
1 row in set (0.00 sec)

mysql> select version();
+--------------+
| version()    |
+--------------+
| 5.1.23-rc-nt |
+--------------+
1 row in set (0.00 sec)

mysql>

mysql> SELECT GROUP_CONCAT(DISTINCT value order by `order`) FROM test t;
+-----------------------------------------------+
| GROUP_CONCAT(DISTINCT value order by `order`) |
+-----------------------------------------------+
| 3,2,1,3                                       |
+-----------------------------------------------+
1 row in set (0.00 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 4.1.24-nt |
+-----------+
1 row in set (0.00 sec)

mysql>
[30 Nov 2007 13:43] Martin Hansson
Found this comment in the code.

/*****************************************************************************
 GROUP_CONCAT function

 SQL SYNTAX:
  GROUP_CONCAT([DISTINCT] expr,... [ORDER BY col [ASC|DESC],...]
    [SEPARATOR str_const])

 concat of values from "group by" operation

 BUGS
   DISTINCT and ORDER BY only works if ORDER BY uses all fields and only fields
   in expression list
   Blobs doesn't work with DISTINCT or ORDER BY
*****************************************************************************/

If you put such a comment in the code instead of fixing the bug, it suggests that it's fairly large task to fix the bug. I suspect this is the case.
[6 Dec 2007 9:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/39390

ChangeSet@1.2546, 2007-12-06 10:47:13+01:00, mhansson@linux-st28.site +4 -0
  Bug#32798: DISTINCT in GROUP_CONCAT clause fails when ordering by a column 
  with null values
  
  For queries containing GROUP_CONCAT(DISTINCT fields ORDER BY fields), there 
  was a limitation that the DISTINCT fields had to be the same as ORDER BY 
  fields, owing to the fact that one single sorted tree was used for keeping 
  track of tuples, ordering and uniqueness. Fixed by introducing a second 
  structure to handle uniqueness so that the original structure has only to 
  order the result.
[11 Dec 2007 11:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/39703

ChangeSet@1.2589, 2007-12-11 12:04:00+01:00, mhansson@linux-st28.site +3 -0
  Bug#32798: DISTINCT in GROUP_CONCAT clause fails when ordering 
  by a column with null values
  
  There is only one data structure to keep track of both
  sorting and uniqueness when using 
  GROUP_CONCAT(DISTINCT <fields> ORDER BY <fields>), which
  gives wrong result in the general case.
  Fixed in 5.0 by enforcing a rule that expressions cannot 
  be used and that one list of fields must be a prefix of the other.
[12 Dec 2007 11:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/39769

ChangeSet@1.2587, 2007-12-12 12:29:46+01:00, mhansson@linux-st28.site +4 -0
  Bug#32798: DISTINCT in GROUP_CONCAT clause fails when ordering by a column 
  with null values
  
  For queries containing GROUP_CONCAT(DISTINCT fields ORDER BY fields), there 
  was a limitation that the DISTINCT fields had to be the same as ORDER BY 
  fields, owing to the fact that one single sorted tree was used for keeping 
  track of tuples, ordering and uniqueness. Fixed in 5.1 and later by 
  introducing a second structure to handle uniqueness so that the original 
  structure has only to order the result.
[13 Dec 2007 10:32] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/39878

ChangeSet@1.2587, 2007-12-12 12:25:51+01:00, mhansson@linux-st28.site +4 -0
  Bug#32798: DISTINCT in GROUP_CONCAT clause fails when ordering by a column 
  with null values
  
  For queries containing GROUP_CONCAT(DISTINCT fields ORDER BY fields), there 
  was a limitation that the DISTINCT fields had to be the same as ORDER BY 
  fields, owing to the fact that one single sorted tree was used for keeping 
  track of tuples, ordering and uniqueness. Fixed in 5.1 and later by 
  introducing a second structure to handle uniqueness so that the original 
  structure has only to order the result.
[13 Dec 2007 13:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/39888

ChangeSet@1.2587, 2007-12-13 14:04:53+01:00, mhansson@linux-st28.site +4 -0
  Bug#32798: DISTINCT in GROUP_CONCAT clause fails when ordering by a column 
  with null values
  
  For queries containing GROUP_CONCAT(DISTINCT fields ORDER BY fields), there 
  was a limitation that the DISTINCT fields had to be the same as ORDER BY 
  fields, owing to the fact that one single sorted tree was used for keeping 
  track of tuples, ordering and uniqueness. Fixed by introducing a second 
  structure to handle uniqueness so that the original structure has only to 
  order the result.
[14 Dec 2007 10:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/39976

ChangeSet@1.2546, 2007-12-14 11:44:32+01:00, mhansson@linux-st28.site +4 -0
  Bug#32798: DISTINCT in GROUP_CONCAT clause fails when ordering by a column 
  with null values
  
  For queries containing GROUP_CONCAT(DISTINCT fields ORDER BY fields), there 
  was a limitation that the DISTINCT fields had to be the same as ORDER BY 
  fields, owing to the fact that one single sorted tree was used for keeping 
  track of tuples, ordering and uniqueness. Fixed by introducing a second 
  structure to handle uniqueness so that the original structure has only to 
  order the result.
[14 Dec 2007 11:24] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/39980

ChangeSet@1.2546, 2007-12-14 12:24:20+01:00, mhansson@linux-st28.site +4 -0
  Bug#32798: DISTINCT in GROUP_CONCAT clause fails when ordering by a column 
  with null values
  
  For queries containing GROUP_CONCAT(DISTINCT fields ORDER BY fields), there 
  was a limitation that the DISTINCT fields had to be the same as ORDER BY 
  fields, owing to the fact that one single sorted tree was used for keeping 
  track of tuples, ordering and uniqueness. Fixed by introducing a second 
  structure to handle uniqueness so that the original structure has only to 
  order the result.
[14 Dec 2007 12:32] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/39983

ChangeSet@1.2588, 2007-12-14 13:33:39+01:00, mhansson@linux-st28.site +1 -0
  Bug#32798:Post-merge fix
[14 Dec 2007 12:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/39984

ChangeSet@1.2588, 2007-12-14 13:35:52+01:00, mhansson@linux-st28.site +1 -0
  Bug#32798: DISTINCT in GROUP_CONCAT clause fails when ordering by 
  a column with null values
  
  Post-merge fix
[11 Jan 2008 12:17] Bugs System
Pushed into 6.0.5-alpha
[11 Jan 2008 12:20] Bugs System
Pushed into 5.1.23-rc
[11 Jan 2008 12:22] Bugs System
Pushed into 5.0.56
[28 Jan 2008 17:41] Paul DuBois
Noted in 5.0.56, 5.1.23, 6.0.5 changelogs.

For queries containing GROUP_CONCAT(DISTINCT col_list ORDER BY
col_list), there was a limitation that the DISTINCT columns had to be
the same as ORDER BY columns. Incorrect results could be returned if
this was not true.
[8 May 2008 2:41] Paul DuBois
This bugfix caused a regression that was fixed as Bug#35298.