Bug #94906 select group_concat(distinct col0, col1) seems to be wrong
Submitted: 4 Apr 2019 10:12 Modified: 8 Apr 2019 14:06
Reporter: Huaiyu Xu Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:5.7.23 OS:MacOS (10.12.6)
Assigned to: CPU Architecture:x86 (2.7 GHz Intel Core i5)
Tags: aggregation function

[4 Apr 2019 10:12] Huaiyu Xu
Description:
CREATE TABLE `t` (
  `col_float_unsigned_key` float unsigned DEFAULT NULL,
  `col_bigint_unsigned_key` bigint(20) unsigned DEFAULT NULL,
  `col_decimal_6_3_unsigned` decimal(6,3) unsigned DEFAULT NULL
);

insert into t values(12.9873,13,0.000),(12.9873,13,0.000),(12.9873,0,0.000),(12.9873,NULL,0.000),(12.9873,13,12.987),(12.9873,1,30.000),(12.9873,1,0.112);

select col_float_unsigned_key,  group_concat(distinct `col_bigint_unsigned_key`, `col_decimal_6_3_unsigned`) from t ;

mysql> select col_float_unsigned_key,  group_concat(distinct `col_bigint_unsigned_key`, `col_decimal_6_3_unsigned`) from t ;
+------------------------+------------------------------------------------------------------------------+
| col_float_unsigned_key | group_concat(distinct `col_bigint_unsigned_key`, `col_decimal_6_3_unsigned`) |
+------------------------+------------------------------------------------------------------------------+
|                12.9873 | 00.000,10.112,130.000,130.000,1312.987                                       |
+------------------------+------------------------------------------------------------------------------+
1 row in set (0.00 sec)

There are two `130.000`s.
I think it's unexcepted.

How to repeat:
CREATE TABLE `t` (
  `col_float_unsigned_key` float unsigned DEFAULT NULL,
  `col_bigint_unsigned_key` bigint(20) unsigned DEFAULT NULL,
  `col_decimal_6_3_unsigned` decimal(6,3) unsigned DEFAULT NULL
);

insert into t values(12.9873,13,0.000),(12.9873,13,0.000),(12.9873,0,0.000),(12.9873,NULL,0.000),(12.9873,13,12.987),(12.9873,1,30.000),(12.9873,1,0.112);

select col_float_unsigned_key,  group_concat(distinct `col_bigint_unsigned_key`, `col_decimal_6_3_unsigned`) from t ;
[4 Apr 2019 13:14] MySQL Verification Team
Hi,

Thank you for your bug report.

However, this is not a bug. You are simply using that function in the way that is not how it is designed to function. Among other problems with your query is that your query is not an aggregating one.

You can read more on this subject in our Reference Manual, in the following chapter:

12.19.1 Aggregate (GROUP BY) Function Descriptions

Not a bug.
[8 Apr 2019 3:03] Huaiyu Xu
Hi, is the following sql an aggregating one?
And, I just do not know why there are two `130.000`s?

mysql> select   group_concat(distinct `col_bigint_unsigned_key`, `col_decimal_6_3_unsigned`) from t group by col_float_unsigned_key;
+------------------------------------------------------------------------------+
| group_concat(distinct `col_bigint_unsigned_key`, `col_decimal_6_3_unsigned`) |
+------------------------------------------------------------------------------+
| 00.000,10.112,130.000,130.000,1312.987                                       |
+------------------------------------------------------------------------------+
1 row in set (0.00 sec)
[8 Apr 2019 13:13] MySQL Verification Team
Hi again,

I have forwarded you to our documentation.

It says that this function returns a concatenated string. Hence, it is up to you to format that string. What you are getting are values of both columns that are concatenated.
[8 Apr 2019 13:32] Norvald Ryeng
Hi,

I agree with the "not a bug" decision. Here's what happens in this query:

It is a query with an aggregation function, but without any GROUP BY clause. The result is implicit grouping into a single group. Because we also select the ungrouped column, we need to turn off sql_mode=only_full_group_by to make it work. I used SET sql_mode='' in my testing.

GROUP_CONCAT(col1, col2) will concatenate the col1 and col2 values for a single row together without any separator. Then, it will concatenate the rows using a comma as separator. Since there is a DISTINCT clause in the GROUP_CONCAT, only distinct combinations of (col1, col2) are used. Note: The column values must be distinct, but the resulting concatenation may not be.

Let's split up the query and look at the parts.

First, we look at the distinct combinations of the two columns:

mysql> select distinct `col_bigint_unsigned_key`, `col_decimal_6_3_unsigned` from t;
+-------------------------+--------------------------+
| col_bigint_unsigned_key | col_decimal_6_3_unsigned |
+-------------------------+--------------------------+
|                      13 |                    0.000 |
|                       0 |                    0.000 |
|                    NULL |                    0.000 |
|                      13 |                   12.987 |
|                       1 |                   30.000 |
|                       1 |                    0.112 |
+-------------------------+--------------------------+
6 rows in set (0,00 sec)

That is what GROUP_CONCAT computes internally before concatenating. We can then concatenate each row:

mysql> select concat(`col_bigint_unsigned_key`, `col_decimal_6_3_unsigned`) from (select distinct `col_bigint_unsigned_key`, `col_decimal_6_3_unsigned` from t) as a;
+---------------------------------------------------------------+
| concat(`col_bigint_unsigned_key`, `col_decimal_6_3_unsigned`) |
+---------------------------------------------------------------+
| 130.000                                                       |
| 00.000                                                        |
| NULL                                                          |
| 1312.987                                                      |
| 130.000                                                       |
| 10.112                                                        |
+---------------------------------------------------------------+
6 rows in set (0,00 sec)

Here, as in the original query result, the string "130.000" appears twice. That is because one is the concatenation of "13" and "0.000", while the other is the concatenation of "1" and "30.000". The two rows that we concatenate are distinct, but the result of the concatenation is not.

If we use GROUP_CONCAT, the NULL value is skipped, so we get only the other values:

mysql> select group_concat(`col_bigint_unsigned_key`, `col_decimal_6_3_unsigned`) from (select distinct `col_bigint_unsigned_key`, `col_decimal_6_3_unsigned` from t) as a;
+---------------------------------------------------------------------+
| group_concat(`col_bigint_unsigned_key`, `col_decimal_6_3_unsigned`) |
+---------------------------------------------------------------------+
| 130.000,00.000,1312.987,130.000,10.112                              |
+---------------------------------------------------------------------+
1 row in set (0,00 sec)

And that is why "130.000" appears twice in the result.

Note: The order is different from the result posted above, but that's because the query has no ORDER BY clause. Please note that implicit ordering of GROUP BY was removed in 8.0.

I hope that explains it!

Best regards,

Norvald
[8 Apr 2019 13:36] MySQL Verification Team
Norvald,

Thank you very much for your comment.
[8 Apr 2019 14:06] Huaiyu Xu
Okay, I got it.
Thank you all.
[8 Apr 2019 14:09] MySQL Verification Team
Not a problem.

 It was a pleasure.