Bug #23856 GROUP_CONCAT and ORDER BY: junk from previous rows for query on I_S
Submitted: 1 Nov 2006 21:06 Modified: 7 Jun 2007 16:25
Reporter: Roland Bouman Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.1.13bk, 5.0 BK OS:Linux (Linux)
Assigned to: Martin Hansson CPU Architecture:Any
Tags: group_concat, order by

[1 Nov 2006 21:06] Roland Bouman
Description:
Group_concat sometimes 'inherits' junk data from previous rows (seems like the buffer is not emptied)

How to repeat:
use information_schema;

select   group_concat(column_type order by ordinal_position) 
from     columns 
group by table_name;

this returns non-sense like:

| float ) unsigned,float ) unsi,float ) unsigned,float                                                                                         |
| text ar(255) ned,text ar(255),text  

without the order by, everything is fine:

| int(10) unsigned,varchar(255),int(10) unsigned,float                                                                                               |
| int(10) unsigned,varchar(255),text  

mysql> select version();
+-----------------+
| version()       |
+-----------------+
| 5.1.13-beta-log |
+-----------------+
1 row in set (0.00 sec)

Suggested fix:
uhm..fix the buffer, don't output non-sense
[1 Nov 2006 21:58] Sveta Smirnova
Thank you for the report.

I can not repeat the issue on Linux using last BK sources.

Please, provide output of SHOW CREATE TABLE for that tables columns from which are broken in the output of GROUP_CONCAT.
[2 Nov 2006 7:10] Sveta Smirnova
Looks like you didn't understand my question.

Try issue statement

USE INFORMATION_SCHEMA;

SELECT   GROUP_CONCAT(COLUMN_TYPE ORDER BY ORDINAL_POSITION), TABLE_NAME 
FROM     COLUMNS 
GROUP BY TABLE_NAME;

Then find broken data in GROUP_CONCAT(COLUMN_TYPE ORDER BY ORDINAL_POSITION) column and provide output of SHOW CREATE TABLE table_name statement, where table_name you'll find in TABLE_NAME column of output of your SELECT statement.
[2 Nov 2006 7:35] Roland Bouman
mysql> USE I 
mysql> SELECT   GROUP_CONCAT(COLUMN_TYPE ORDER BY ORDINAL_POSITION)
    -> FROM     COLUMNS
    -> WHERE    TABLE_SCHEMA = schema()
    -> AND      TABLE_NAME = 'TRIGGERS'
    -> ;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| GROUP_CONCAT(COLUMN_TYPE ORDER BY ORDINAL_POSITION)                                                                                         |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| longtext )  ,longtext ) ,longtext ) ,longtext ),longtext )  ,longtext ) ,longtext ) ,longtext ,longtext,longtext,longtext ),longtext ),longtext ) ,longtext ) ,longtext ),longtext ),longtext,longtext,longtext |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

mysql> show create table TRIGGERS;
+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table    | Create Table                                             |
+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TRIGGERS | CREATE TEMPORARY TABLE "TRIGGERS" (
  "TRIGGER_CATALOG" varchar(512) DEFAULT NULL,
  "TRIGGER_SCHEMA" varchar(64) NOT NULL DEFAULT '',
  "TRIGGER_NAME" varchar(64) NOT NULL DEFAULT '',
  "EVENT_MANIPULATION" varchar(6) NOT NULL DEFAULT '',
  "EVENT_OBJECT_CATALOG" varchar(512) DEFAULT NULL,
  "EVENT_OBJECT_SCHEMA" varchar(64) NOT NULL DEFAULT '',
  "EVENT_OBJECT_TABLE" varchar(64) NOT NULL DEFAULT '',
  "ACTION_ORDER" bigint(4) NOT NULL DEFAULT '0',
  "ACTION_CONDITION" longtext,
  "ACTION_STATEMENT" longtext NOT NULL,
  "ACTION_ORIENTATION" varchar(9) NOT NULL DEFAULT '',
  "ACTION_TIMING" varchar(6) NOT NULL DEFAULT '',
  "ACTION_REFERENCE_OLD_TABLE" varchar(64) DEFAULT NULL,
  "ACTION_REFERENCE_NEW_TABLE" varchar(64) DEFAULT NULL,
  "ACTION_REFERENCE_OLD_ROW" varchar(3) NOT NULL DEFAULT '',
  "ACTION_REFERENCE_NEW_ROW" varchar(3) NOT NULL DEFAULT '',
  "CREATED" datetime DEFAULT NULL,
  "SQL_MODE" longtext NOT NULL,
  "DEFINER" longtext NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 |
+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql>
[2 Nov 2006 7:37] Roland Bouman
Whoops, sorry, seems I hit the submit button to fast. Sorry for the not so nice output in the previous comment (and the missing first line:

mysql> USE INFORMATION_SCHEMA;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed

)

But you get the idea - is see lots of 

"longtext   )"

there
[2 Nov 2006 12:18] Sveta Smirnova
I still can not repeat the problem in my environment.

Please, provide content of your configuration file and output of SHOW VARIABLES LIKE 'collation%'; and SHOW VARIABLES LIKE '%char%'; statements
[2 Nov 2006 13:07] Roland Bouman
Hi sveta, here is the SHOW output as requested:

mysql> SHOW
    -> VARIABLES LIKE 'collation%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | utf8_general_ci   |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)

mysql> SHOW VARIABLES LIKE '%char%';
+--------------------------+-----------------------------------------------+
| Variable_name            | Value                                         |
+--------------------------+-----------------------------------------------+
| character_set_client     | latin1                                        |
| character_set_connection | latin1                                        |
| character_set_database   | utf8                                          |
| character_set_filesystem | binary                                        |
| character_set_results    | latin1                                        |
| character_set_server     | latin1                                        |
| character_set_system     | utf8                                          |
| character_sets_dir       | /opt/mysql/mysql-5.1.13/share/mysql/charsets/ |
+--------------------------+-----------------------------------------------+
8 rows in set (0.00 sec)

providing /etc/my.cnf as attachment (yes, it is non-standard, possible culprits are the sql_mode and group_concat_max_len settings)
[2 Nov 2006 13:09] Roland Bouman
My /etc/my.cnf

Attachment: my.cnf (application/octet-stream, text), 17.66 KiB.

[2 Nov 2006 13:10] Roland Bouman
sveta, if its any help, this is how i compile:

cd mysql-5.1

sudo BUILD/compile-pentium-max \
--prefix=/opt/mysql/mysql-5.1.13 \

sudo make install
[3 Nov 2006 9:16] Sveta Smirnova
Verified as described on Linux using last BK sources. Version 5.0 is affected too.

group_concat_max_len=65535 is cause

Workaround: reduce group_concat_max_len to maximum 65534
[3 Nov 2006 9:17] Sveta Smirnova
OS changed to specify all are affected.
[17 Feb 2007 19:17] Pete Harlan
Another script that reproduces the group_concat_max_len problem

Attachment: bugreport.sql (text/x-sql), 2.19 KiB.

[17 Feb 2007 19:28] Pete Harlan
I uploaded a script with a small test-case that exhibits the problem, on the off chance it helps someone track this down.

The problem happens in the official community binaries for 5.0.27, as well as the latest BK sources (as of today) for 5.0.36, both tested on Debian Etch x86_64.

I wasn't able to reproduce the problem in 4.1.22.

It would have been nice from my perspective if setting the length to more than 2^16 - 2 had resulted in an error, rather than silently wrong results, this being a known bug.

I also noticed when looking at the code that group_concat_max_len is stored as a ulong, but its value is later assigned to a uint32, so if you set it to 2^32 + N it's as if you had set it to N.  Perhaps while someone is working on this they would want to disallow setting it to a larger value than 2^32 - 1.  This is obviously not that big a deal, but I'm mentioning it because I noticed it.

Thank you.
[20 Feb 2007 8:08] Roland Bouman
I can reproduce this on 5.1.15 with group_concat_max_len values smaller than 65534:

this query:

select group_concat(
           column_type 
       order by table_schema
       ,        table_name
       ,        ordinal_position
       ) 
from   information_schema.columns 
where  table_schema = 'mysql'
;

will show the problem when group_concat_max_len > 21844. With 21844, it will perform correctly.

However, I don't feel this is really a workaround - a result larger then this number will be truncated, and there is no good solution in these cases.
[5 Apr 2007 23:53] Kevin Regan
I'm seeing the same issue with 5.0.37.  The previously posted 21844 seems to be the magic number.  However, it is difficult to trust this call at all in its current state.

--Kevin

mysql> SELECT   GROUP_CONCAT(COLUMN_TYPE ORDER BY ORDINAL_POSITION) FROM     COLUMNS WHERE    TABLE_SCHEMA = schema() AND      TABLE_NAME = 'TRIGGERS';
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| GROUP_CONCAT(COLUMN_TYPE ORDER BY ORDINAL_POSITION)                                                                                                                                    |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| longtext3))),longtext3)),longtext3)),longtext3),longtext3))),longtext3)),longtext3)),longtext3,longtext,longtext,longtext3),longtext3),longtext3)),longtext3)),longtext3),longtext3),longtext,longtext,longtext |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
[6 Apr 2007 15:49] Pete Harlan
> it is difficult to trust this call at all in its current state.

I agree.  I looked at trying to fix it myself when I first ran into it, but someone who already knows MySQL memory management would clearly have a much easier time untangling that code.

Five months feels to me like long enough for a known memory-corruption bug to hang around.  Is there internal documentation about MySQL's memory management, besides the source code, that might help someone figure this out?
[11 Apr 2007 15:58] Sergei Golubchik
Unfortunately, there's no complete documentation for it.
But check this presentation - http://forge.mysql.com/wiki/MysysLibraryAndAlgorithms - it describes memory management too.
[25 Apr 2007 13:56] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/25416

ChangeSet@1.2443, 2007-04-25 15:56:26+03:00, mhansson@linux-st28.site +6 -0
  Bug #23856:GROUP_CONCAT and ORDER BY: junk from previous rows for query on I_S
  
  Item_func_group_concat::setup, responsible for initalizing placeholder fields to represent
  the group column for GROUP_CONCAT, had misunderstood the save_sum_fields argument to 
  create_tmp_table:
  
      "We need to set set_sum_field in true for storing value of blob in buffer
       of a record instead of a pointer of one."
  
  This is not true, it's currently not possible to get this behavior out of blobs.
  It would be desirable, though, since if we could do this we could keep 
  using the TREE data structure that does the ORDER BY sorting. At the present state of MySQL,
  blob fields can't be stored in this way since always keep a pointer to the actual data. So
  the fix is to use VARCHAR for the placeholder field instead. This 
  will lead to results being cut if they exceed the maximum VARCHAR length (65535 bytes), but 
  doing more is beyond a simple bug fix.
  
  There is still one problem, there is no warning about this cutting. Consider:
  
  create table t20b(a text character set utf8, b int);
  insert into t20b values (repeat('a', 0xffff+1), 0); 
  
  mysql> select length(group_concat(a)) from t20b;
  +-------------------------+
  | length(group_concat(a)) |
  +-------------------------+
  |                   32768 |
  +-------------------------+
  1 row in set, 1 warning (0.00 sec)
  
  mysql> show warnings;
  +---------+------+--------------------------------------+
  | Level   | Code | Message                              |
  +---------+------+--------------------------------------+
  | Warning | 1260 | 1 line(s) were cut by GROUP_CONCAT() |
  +---------+------+--------------------------------------+
  1 row in set (0.00 sec)
  
  mysql> select length(group_concat(a order by b)) from t20b;
  +------------------------------------+
  | length(group_concat(a order by b)) |
  +------------------------------------+
  |                              32768 |
  +------------------------------------+
  1 row in set (0.00 sec)
  
  Notice there's no warnings. This has nothing to do with our conversion to VARCHAR, though:
  
  create table t20v(a varchar(65528), b int);
  insert into t20v values (repeat('a', 0xffff), 0); 
  
  mysql> select length(group_concat(a)) from t20v;
  +-------------------------+
  | length(group_concat(a)) |
  +-------------------------+
  |                   32768 |
  +-------------------------+
  1 row in set, 1 warning (0.00 sec)
  
  mysql> show warnings;
  +---------+------+--------------------------------------+
  | Level   | Code | Message                              |
  +---------+------+--------------------------------------+
  | Warning | 1260 | 1 line(s) were cut by GROUP_CONCAT() |
  +---------+------+--------------------------------------+
  1 row in set (0.00 sec)
  
  mysql> select length(group_concat(a order by b)) from t20v;
  +------------------------------------+
  | length(group_concat(a order by b)) |
  +------------------------------------+
  |                              32768 |
  +------------------------------------+
  1 row in set (0.00 sec)
  
  This seems to be a bug in its own respect.
[26 Apr 2007 10:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/25490

ChangeSet@1.2443, 2007-04-26 12:57:59+03:00, mhansson@linux-st28.site +5 -0
  Bug #23856:GROUP_CONCAT and ORDER BY: junk from previous rows for query on I_S
  
  Item_func_group_concat::setup, responsible for initalizing placeholder fields to represent
  the group column for GROUP_CONCAT, had misunderstood the save_sum_fields argument to 
  create_tmp_table:
  
      "We need to set set_sum_field in true for storing value of blob in buffer
       of a record instead of a pointer of one."
  
  It's currently not possible to get this behavior out of blobs.
  It would be desirable, though, since if we could do this we could keep 
  using the TREE data structure that does the ORDER BY sorting. At the present state of 
  MySQL, blob fields can't be stored in this way since they always keep a pointer to the 
  actual data.
  So the fix is to always use VARCHAR for the placeholder field instead. This 
  will lead to results being cut if they exceed the maximum VARCHAR length (65535 bytes), 
  but doing more is beyond a simple bug fix.
[26 Apr 2007 14:11] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/25511

ChangeSet@1.2443, 2007-04-26 16:11:32+03:00, mhansson@linux-st28.site +5 -0
  Bug #23856:GROUP_CONCAT and ORDER BY: junk from previous rows for query on I_S
  
  Item_func_group_concat::setup, responsible for initalizing placeholder 
  fields to represent the group column for GROUP_CONCAT, had misunderstood 
  the save_sum_fields argument to create_tmp_table:
  
      "We need to set set_sum_field in true for storing value of blob in buffer
       of a record instead of a pointer of one."
  
  It's currently not possible to get this behavior out of blobs.
  It would be desirable, though, since if we could do this we could keep 
  using the TREE data structure that does the ORDER BY sorting. At the present 
  state of MySQL, blob fields can't be stored in this way since they always keep
  a pointer to the actual data.
  
  So the fix is to always use VARCHAR for the placeholder field instead. This 
  will lead to results being cut if they exceed the maximum VARCHAR length 
  (65535 bytes), but doing more is beyond a simple bug fix.
[26 Apr 2007 15:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/25523

ChangeSet@1.2443, 2007-04-26 17:35:35+03:00, mhansson@linux-st28.site +5 -0
  Bug #23856:GROUP_CONCAT and ORDER BY: junk from previous rows for query on I_S
  
  Item_func_group_concat::setup, responsible for initalizing placeholder 
  fields to represent the group column for GROUP_CONCAT, had misunderstood 
  the save_sum_fields argument to create_tmp_table:
  
      "We need to set set_sum_field in true for storing value of blob in buffer
       of a record instead of a pointer of one."
  
  It's currently not possible to get this behavior out of blobs.
  It would be desirable, though, since if we could do this we could keep 
  using the TREE data structure that does the ORDER BY sorting. At the present 
  state of MySQL, blob fields can't be stored in this way since they always keep
  a pointer to the actual data.
  
  So the fix is to always use VARCHAR for the placeholder field instead. This 
  will lead to results being cut if they exceed the maximum VARCHAR length 
  (65535 bytes), but doing more is beyond a simple bug fix.
[14 May 2007 14:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/26616

ChangeSet@1.2443, 2007-05-14 16:43:16+03:00, mhansson@linux-st28.site +5 -0
  Bug #23856:GROUP_CONCAT and ORDER BY: junk from previous rows for query on I_S
  
  Problem:
  When group_concat() is given an ORDER BY or DISTINCT option, it uses a
  TREE structure to sort/remove duplicates. The current implementation
  incorrectly tries to store BLOB's inside the TREE structure for sorting,
  because at present BLOB Fields always keep a pointer to the actual data
  and not the data itself.
  
  In particular Item_func_group_concat::setup calls create_tmp_table with
  save_sum_fields == TRUE. According to a (removed) comment this should 
  lead to blobs being stored inside the TREE, which is wrong. This flag's 
  only effect wrt GROUP_BY is that the blob is stored using do_save_blob 
  rather than do_conv_blob. There seems to be no difference between them.
  A to do comment is left above Field_copy::set for future work. 
  ( version >= 5.1 )
  
  Solution:
  Always use VARCHAR for the placeholder field. This leads to results being
  truncated when they exceed the maximum VARCHAR length (65535 bytes).
  In such cases we print a warning. See also Bug#28273: GROUP_CONCAT and 
  ORDER BY: No warning when result gets truncated.
[16 May 2007 8:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/26782

ChangeSet@1.2443, 2007-05-16 10:53:51+03:00, mhansson@linux-st28.site +6 -0
  Bug #23856:GROUP_CONCAT and ORDER BY: junk from previous rows for query on I_S
  
  Problem:
  When group_concat() is given an ORDER BY or DISTINCT option, it uses a
  TREE structure to sort/remove duplicates. The current implementation
  incorrectly tries to store BLOB's inside the TREE structure for sorting,
  because at present BLOB Fields always keep a pointer to the actual data
  and not the data itself.
  
  In particular Item_func_group_concat::setup calls create_tmp_table with
  save_sum_fields == TRUE. According to a (removed) comment this should 
  lead to blobs being stored inside the TREE, which is wrong. This flag's 
  only effect wrt GROUP_BY is that the blob is stored using do_save_blob 
  rather than do_conv_blob. There seems to be no difference between them.
  A to do comment is left above Field_copy::set for future work. 
  ( version >= 5.1 )
  
  Solution:
  Always use VARCHAR for the placeholder field. This leads to results being
  truncated when they exceed the maximum VARCHAR length (65535 bytes).
  In such cases we print a warning. See also Bug#28273: GROUP_CONCAT and 
  ORDER BY: No warning when result gets truncated.
[23 May 2007 9:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/27192

ChangeSet@1.2491, 2007-05-23 12:00:18+02:00, mhansson@dl145s.mysql.com +1 -0
  Bug #23856: GROUP_CONCAT and ORDER BY: junk from previous 
  rows for query on I_S
  
  This is a patch to get rid of some warnings about comparisons
  between singed and unsigned ints, which turn up on sapsrv1 and
  sapsrv2.
[4 Jun 2007 21:21] Bugs System
Pushed into 5.1.20-beta
[4 Jun 2007 21:22] Bugs System
Pushed into 5.0.44
[7 Jun 2007 16:25] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented bugfix in 5.0.44 and 5.1.20 changelogs.