Bug #13386 group_concat converts results to Server charset instead of table charset
Submitted: 21 Sep 2005 17:18 Modified: 3 May 2006 6:37
Reporter: Nathan Christiansen Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:4.1.14-standard OS:Linux (RH 8)
Assigned to: Ramil Kalimullin CPU Architecture:Any

[21 Sep 2005 17:18] Nathan Christiansen
Description:
The group_concat function converts its results to the server charset instead of the table's charset. This causes a query on a single table set with utf8 to return garbled results.

It happens on both engine=InnoDB and engine=MyISAM tables.

How to repeat:
mysql> CREATE TABLE test_utf8_group_concat (id int(10) unsigned NOT NULL auto_increment, text varchar(255) default NULL, PRIMARY KEY  (id)) engine=Innodb default charset=utf8;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into test_utf8_group_concat (text) values('This');
Query OK, 1 row affected (0.01 sec)

mysql> insert into test_utf8_group_concat (text) values('is');
Query OK, 1 row affected (0.01 sec)

mysql> insert into test_utf8_group_concat (text) values('my');
Query OK, 1 row affected (0.00 sec)

mysql> insert into test_utf8_group_concat (text) values('name™');
Query OK, 1 row affected (0.01 sec)

mysql> select * from test_utf8_group_concat;
+----+---------+
| id | text    |
+----+---------+
|  1 | This    |
|  2 | is      |
|  3 | my      |
|  4 | name™ |
+----+---------+
4 rows in set (0.00 sec)

mysql> select group_concat(text) from test_utf8_group_concat;
+------------------------+
| group_concat(text)     |
+------------------------+
| This,is,my,nameâ„¢ |
+------------------------+
1 row in set (0.00 sec)

Suggested fix:
I suggest that if all tables involved with the query have the same charset use that charset for the results of the function.

It should behave like a select query without the function.
[21 Sep 2005 18:20] MySQL Verification Team
I was unable to repeat with current source server:

miguel@hegel:~/dbs/4.1> bin/mysql -uroot test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2 to server version: 4.1.15-debug-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> CREATE TABLE test_utf8_group_concat (id int(10) unsigned NOT NULL
    -> auto_increment, text varchar(255) default NULL, PRIMARY KEY  (id)) engine=Innodb
    -> default charset=utf8;
Query OK, 0 rows affected (0.05 sec)

mysql>  insert into test_utf8_group_concat (text) values('This');
Query OK, 1 row affected (0.02 sec)

mysql> insert into test_utf8_group_concat (text) values('is');
Query OK, 1 row affected (0.02 sec)

mysql> insert into test_utf8_group_concat (text) values('my');
Query OK, 1 row affected (0.00 sec)

mysql> insert into test_utf8_group_concat (text) values('name™');
Query OK, 1 row affected (0.01 sec)

mysql> select * from test_utf8_group_concat;
+----+---------+
| id | text    |
+----+---------+
|  1 | This    |
|  2 | is      |
|  3 | my      |
|  4 | name™ |
+----+---------+
4 rows in set (0.00 sec)

mysql> select group_concat(text) from test_utf8_group_concat;
+--------------------+
| group_concat(text) |
+--------------------+
| This,is,my,name™ |
+--------------------+
1 row in set (0.00 sec)

mysql>
[21 Sep 2005 18:26] Jorge del Conde
mysql> select * from test_utf8_group_concat;                 
+----+-------+
| id | text  |
+----+-------+
|  1 | This  |
|  2 | is    |
|  3 | my    |
|  4 | name  |
|  5 | name™ |
+----+-------+
5 rows in set (0.00 sec)

mysql> select group_concat(text) from test_utf8_group_concat;
+-------------------------+
| group_concat(text)      |
+-------------------------+
| This,is,my,name,nameâ
                       ¢ |
+-------------------------+
1 row in set (0.00 sec)
[21 Sep 2005 18:44] Nathan Christiansen
I should have included the following info on my server settings:

mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
7 rows in set (0.00 sec)
[3 May 2006 6:37] Ramil Kalimullin
I couldn't repeate the bug using the latest 4.1 tree (4.1.20).