Bug #46735 GROUP_CONCAT colation or something
Submitted: 14 Aug 2009 17:43 Modified: 17 Aug 2009 17:22
Reporter: Miran Cvenkel Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:5.1.31-community OS:Windows
Assigned to: CPU Architecture:Any
Tags: BLOB, colation, group_concat

[14 Aug 2009 17:43] Miran Cvenkel
Description:
Critical coz I cant figure out how to do workaround.

The characters like č,ž,š gets lost, reading them in java for each of them returns '\ufffd\ufffd', the same for any of them. Seems this happens when result of GROUP_CONCAT becomes BLOB

How to repeat:

CREATE TABLE `sifrant_galery_color` (
  `id_color` int(10) unsigned NOT NULL,
  `l2` varchar(2) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `term` varchar(45) COLLATE utf8_slovenian_ci NOT NULL DEFAULT ' ',
  PRIMARY KEY (`id_color`,`l2`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_slovenian_ci;

INSERT INTO `sifrant_galery_color` (`id_color`,`l2`,`term`) VALUES 
 (1,'en','white/cream'),
 (1,'sl','bela/krem'),
 (4,'en','green'),
 (4,'sl','zelena'),
 (3,'en','Orange'),
 (3,'sl','Oranžna'),
 (2,'en','Red/violet/pink'),
 (2,'sl','Rdeča/vijolična/roza'),
 (11,'en','braun'),
 (11,'sl','rjava'),
 (6,'en','Blue'),
 (6,'sl','Modra'),
 (9,'sl','črna'),
 (9,'en','black'),
 (8,'en','yellow'),
 (8,'sl','rumena'),
 (10,'en','gray'),
 (10,'sl','siva'),
 (12,'en','transparent/glassy'),
 (12,'sl','prozorna/steklasta');

/*result of this is ok*/
SELECT GROUP_CONCAT(term) FROM sifrant_galery_color s;
/*result of this is not ok*/
SELECT GROUP_CONCAT(distinct CONCAT(term,'|',id_color)) as term FROM sifrant_galery_color
[14 Aug 2009 19:07] Miran Cvenkel
ok the solution to this is:

SELECT GROUP_CONCAT(distinct CONCAT(term,'|',CAST(id_color as CHAR))) as term FROM sifrant_galery_color
[16 Aug 2009 7:40] Sveta Smirnova
Thank you for the report.

If start mysql command line client with option --column-type-info you'll see:

mysql> SELECT GROUP_CONCAT(term) FROM sifrant_galery_color s;Field   1:  `GROUP_CONCAT(term)`Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       VAR_STRING
Collation:  utf8_general_ci (33)
Length:     1023
Max_length: 186
Decimals:   0
Flags:      

+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| GROUP_CONCAT(term)                                                                                                                                                                         |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| white/cream,bela/krem,green,zelena,Orange,Oranžna,Red/violet/pink,Rdeča/vijolična/roza,braun,rjava,Blue,Modra,črna,black,yellow,rumena,gray,siva,transparent/glassy,prozorna/steklasta |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.22 sec)

Please note this is VAR_STRING with collation utf8_general_ci 

While

mysql> SELECT GROUP_CONCAT(distinct CONCAT(term,'|',id_color)) as term FROM sifrant_galery_color;
Field   1:  `term`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       BLOB
Collation:  binary (63)
Length:     1024
Max_length: 232
Decimals:   0
Flags:      BINARY 

+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| term                                                                                                                                                                                                                                     |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| white/cream|1,bela/krem|1,green|4,zelena|4,Orange|3,Oranžna|3,Red/violet/pink|2,Rdeča/vijolična/roza|2,braun|11,rjava|11,Blue|6,Modra|6,črna|9,black|9,yellow|8,rumena|8,gray|10,siva|10,transparent/glassy|12,prozorna/steklasta|12 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

is BLOB with collation binary (63)

So Java has no knowledge about how it should display this string. So this is not a bug.
[17 Aug 2009 17:22] Sergei Golubchik
This is a bug in the sense that the result should not have been 'binary'.
It'll go away when the WL#2649 patch will be pushed.