Bug #53064 garbled data when using utf8_german2_ci collation
Submitted: 22 Apr 2010 13:57
Modified: 4 Aug 2010 22:40
Reporter: Bernt Marius Johnsen
Status: Closed
Category: MySQL Server: Charsets
Severity: S3 (Non-critical)
Version: next-mr-bar (2010-04-22)
OS: Linux
Assigned to:
CPU Architecture: Any

[22 Apr 2010 13:57] Bernt Marius Johnsen
Description:
Data gets garbled when using the utf8_german2_ci collation, but not in all contexts. There seems to be some uninitialized data somewhere.

The following SQL reproduces the problem:

drop table if exists t1;
create table t1 (s1 varchar(10) collate utf16_german2_ci);
insert into t1 values ('a'),('ae'),('af'),(_utf16 0x00e4);
select s1,hex(s1),hex(weight_string(s1)) from t1 order by s1;

drop table if exists t1;
create table t1 (s1 varchar(10) collate utf8_german2_ci);
insert into t1 values ('a'),('ae'),('af'),(_utf16 0x00e4);
select s1,hex(s1),hex(weight_string(s1)) from t1 order by s1;

Gives the following result:

+------+----------+------------------------+
| s1   | hex(s1)  | hex(weight_string(s1)) |
+------+----------+------------------------+
| a    | 0061     | 0E33                   |
| ae   | 00610065 | 0E330E8B               |
| ä    | 00E4     | 0E330E8B               |
| af   | 00610066 | 0E330EB9               |
+------+----------+------------------------+
+-------+------------+------------------------+
| s1    | hex(s1)    | hex(weight_string(s1)) |
+-------+------------+------------------------+
| aa f  | 61610066   | 0E330E330EB9           |
| ä     | C3A4       | 0E330E8B               |
| ae f  | 6165006600 | 0E330E8B0EB9           |
| af f  | 6166006600 | 0E330EB90EB9           |
+-------+------------+------------------------+

How to repeat:
See description
[23 Apr 2010 19:56] Sveta Smirnova
Thank you for the report.

Verified as described.
[5 May 2010 14:23] Alexander Barkov
The same problem is repeatable with just the utf8 part:

drop table if exists t1;
create table t1 (s1 varchar(10) collate utf8_german2_ci);
insert into t1 values ('a'),('ae'),('af'),(_utf16 0x00e4);
select s1,hex(s1),hex(weight_string(s1)) from t1 order by s1;
[5 May 2010 14:24] Alexander Barkov
A simpler test that reproduces the same problem:

drop table if exists t1;
create table t1 (s1 varchar(10) collate utf8_german2_ci);
insert into t1 values ('a'),('ae'),('af');
select s1,hex(s1),hex(weight_string(s1)) from t1 order by s1;
[5 May 2010 14:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107541

3123 Alexander Barkov	2010-05-05
      Bug#53064 garbled data when using utf8_german2_ci collation
      
      Problem: mbminlen for my_charset_utf8_german2_ci was set to 3 by mistake.
      Fix: changing mbminlen to 1.
[21 Jun 2010 6:12] Bugs System
Pushed into mysql-next-mr (revid:bar@mysql.com-20100617072236-vx5eqygof70izuho) (version source revid:bar@mysql.com-20100617072236-vx5eqygof70izuho) (pib:16)
[4 Aug 2010 8:10] Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alik@ibmvm-20100804080001-bny5271e65xo34ig) (version source revid:bar@mysql.com-20100617072236-vx5eqygof70izuho) (merge vers: 5.6.99-m4) (pib:18)
[4 Aug 2010 8:25] Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alik@ibmvm-20100804081533-c1d3rbipo9e8rt1s) (version source revid:bar@mysql.com-20100617072236-vx5eqygof70izuho) (merge vers: 5.6.99-m4) (pib:18)
[4 Aug 2010 22:40] Paul DuBois
Bug does not appear in any released 5.6.x version.
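
The fix can be checked by rerunning the simplified test with byte and character lengths added. This is only a sketch and assumes a server build that includes the patch. The length columns are an addition for illustration and were not part of the original test case.

```sql
-- Sketch of a regression check; assumes a build containing the mbminlen fix.
drop table if exists t1;
create table t1 (s1 varchar(10) collate utf8_german2_ci);
insert into t1 values ('a'),('ae'),('af');

-- In utf8 each of these ASCII letters occupies one byte, so hex(s1) should
-- read 61, 6165 and 6166, and length(s1) (bytes) should equal
-- char_length(s1) (characters) in every row.
select s1, hex(s1), length(s1), char_length(s1), hex(weight_string(s1))
from t1 order by s1;

drop table t1;
```

Extra bytes in hex(s1), such as the embedded 00 bytes shown in the original report, would suggest the server is still treating the collation as having a minimum character length greater than 1.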