| Bug #37555 | Collation utf8mb3_danish_ci shows wrong order in table if 'order by' clause used | ||
|---|---|---|---|
| Submitted: | 20 Jun 2008 19:32 | Modified: | 22 Jun 2008 15:46 |
| Reporter: | Hema Sridharan | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server: Charsets | Severity: | S3 (Non-critical) |
| Version: | mysql-6.0-backup | OS: | Linux |
| Assigned to: | CPU Architecture: | Any | |
[20 Jun 2008 21:27]
Peter Laursen
@hema .. this is *not a bug*. In Danish and Norwegian collations the double character sequence 'aa' is identical to the special Nordic character 'å'. The 3 special Danish characters æ,ø and å are alphabetized after z like "a,b.c ... x,y,x,æ,ø,å". That said, MySQL should support both 'traditional' and 'modern' Danish collations. The meaning of 'aa' = 'å' was officially abandoned in Danish language with the language reform of 1953 (I do not know about Norwegian), but you find the old form frequently in both geographical names (example: 'Aalborg') and person surnames (example: 'Østergaard') still. And it is *very rare* in Danish to have a 'aa' sequence NOT meaning 'å' (actually I can only think about words of Frisian/Dutch/Flemish origin) Peter not a MySQL person - and Danish!)
[20 Jun 2008 21:28]
Peter Laursen
sorry .. typo! The 3 special Danish characters æ,ø and å are alphabetized after z like "a,b.c ... x,y,z,æ,ø,å".
[20 Jun 2008 21:33]
Peter Laursen
also compare with languages/collation where ä = ae ö = oe ß = ss etc ...
[20 Jun 2008 21:54]
Peter Laursen
one last comment: this is not specific for utf8mb3 character set. You get the same with every character set supporting Danish: latin1 and any unicode character set (also utf8, utf16, utf32 and ucs2). Your result is identical to: bb cc dd ee å .. and I think that explains!
[22 Jun 2008 15:46]
Sveta Smirnova
Thank you for the report. Closed as "Not a Bug" for reasons explained by Peter. See also http://www.collation-charts.org/mysql60/mysql604.utf8_danish_ci.html But there is exception in Peter's explanation: latin1_danish_ci doesn't consider 'aa' > 'bb'
[22 Jun 2008 17:06]
Peter Laursen
ok .. Sveta is correct in her last comment set names utf8; select 'aa' > 'bb' collate utf8_general_ci; -- returns 0 select 'aa' > 'bb' collate utf8_danish_ci; -- returns 1 set names latin1; select 'aa' > 'bb' collate latin1_swedish_ci; -- returns 0 select 'aa' > 'bb' collate latin1_danish_ci; -- returns 0 .. but that is *really silly* - Danish is Danish no matter the charset! I file another bug report about this!
[22 Jun 2008 17:27]
Peter Laursen
we continue here: http://bugs.mysql.com/bug.php?id=37571 :-)

Description: 1) I create database with utf8mb3 character set and collate utf8mb3_danish_ci. 2) I create tables with the same character set and collation. 3) I insert some values in to the table. 4) I perform select * from <table> order by <colname>; The order by clause doesn't show data contents in proper order for char/varchar columns only when the collation used is utf8mb3_danish_ci. create database x character set utf8mb3 collate utf8mb3_danish_ci; set names utf8mb3; use x; create table t1 (name char(20))character set utf8mb3 collate utf8mb3_danish_ci; insert into t1 values('aa'),('bb'),('cc'),('dd'),('ee'); select * from t1; select * from t1 order by name; How to repeat: mysql> create database x character set utf8mb3 collate utf8mb3_danish_ci; Query OK, 1 row affected (0.01 sec) mysql> set names utf8mb3; Query OK, 0 rows affected (0.00 sec) mysql> use x; Database changed mysql> create table t1 (name char(20))character set utf8mb3 collate utf8mb3_danish_ci; Query OK, 0 rows affected (0.01 sec) mysql> insert into t1 values('aa'),('bb'),('cc'),('dd'),('ee'); Query OK, 5 rows affected (0.01 sec) Records: 5 Duplicates: 0 Warnings: 0 mysql> select * from t1; +------+ | name | +------+ | aa | | bb | | cc | | dd | | ee | +------+ mysql> select * from t1 order by name; +------+ | name | +------+ | bb | | cc | | dd | | ee | | aa | +------+ 5 rows in set (0.00 sec)