Bug #25410 Sorting error in latin1_danish_ci
Submitted: 4 Jan 2007 11:50 Modified: 11 Jan 2007 16:54
Reporter: Johan Idrén Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Charsets Severity:S2 (Serious)
Version:5.0.30-enterprise-gpl-log, 5.1 BK OS:MacOS (Mac OS X 10.4.8)
Assigned to: CPU Architecture:Any

[4 Jan 2007 11:50] Johan Idrén
Description:
When sorting danish characters, å gets sorted before æ, this is not correct.

The correct sort-order is a b c d e f g h i j k l m n o p q r s t u v w x y z æ ø å
Appears to be a b c d e f g h i j k l m n o p q r s t u v w x y z å æ ø (like swedish)

How to repeat:
mysql> create table coll (a char(15)) DEFAULT CHARSET=latin1 COLLATE=latin1_danish_ci;
Query OK, 0 rows affected (0.07 sec)

mysql> INSERT INTO coll VALUES('æ'),('ø'),('å');
Query OK, 3 rows affected (0.03 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM coll ORDER BY a;
+------+
| a    |
+------+
| å   | 
| æ   | 
| ø   | 
+------+
3 rows in set (0.00 sec)

mysql> show variables like 'coll%';
+----------------------+------------------+
| Variable_name        | Value            |
+----------------------+------------------+
| collation_connection | latin1_danish_ci | 
| collation_database   | utf8_general_ci  | 
| collation_server     | utf8_general_ci  | 
+----------------------+------------------+
3 rows in set (0.00 sec)

Suggested fix:
Change sort order
[4 Jan 2007 11:59] Johan Idrén
Source of proper sorting:
http://en.wikipedia.org/wiki/Danish_alphabet
http://en.wikipedia.org/wiki/Norwegian_language#The_alphabet
[10 Jan 2007 0:54] Peter Gulutzan
This is what I get:

mysql> SELECT VERSION();
+--------------+
| VERSION()    |
+--------------+
| 5.0.34-debug |
+--------------+
1 row in set (0.00 sec)

mysql> SET NAMES UTF8;
Query OK, 0 rows affected (0.00 sec)

mysql> create table coll (a char(15)) DEFAULT CHARSET=latin1 COLLATE=latin1_danish_ci;
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO coll VALUES('æ'),('ø'),('å');
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM coll ORDER BY a;
+------+
| a    |
+------+
| æ   |
| ø   |
| å   |
+------+
3 rows in set (0.00 sec)

So I'm mystified how this bug got verified.
Perhaps it's something to do with MAC OS,
or version, or character set of connection.

I have put this back to "Open". Could somebody, e.g. Miguel, please re-verify?
[10 Jan 2007 7:14] Sveta Smirnova
Thank you for the report.

Verified as described on Intel Mac using last BK sources.

Version 5.1 is affected too.
[11 Jan 2007 11:11] Sveta Smirnova
Test

Attachment: bug25410.test (application/octet-stream, text), 226 bytes.

[11 Jan 2007 11:11] Sveta Smirnova
Result file

Attachment: bug25410.result (application/octet-stream, text), 228 bytes.

[11 Jan 2007 11:12] Sveta Smirnova
Test and wrong result showing wrong sort order are attached.
[11 Jan 2007 11:23] Sveta Smirnova
Result file with corrected connection_collation

Attachment: bug25410.result (application/octet-stream, text), 457 bytes.

[11 Jan 2007 11:24] Sveta Smirnova
Test with corrected connection collation

Attachment: bug25410.test (application/octet-stream, text), 327 bytes.

[11 Jan 2007 16:54] Sveta Smirnova
There is error in the original CREATE TABLE statement. Code below shows difference:

--disable_warnings
DROP TABLE IF EXISTS t1;
--enable_warnings

set names latin1;

set @@collation_connection=latin1_danish_ci;

create table t1 (s1 char(5) character set latin1) COLLATE=latin1_danish_ci;

insert into t1 values ('æ'),('ø'),('å');

select s1,hex(s1),length(s1) from t1 order by s1;

select s1,hex(s1),length(s1) from t1 order by s1 collate latin1_danish_ci; 

select COLLATION_NAME from information_schema.columns where table_name = 't1';

show variables like 'coll%';