Bug #28855 UTF-8 charsets not sorted correctly
Submitted: 3 Jun 2007 4:20
Modified: 4 Jul 2007 5:28
Reporter: Rodulfo Araujo
Status: No Feedback
Category: MySQL Server: Charsets
Severity: S3 (Non-critical)
Version: 5.0.27-standard-log
OS: Any
Assigned to:
CPU Architecture: Any

[3 Jun 2007 4:20] Rodulfo Araujo
Description:
The correct sort order for these countries (French names) is:
Afghanistan
Åland, Îles
Bahreïn
Égypte

However, in a UTF-8 table they are stored as:
Afghanistan
Ã…land, ÃŽles
Bahreïn
Égypte

After sorting, the result is as follows (it varies slightly from one UTF-8 collation to another, but the idea is the same; this is with utf8_spanish_ci):
Ã…land, ÃŽles
Égypte
Afghanistan
Bahreïn

Server version: 5.0.27-standard-log
Protocol version: 10
Server: Localhost via UNIX socket
MySQL charset: UTF-8 Unicode (utf8)

MySQL client version: 4.1.10
Used PHP extensions: mysql

How to repeat:
CREATE TABLE country
(country varchar(100) NOT NULL,
UNIQUE KEY (country)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_spanish_ci;
INSERT INTO country (country) VALUES ('Afghanistan');
INSERT INTO country (country) VALUES ('Ã…land, ÃŽles');
INSERT INTO country (country) VALUES ('Bahreïn');
INSERT INTO country (country) VALUES ('Égypte');
SELECT country FROM country ORDER BY country;
[4 Jun 2007 5:28] Valeriy Kravchuk
Thank you for the problem report. Please try to reproduce it with a newer version, 5.0.41, and let us know the results. If the problem persists, please send the output of:

show variables like 'coll%';

from the same environment.
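
The stored values in the description (each accented character appearing as two characters) are consistent with text that was double-encoded on the way in, for example when the client connection character set is not utf8. That cause was never confirmed in this report, so the following is only a minimal sketch of checks that might help narrow it down, assuming the `country` table from the "How to repeat" section. The `repaired` alias is a name introduced here for illustration.

```sql
-- Character sets in effect for the client, connection, results and server.
-- A latin1 connection writing UTF-8 bytes into a utf8 column is a common
-- source of double-encoded text.
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'coll%';

-- Inspect the stored bytes. A correctly stored 'Å' is C385 in utf8;
-- a double-encoded one appears as C383E280A6 ('Ã' followed by '…').
SELECT country, HEX(country), CHAR_LENGTH(country)
FROM country
ORDER BY country;

-- Read-only check: reinterpret the stored characters as latin1 bytes and
-- decode those bytes as utf8. If the data is double-encoded, the "repaired"
-- column should show the intended names. Nothing is modified.
SELECT country,
       CONVERT(CAST(CONVERT(country USING latin1) AS BINARY) USING utf8) AS repaired
FROM country
ORDER BY repaired;
```

If the `repaired` column shows the expected names, the sort order is probably correct for the bytes actually stored, and the problem lies in how the data was inserted rather than in the collation.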
[4 Jul 2007 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".