MySQL Bugs: #41895: utf8_unicode_ci inconsistent sorting

Bug #41895	utf8_unicode_ci inconsistent sorting
Submitted:	6 Jan 2009 13:58	Modified:	14 Jan 2009 9:07
Reporter:	Hrvoje Novosel	Email Updates:
Status:	Not a Bug	Impact on me:	None
Category:	MySQL Server	Severity:	S3 (Non-critical)
Version:	5.1.30	OS:	Any
Assigned to:		CPU Architecture:	Any
Tags:	sorting, utf8_unicode_ci

Description:
When sorting a varchar field with utf8_unicode_ci collation the results are different when filesort is used and when index is used.

How to repeat:
DROP TABLE IF EXISTS abcd;
CREATE TABLE IF NOT EXISTS abcd (
  letter varchar(1) COLLATE utf8_unicode_ci NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

INSERT INTO abcd (letter) VALUES
('c'),
('C'),
('č'),
('Č'),
('ć'),
('Ć'),
('d'),
('D'),
('đ'),
('Đ'),
('e'),
('E'),
('s'),
('S'),
('š'),
('Š'),
('t'),
('T'),
('z'),
('Z'),
('ž'),
('Ž');

executing: select * from abcd order by letter;
will give:
+--------+
| letter |
+--------+
| c      |
| Ć      |
| ć      |
| Č      |
| č      |
| C      |
| d      |
| D      |
| đ      |
| Đ      |
| E      |
| e      |
| s      |
| S      |
| š      |
| Š      |
| T      |
| t      |
| z      |
| Z      |
| ž      |
| Ž      |
+--------+

now add an index:  ALTER TABLE abcd ADD INDEX ( letter );
and select again with: select * from abcd order by letter;
will give:
+--------+
| letter |
+--------+
| c      |
| C      |
| č      |
| Č      |
| ć      |
| Ć      |
| d      |
| D      |
| đ      |
| Đ      |
| e      |
| E      |
| s      |
| S      |
| š      |
| Š      |
| t      |
| T      |
| z      |
| Z      |
| ž      |
| Ž      |
+--------+

(Notice first 6 letters on both)

Thank you for the report.

According to http://www.collation-charts.org/mysql60/mysql604.utf8_unicode_ci.european.html the letters ('c'),('C'),('č'),('Č'),('ć'),('Ć') are equal in utf8_unicode_ci. Same for ('e'), ('E') and ('t'), ('T') This means order correct in both cases. So I close the report as "Not a Bug"