| Bug #41895 | utf8_unicode_ci inconsistent sorting | ||
|---|---|---|---|
| Submitted: | 6 Jan 2009 13:58 | Modified: | 14 Jan 2009 9:07 |
| Reporter: | Hrvoje Novosel | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server | Severity: | S3 (Non-critical) |
| Version: | 5.1.30 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | sorting, utf8_unicode_ci | ||
[14 Jan 2009 9:07]
Sveta Smirnova
Thank you for the report. According to http://www.collation-charts.org/mysql60/mysql604.utf8_unicode_ci.european.html the letters ('c'),('C'),('č'),('Č'),('ć'),('Ć') are equal in utf8_unicode_ci. Same for ('e'), ('E') and ('t'), ('T') This means order correct in both cases. So I close the report as "Not a Bug"

Description: When sorting a varchar field with utf8_unicode_ci collation the results are different when filesort is used and when index is used. How to repeat: DROP TABLE IF EXISTS abcd; CREATE TABLE IF NOT EXISTS abcd ( letter varchar(1) COLLATE utf8_unicode_ci NOT NULL ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; INSERT INTO abcd (letter) VALUES ('c'), ('C'), ('č'), ('Č'), ('ć'), ('Ć'), ('d'), ('D'), ('đ'), ('Đ'), ('e'), ('E'), ('s'), ('S'), ('š'), ('Š'), ('t'), ('T'), ('z'), ('Z'), ('ž'), ('Ž'); executing: select * from abcd order by letter; will give: +--------+ | letter | +--------+ | c | | Ć | | ć | | Č | | č | | C | | d | | D | | đ | | Đ | | E | | e | | s | | S | | š | | Š | | T | | t | | z | | Z | | ž | | Ž | +--------+ now add an index: ALTER TABLE abcd ADD INDEX ( letter ); and select again with: select * from abcd order by letter; will give: +--------+ | letter | +--------+ | c | | C | | č | | Č | | ć | | Ć | | d | | D | | đ | | Đ | | e | | E | | s | | S | | š | | Š | | t | | T | | z | | Z | | ž | | Ž | +--------+ (Notice first 6 letters on both)