Bug #13421 | problem with sorting turkish | ||
---|---|---|---|
Submitted: | 23 Sep 2005 10:14 | Modified: | 9 Dec 2005 0:14 |
Reporter: | Peter Lemmens | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | mysql 4.1.14 | OS: | Windows (Windows 2000 Server) |
Assigned to: | Alexander Barkov | CPU Architecture: | Any |
[23 Sep 2005 10:14]
Peter Lemmens
[23 Sep 2005 10:25]
Vasily Kishkin
Thanks for the bug report. I was able to reproduce the bug. I attached screen copy of result.
[23 Sep 2005 10:25]
Vasily Kishkin
screen copy
Attachment: screencopy.JPG (image/jpeg, text), 56.23 KiB.
[28 Sep 2005 7:18]
Alexander Barkov
Peter, you're right, there is something wrong with Turkish collation for Latin5. We need your suggestion for what these queries should return: SELECT "a"= "â", "a" < "â"; SELECT "i"="î", "i" < "î"; SELECT "u" = "û", "u" < "û"; SELECT "û" = "ü", "û" < "ü"; Also, how should other accented letters (which are part of latin5 but not used by Tuskish alphabet) be sorted and compated? Should the be considered equal to base un-accented letter? For example: SELECT "a" = "ä", "a" < "ä"; SELECT "u" = "ù", "u" < "ù"; etc. Thanks! P.S. I believe utf8_turkish_ci works fine. We can fix latin5_turkish_ci to produce about the same sorting order. What do you think?
[30 Sep 2005 21:17]
Peter Lemmens
When looking at turkish dictionaries you will see the words sorted like this : hala, hâlâ, ham The a is considered kinda equal to â but the a still comes before the â when sorting the same word. So based on this the query SELECT "a" = "ä", "a" < "â" should return 1 1 SELECT "a"= "â", "a" < "â"; > 1 1 SELECT "i"="î", "i" < "î"; > 1 1 SELECT "u" = "û", "u" < "û"; > 1 1 SELECT "û" = "ü", "û" < "ü"; > 0 1 I tried the utf8_turkish_ci but i was unable to insert both "hala" and "hâlâ". They were considered duplicate. I, personally, would treat the other accented letters the same way as â etc.
[25 Oct 2005 7:32]
Alexander Barkov
The change has been commited: http://lists.mysql.com/internals/31429
[7 Dec 2005 14:55]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/11
[7 Dec 2005 15:46]
Alexander Barkov
Fixed in 4.1.17 and 5.0.17. A change log commend suggestion: Changed order of the letters "A WITH CIRCUMFLEX", "I WITH CIRCUMLEX" and "U WITH CIRCUMFLEX" in latin5_turkish_ci collation. Those who have used these letters must rebuild indexes.
[9 Dec 2005 0:14]
Paul DuBois
Noted in 4.1.17, 5.0.17 changelogs.