Bug #20247 | Incorrect sorting of Lithuanian national chars | ||
---|---|---|---|
Submitted: | 3 Jun 2006 18:17 | Modified: | 22 Sep 2006 8:06 |
Reporter: | Algirdas Brazas | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S4 (Feature request) |
Version: | 5.0.21 | OS: | Linux (Linux Slackware) |
Assigned to: | Domas Mituzas | CPU Architecture: | Any |
[3 Jun 2006 18:17]
Algirdas Brazas
[3 Jun 2006 18:20]
Algirdas Brazas
SQL table and data for the Lithuanian alphabet
Attachment: alphabet.sql (text/plain), 1.01 KiB.
[13 Jun 2006 10:31]
Domas Mituzas
According to VLKK (State Lithuanian Language Bureau), 'dictionary' ordering sorts the extended vowel forms together with their base vowel, distinguishing them only by a secondary weight. We will check whether there are conflicting standards for Lithuanian sorting and make the collation consistent with them.
[15 Aug 2006 16:38]
MySQL Verification Team
Also see bug: http://bugs.mysql.com/bug.php?id=21581
[22 Sep 2006 7:49]
Domas Mituzas
I have LST 1285:1993 in front of me, and it defines the following order:

AĄ aą B b C c Č č D d EĘĖ eęė F f G g H h IĮY iįy J j K k L l M m N n O o P p Q q R r S s Š š T t UŲŪ uųū V v W w X x Z z Ž ž

So accented vowels are always grouped and sorted together. Distinct words that differ only in accents should be treated as homographs, and the schema should be adjusted for that. In specialized systems, canonicalized forms can be used together with binary collations or data types.
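
To make the grouping concrete, here is a minimal Python sketch of a two-level sort key built from the order quoted above. It is an illustration only, not MySQL's implementation: the names `GROUPS`, `lt_sort_key` and `lt_equal_ci` are hypothetical, and the assumption that a `_ci` collation compares on the primary level alone is a simplification.

```python
# Letter groups in the order quoted from LST 1285:1993 in this comment.
# Letters in one group share a primary weight; the position inside the
# group is used as a secondary (tie-breaking) weight.
GROUPS = ["aą", "b", "c", "č", "d", "eęė", "f", "g", "h", "iįy",
          "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "š",
          "t", "uųū", "v", "w", "x", "z", "ž"]

PRIMARY = {ch: i for i, group in enumerate(GROUPS) for ch in group}
SECONDARY = {ch: j for group in GROUPS for j, ch in enumerate(group)}


def lt_sort_key(word):
    """Return (primary weights, secondary weights) for a word.

    Case is folded first; characters outside the table sort after
    every listed letter (an arbitrary choice for this sketch).
    """
    lowered = word.lower()
    primary = tuple(PRIMARY.get(ch, len(GROUPS)) for ch in lowered)
    secondary = tuple(SECONDARY.get(ch, 0) for ch in lowered)
    return (primary, secondary)


def lt_equal_ci(a, b):
    """Compare on primary weights only (assumed _ci-like behaviour)."""
    return lt_sort_key(a)[0] == lt_sort_key(b)[0]


words = ["bitė", "ąžuolas", "avis"]
print(sorted(words))                   # plain code-point order
print(sorted(words, key=lt_sort_key))  # grouped order from the standard
print(lt_equal_ci("kasti", "kąsti"))   # differ only in an accented vowel
```

Under this sketch, words that differ only in an accented vowel compare equal on the primary level, which is why a unique key under such a collation would treat them as duplicates; comparing the full `(primary, secondary)` key, or the raw bytes, keeps them distinct.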
[22 Sep 2006 8:06]
Domas Mituzas
Not a bug: utf8_lithuanian_ci strictly follows LST 1285:1993 and VLKK recommendations.