Bug #51976 | LDML collations issue (cyrillic example) | ||
---|---|---|---|
Submitted: | 12 Mar 2010 7:06 | Modified: | 18 Jun 2010 1:13 |
Reporter: | Alexandr Evstigneev | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S1 (Critical) |
Version: | 5.5.2-m2, 5.1.29-rc-log, 5.1, 5.5.99 bzr | OS: | FreeBSD (7.0. 64bit) |
Assigned to: | Alexander Barkov | CPU Architecture: | Any |
Tags: | collation, cyrillic, LDML |
[12 Mar 2010 7:06]
Alexandr Evstigneev
[12 Mar 2010 7:21]
Susanne Ebrecht
Many thanks for writing a bug report. I am not able to repeat this using MySQL 5.1.45. Are you sure the data are stored correctly in your database? Please provide the output from SELECT a, length(a), hex(a) FROM _example;
[12 Mar 2010 7:28]
Alexandr Evstigneev
5.5.2-m2
mysql> SELECT a, lcase(a), length(a), hex(a) FROM _example;
+----+----------+-----------+--------+
| a  | lcase(a) | length(a) | hex(a) |
+----+----------+-----------+--------+
| а  |          |         2 | D0B0   |
| б  |          |         2 | D0B1   |
| в  |          |         2 | D0B2   |
| г  |          |         2 | D0B3   |
| д  |          |         2 | D0B4   |
| е  |          |         2 | D0B5   |
| ё  |          |         2 | D191   |
| ж  |          |         2 | D0B6   |
| з  |          |         2 | D0B7   |
| и  |          |         2 | D0B8   |
| й  |          |         2 | D0B9   |
| к  |          |         2 | D0BA   |
| л  |          |         2 | D0BB   |
| м  |          |         2 | D0BC   |
| н  |          |         2 | D0BD   |
| о  |          |         2 | D0BE   |
| п  |          |         2 | D0BF   |
| р  |          |         2 | D180   |
| с  |          |         2 | D181   |
| т  |          |         2 | D182   |
| у  |          |         2 | D183   |
| ф  |          |         2 | D184   |
| х  |          |         2 | D185   |
| ц  |          |         2 | D186   |
| ч  |          |         2 | D187   |
| ш  |          |         2 | D188   |
| щ  |          |         2 | D189   |
| ь  |          |         2 | D18C   |
| ы  |          |         2 | D18B   |
| ъ  |          |         2 | D18A   |
| э  |          |         2 | D18D   |
| ю  |          |         2 | D18E   |
| я  |          |         2 | D18F   |
| А  |          |         2 | D090   |
| Б  |          |         2 | D091   |
| В  |          |         2 | D092   |
| Г  |          |         2 | D093   |
| Д  |          |         2 | D094   |
| Е  |          |         2 | D095   |
| Ё  |          |         2 | D081   |
| Ж  |          |         2 | D096   |
| З  |          |         2 | D097   |
| И  |          |         2 | D098   |
| Й  |          |         2 | D099   |
| К  |          |         2 | D09A   |
| Л  |          |         2 | D09B   |
| М  |          |         2 | D09C   |
| Н  |          |         2 | D09D   |
| О  |          |         2 | D09E   |
| П  |          |         2 | D09F   |
| Р  |          |         2 | D0A0   |
| С  |          |         2 | D0A1   |
| Т  |          |         2 | D0A2   |
| У  |          |         2 | D0A3   |
| Ф  |          |         2 | D0A4   |
| Х  |          |         2 | D0A5   |
| Ц  |          |         2 | D0A6   |
| Ч  |          |         2 | D0A7   |
| Ш  |          |         2 | D0A8   |
| Щ  |          |         2 | D0A9   |
| Ь  |          |         2 | D0AC   |
| Ы  |          |         2 | D0AB   |
| Ъ  |          |         2 | D0AA   |
| Э  |          |         2 | D0AD   |
| Ю  |          |         2 | D0AE   |
| Я  |          |         2 | D0AF   |
+----+----------+-----------+--------+
66 rows in set (0.00 sec)
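The hex(a) values in this output can be cross-checked outside MySQL. The following is a minimal Python sketch, not part of the original report: the helper names `decode_hex` and `describe` are illustrative, and it uses Python's built-in Unicode case mapping as the reference, which is an assumption and not necessarily identical to the case mapping of any particular MySQL collation. It decodes a sample of the reported values as UTF-8 and computes the bytes a working lowercase conversion would be expected to produce.

```python
# Sample of values copied from the hex(a) column (not all 66 rows).
SAMPLE_HEX = ["D0B0", "D191", "D18F", "D090", "D081", "D0AF"]


def decode_hex(hex_value: str) -> str:
    """Decode a HEX()-style string as UTF-8 text."""
    return bytes.fromhex(hex_value).decode("utf-8")


def describe(hex_value: str) -> tuple[str, int, str]:
    """Return (character, byte length, hex of the lowercased character).

    Lowercasing uses Python's Unicode rules (an assumption standing in
    for what a correct lcase(a) would return).
    """
    raw = bytes.fromhex(hex_value)
    ch = raw.decode("utf-8")
    lower_hex = ch.lower().encode("utf-8").hex().upper()
    return ch, len(raw), lower_hex


if __name__ == "__main__":
    for value in SAMPLE_HEX:
        ch, byte_len, lower_hex = describe(value)
        print(f"{value}: char={ch} length={byte_len} expected lcase hex={lower_hex}")
```

Each sampled value decodes to a single two-byte Cyrillic letter, which is consistent with the data being stored correctly and the empty lcase(a) column being a server-side problem.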
[12 Mar 2010 7:59]
Alexandr Evstigneev
5.1.29 crashes with the query SELECT a, lcase(a), length(a), hex(a) FROM _example; All tables are marked as damaged. Here is the output of the query without lcase(a):
mysql> SELECT a, length(a), hex(a) FROM _example;
+------+-----------+--------+
| a    | length(a) | hex(a) |
+------+-----------+--------+
| а    |         2 | D0B0   |
| б    |         2 | D0B1   |
| в    |         2 | D0B2   |
| г    |         2 | D0B3   |
| д    |         2 | D0B4   |
| е    |         2 | D0B5   |
| ё    |         2 | D191   |
| ж    |         2 | D0B6   |
| з    |         2 | D0B7   |
| и    |         2 | D0B8   |
| й    |         2 | D0B9   |
| к    |         2 | D0BA   |
| л    |         2 | D0BB   |
| м    |         2 | D0BC   |
| н    |         2 | D0BD   |
| о    |         2 | D0BE   |
| п    |         2 | D0BF   |
| р    |         2 | D180   |
| с    |         2 | D181   |
| т    |         2 | D182   |
| у    |         2 | D183   |
| ф    |         2 | D184   |
| х    |         2 | D185   |
| ц    |         2 | D186   |
| ч    |         2 | D187   |
| ш    |         2 | D188   |
| щ    |         2 | D189   |
| ь    |         2 | D18C   |
| ы    |         2 | D18B   |
| ъ    |         2 | D18A   |
| э    |         2 | D18D   |
| ю    |         2 | D18E   |
| я    |         2 | D18F   |
| А    |         2 | D090   |
| Б    |         2 | D091   |
| В    |         2 | D092   |
| Г    |         2 | D093   |
| Д    |         2 | D094   |
| Е    |         2 | D095   |
| Ё    |         2 | D081   |
| Ж    |         2 | D096   |
| З    |         2 | D097   |
| И    |         2 | D098   |
| Й    |         2 | D099   |
| К    |         2 | D09A   |
| Л    |         2 | D09B   |
| М    |         2 | D09C   |
| Н    |         2 | D09D   |
| О    |         2 | D09E   |
| П    |         2 | D09F   |
| Р    |         2 | D0A0   |
| С    |         2 | D0A1   |
| Т    |         2 | D0A2   |
| У    |         2 | D0A3   |
| Ф    |         2 | D0A4   |
| Х    |         2 | D0A5   |
| Ц    |         2 | D0A6   |
| Ч    |         2 | D0A7   |
| Ш    |         2 | D0A8   |
| Щ    |         2 | D0A9   |
| Ь    |         2 | D0AC   |
| Ы    |         2 | D0AB   |
| Ъ    |         2 | D0AA   |
| Э    |         2 | D0AD   |
| Ю    |         2 | D0AE   |
| Я    |         2 | D0AF   |
+------+-----------+--------+
66 rows in set (0.08 sec)
[12 Mar 2010 8:44]
Susanne Ebrecht
Did you also change mysys/charset-def.c and config/ac-macros/character_sets.m4? Did you re-compile the code after adding the collation? Full instructions on how to add a new charset and/or a new collation can be found here: http://dev.mysql.com/doc/refman/5.1/en/adding-character-set.html
[12 Mar 2010 8:51]
Alexandr Evstigneev
No, I didn't, because I'm not adding a charset, only a collation. And the manual section about collations plainly says: "UCA collations for Unicode character sets can be added to MySQL without recompiling by using a subset of the Locale Data Markup Language (LDML),"
[12 Mar 2010 9:39]
Susanne Ebrecht
Please provide the xml file.
[12 Mar 2010 9:46]
Alexandr Evstigneev
full charsets/Index.xml with utf8_russian_ci collation added
Attachment: Index.xml (text/xml), 18.55 KiB.
[12 Mar 2010 9:46]
Alexandr Evstigneev
Uploaded to files.
[14 Mar 2010 9:19]
Sveta Smirnova
Thank you for the feedback. The crash is only repeatable with old 5.1 versions; current 5.1 returns a set of empty strings, as does the 5.5 series. Wrong results verified as described.
[14 Mar 2010 9:56]
Alexandr Evstigneev
5.1.44 on WinXP has the same problem: the result is empty.
[15 Mar 2010 7:10]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/103172
3400 Alexander Barkov 2010-03-15
Bug #51976 LDML collations issue
Problem: caseup_multiply and casedn_multiply members were not initialized for a dynamic collation, so UPPER() and LOWER() functions returned empty strings.
Fix: initializing the members properly.
[22 Mar 2010 12:36]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/103975
3410 Alexander Barkov 2010-03-22
Bug #51976 LDML collations issue
Problem: caseup_multiply and casedn_multiply members were not initialized for a dynamic collation, so UPPER() and LOWER() functions returned empty strings.
Fix: initializing the members properly.
Adding tests: mysql-test/r/ctype_ldml.result mysql-test/t/ctype_ldml.test
Applying the fix: mysys/charset.c
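To illustrate why uninitialized multiplier members could plausibly lead to empty strings, here is a simplified Python model. It is a sketch under stated assumptions, not MySQL's code: only the member names caseup_multiply and casedn_multiply come from the patch description, while the `CharsetInfo` class, the helper functions, the rule that the output buffer is sized as input length times the multiplier, and the value 1 used for initialization are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CharsetInfo:
    """Hypothetical stand-in for a collation descriptor."""
    name: str
    caseup_multiply: int = 0  # 0 models "never initialized"
    casedn_multiply: int = 0  # 0 models "never initialized"


def lower_with_buffer(cs: CharsetInfo, text: str) -> str:
    """Lowercase text into a buffer of len(input) * casedn_multiply bytes.

    The sizing rule is an assumption of this sketch. With a multiplier
    of 0 the buffer has no room, so nothing is written.
    """
    capacity = len(text.encode("utf-8")) * cs.casedn_multiply
    out = bytearray()
    for ch in text:
        piece = ch.lower().encode("utf-8")
        if len(out) + len(piece) > capacity:
            break  # no room left in the output buffer
        out += piece
    return out.decode("utf-8")


def init_dynamic_collation(cs: CharsetInfo) -> CharsetInfo:
    """Model of the fix: give both multipliers a usable value (1 is assumed)."""
    cs.caseup_multiply = 1
    cs.casedn_multiply = 1
    return cs


if __name__ == "__main__":
    broken = CharsetInfo("utf8_russian_ci")
    fixed = init_dynamic_collation(CharsetInfo("utf8_russian_ci"))
    print(repr(lower_with_buffer(broken, "АБВ")))
    print(repr(lower_with_buffer(fixed, "АБВ")))
```

In this model the uninitialized descriptor yields an empty result for any input, which matches the symptom in the report, while the initialized one returns the lowercased text.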
[22 Mar 2010 13:17]
Alexander Barkov
Pushed into mysql-5.1-bugteam (5.1.46)
Pushed into mysql-pe (6.0.14-alpha)
[26 Mar 2010 8:21]
Bugs System
Pushed into 5.5.4-m3 (revid:alik@sun.com-20100326080914-2pz8ns984e0spu03) (version source revid:alexey.kopytov@sun.com-20100322132851-8j3m42x4ldi1kca5) (merge vers: 5.5.3-m2) (pib:16)
[26 Mar 2010 8:25]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100326081116-m3v4l34yhr43mtsv) (version source revid:alik@sun.com-20100325072612-4sds00ix8ajo1e84) (pib:16)
[26 Mar 2010 8:30]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100326081944-qja07qklw1p2w7jb) (version source revid:alik@sun.com-20100325073410-4t4i9gu2u1pge7xb) (merge vers: 6.0.14-alpha) (pib:16)
[6 Apr 2010 7:58]
Bugs System
Pushed into 5.1.46 (revid:sergey.glukhov@sun.com-20100405111026-7kz1p8qlzglqgfmu) (version source revid:bar@mysql.com-20100322122759-97i1u39pndttjde2) (merge vers: 5.1.46) (pib:16)
[12 Apr 2010 22:12]
Paul DuBois
Noted in 5.1.46, 5.5.5, 6.0.14 changelogs. For LDML-defined collations, some data structures were not initialized properly to enable UPPER() and LOWER() to work correctly.
[17 Jun 2010 11:56]
Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:35]
Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:martin.skold@mysql.com-20100609211156-tsac5qhw951miwtt) (merge vers: 5.1.46-ndb-6.2.19) (pib:16)
[17 Jun 2010 13:22]
Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)