Bug #55980 | Character sets: supplementary character _bin ordering is wrong | ||
---|---|---|---|
Submitted: | 13 Aug 2010 23:05 | Modified: | 26 Nov 2010 19:19 |
Reporter: | Peter Gulutzan | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S3 (Non-critical) |
Version: | 5.5.6-m3, 5.6.0-m4 | OS: | Linux (SUSE 64-bit) |
Assigned to: | Alexander Barkov | CPU Architecture: | Any |
[13 Aug 2010 23:05]
Peter Gulutzan
[15 Aug 2010 11:01]
Sveta Smirnova
Thank you for the report. Verified as described.
[24 Aug 2010 6:34]
Alexander Barkov
The same problem is repeatable in 5.5.6-m3 (the original report was about 5.6.0-m4).
[31 Aug 2010 12:13]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/117225
3195 Alexander Barkov 2010-08-31
Bug#55980 Character sets: supplementary character _bin ordering is wrong
Problem:
- ORDER BY for utf8mb4_bin, utf16_bin and utf32_bin returned results in a wrong order, because old functions (supporting only BMP range) were used to handle these collations.
- Additionally, utf16_bin did not sort supplementary characters between U+D700 and U+E000, as WL#1213 specification specified.
mysql-test/include/ctype_filesort2.inc
  Adding a new shared test file
include/m_ctype.h
  Adding prototypes
mysql-test/r/ctype_utf16.result
mysql-test/r/ctype_utf32.result
mysql-test/r/ctype_utf8mb4.result
mysql-test/t/ctype_utf16.test
mysql-test/t/ctype_utf32.test
mysql-test/t/ctype_utf8mb4.test
  Adding tests
strings/ctype-ucs2.c
  - Fixing my_strncoll[sp]_utf16_bin to compare binary representation instead of code points, to make columns with indexes sort correct.
  - Fixing my_collation_handler_utf32_bin and my_collation_handler_utf16_bin to use new functions
strings/ctype-utf8.c
  - Adding my_strnxfrm[len]_unicode_fill_bin() to handle utf8mb4_bin, utf16_bin and utf32_bin, using 3 bytes per weight. This function also performs special reordering in case of utf16_bin.
  - Fixing my_collation_utf8mb4_bin handler to use the new function.
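For readers following the "3 bytes per weight" idea in the commit message above, here is a minimal sketch of how a fixed-width weight per code point makes a plain byte comparison agree with code point order. This is an illustration only, not the server's C implementation: the helper name `bin_sort_key` is hypothetical, and padding as well as the utf16_bin-specific reordering mentioned in the commit are deliberately left out.

```python
def bin_sort_key(s):
    """Build a sort key with one 3-byte big-endian weight per code point.

    Hypothetical helper for illustration. Three bytes are enough because
    the largest Unicode code point, U+10FFFF, fits in 21 bits.
    """
    return b"".join(ord(ch).to_bytes(3, "big") for ch in s)


# A BMP character near the top of the range and a supplementary character.
bmp_char = "\uff9d"             # U+FF9D
supplementary = "\U00010384"    # U+10384

# Comparing the keys byte by byte now matches code point order,
# regardless of how the character set encodes each character.
ordered = sorted([supplementary, bmp_char], key=bin_sort_key)
```

With these keys, `ordered` places U+FF9D before U+10384, because `00 FF 9D` compares lower than `01 03 84`.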
[31 Aug 2010 13:55]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/117256
3197 Alexander Nozdrin 2010-08-31
Bug#55980 Character sets: supplementary character _bin ordering is wrong
Problem:
- ORDER BY for utf8mb4_bin, utf16_bin and utf32_bin returned results in a wrong order, because old functions (supporting only BMP range) were used to handle these collations.
- Additionally, utf16_bin did not sort supplementary characters between U+D700 and U+E000, as WL#1213 specification specified.
@ include/m_ctype.h
  Adding prototypes.
@ mysql-test/include/ctype_filesort2.inc
  Adding a new shared test file.
@ mysql-test/t/ctype_utf8mb4.test
  Adding tests.
@ strings/ctype-ucs2.c
  - Fixing my_strncoll[sp]_utf16_bin to compare binary representation instead of code points, to make columns with indexes sort correct.
  - Fixing my_collation_handler_utf32_bin and my_collation_handler_utf16_bin to use new functions.
@ strings/ctype-utf8.c
  - Adding my_strnxfrm[len]_unicode_fill_bin() to handle utf8mb4_bin, utf16_bin and utf32_bin, using 3 bytes per weight. This function also performs special reordering in case of utf16_bin.
  - Fixing my_collation_utf8mb4_bin handler to use the new function.
[31 Aug 2010 14:22]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/117262
3086 Alexander Nozdrin 2010-08-31
Cherry-picking patch for Bug#55980.
Original changeset:
------------------------------------------------------------
revno: 3197
revision-id: alik@sun.com-20100831135426-h5a4s2w6ih1d8q2x
parent: magnus.blaudd@sun.com-20100830120632-u3xzy002mdwueli8
committer: Alexander Nozdrin <alik@sun.com>
branch nick: mysql-5.5-bugfixing
timestamp: Tue 2010-08-31 17:54:26 +0400
message:
  Bug#55980 Character sets: supplementary character _bin ordering is wrong
  Problem:
  - ORDER BY for utf8mb4_bin, utf16_bin and utf32_bin returned results in a wrong order, because old functions (supporting only BMP range) were used to handle these collations.
  - Additionally, utf16_bin did not sort supplementary characters between U+D700 and U+E000, as WL#1213 specification specified.
------------------------------------------------------------
[1 Sep 2010 6:51]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/117288
3250 Alexander Barkov 2010-09-01 [merge]
Merging Bug#55980 from mysql-5.5-bugfixing and applying "WL#3664 strnxfrm() changes for prefix keys and NOPAD" related changes.
[1 Sep 2010 7:07]
Alexander Barkov
See also: Bug#37244 Character sets: short utf8_bin weight_string value
[1 Sep 2010 7:50]
Alexander Barkov
Pushed into mysql-5.5-bugfixing [5.5.6-m3]
Pushed into mysql-trunk-bugfixing [5.6.1-m4]
Pushed into mysql-next-mr-bugfixing [5.6.99-m5]
[10 Sep 2010 18:52]
Bugs System
Pushed into mysql-5.5 5.5.7-rc (revid:joerg@mysql.com-20100910184813-csdto6tk4nlogrsq) (version source revid:davi.arnaut@oracle.com-20100831142822-2qhufn3hho4xqr4p) (merge vers: 5.5.7-m3) (pib:21)
[13 Sep 2010 10:05]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/118059
3194 Dmitry Lenev 2010-09-13 [merge]
Null-merge fix for bug#55980 from mysql-5.5.6-m3-release tree into mysql-trunk tree. Proper version of the fix for this tree will come from mysql-trunk-bugfixing.
[13 Sep 2010 13:50]
Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:dlenev@mysql.com-20100913103627-p2oqplu42x1gv2bd) (version source revid:dlenev@mysql.com-20100913100411-2qdg15bp0qu98ce5) (merge vers: 5.6.1-m4) (pib:21)
[13 Sep 2010 13:52]
Bugs System
Pushed into mysql-next-mr (revid:dlenev@mysql.com-20100913121556-sfxqlpj9kbc28kaf) (version source revid:davi.arnaut@oracle.com-20100831142822-2qhufn3hho4xqr4p) (pib:21)
[24 Sep 2010 19:35]
Paul DuBois
Noted in 5.5.7, 5.6.1 changelogs. The ordering for supplementary characters with the utf8mb4_bin, utf16_bin, and utf32_bin collations was incorrect.
[2 Oct 2010 18:13]
Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alexander.nozdrin@oracle.com-20101002180948-852x1cuv7c6i85ea) (version source revid:alexander.nozdrin@oracle.com-20101002180857-an32jpuwzemsp4f2) (merge vers: 5.6.1-m4) (pib:21)
[2 Oct 2010 18:14]
Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101002181053-6iotvl26uurcoryp) (version source revid:alexander.nozdrin@oracle.com-20101002180917-h0n62akupm3z20nt) (pib:21)
[2 Oct 2010 18:16]
Bugs System
Pushed into mysql-5.5 5.5.7-rc (revid:alexander.nozdrin@oracle.com-20101002180831-590ka2tuit9qoxbb) (version source revid:alexander.nozdrin@oracle.com-20101002180831-590ka2tuit9qoxbb) (merge vers: 5.5.7-rc) (pib:21)
[24 Nov 2010 15:45]
Alexander Barkov
The patch reverting the change about supplementary characters in utf16.
Attachment: b55980-revert.diff (text/x-patch), 2.03 KiB.
[24 Nov 2010 15:48]
Alexander Barkov
A patch reverting the change about the order of supplementary characters has been applied. See the patch in the "files" section. Let utf16_bin be "code point" order, according to this manual section: http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-sets.html
[26 Nov 2010 19:19]
Peter Gulutzan
It has been decided that the described behaviour is correct: with utf16_bin, ordering should be by code point, not byte by byte. So this is not a bug.
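To make the distinction in this decision concrete, the following sketch compares the two orderings for one BMP character and one supplementary character. It is an illustration written for this thread, not server code, and the two key functions are hypothetical names. It relies only on the fact that UTF-16 encodes supplementary characters as surrogate pairs whose first unit lies in the D800 to DBFF range, which sorts below BMP characters from U+E000 upward when compared byte by byte.

```python
def utf16_byte_key(s):
    """Sort key for byte-by-byte comparison of the UTF-16 (big-endian) encoding."""
    return s.encode("utf-16-be")


def code_point_key(s):
    """Sort key for code point comparison."""
    return [ord(ch) for ch in s]


bmp_char = "\uff9d"             # U+FF9D, encoded as FF 9D
supplementary = "\U00010384"    # U+10384, encoded as the pair D8 00 DF 84

values = [bmp_char, supplementary]
by_bytes = sorted(values, key=utf16_byte_key)
by_code_point = sorted(values, key=code_point_key)
```

Here `by_bytes` puts U+10384 first (its encoding starts with D8, below FF), while `by_code_point` puts U+FF9D first. The latter is the order utf16_bin is meant to produce according to the decision above.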
[3 Dec 2010 9:33]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/125906
3408 Alexander Barkov 2010-12-03
Bug#55980 Reverting the "utf16_bin is byte-by-byte" patch.
This reverting patch is actually already in mysql-5.5-security, which will be merged to mysql-trunk-* later this month. But I need this reverting patch now, as pre-requisite for WL#4616. Applying it to mysql-trunk-bugfixing manually, not to wait for merge.
[5 Dec 2010 12:39]
Bugs System
Pushed into mysql-trunk 5.6.1 (revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (version source revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (merge vers: 5.6.1) (pib:23)
[16 Dec 2010 21:47]
Bugs System
Pushed into mysql-trunk 5.6.1 (revid:alexander.nozdrin@oracle.com-20101216181820-7afubgk2fmuv9qsb) (version source revid:alexander.nozdrin@oracle.com-20101216173826-ze3y5h450sksotrh) (merge vers: 5.6.1) (pib:23)
[16 Dec 2010 22:28]
Bugs System
Pushed into mysql-5.5 5.5.9 (revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (version source revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (merge vers: 5.5.9) (pib:24)