Bug #70372 | UTF-8 Character causes mysql to lose data | ||
---|---|---|---|
Submitted: | 17 Sep 2013 19:31 | Modified: | 18 Sep 2013 15:14 |
Reporter: | Chad Thomas | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S1 (Critical) |
Version: | 5.5.31-0+wheezy1 | OS: | Linux (Debian) |
Assigned to: | | CPU Architecture: | Any |
[17 Sep 2013 19:31]
Chad Thomas
[17 Sep 2013 19:43]
Chad Thomas
Test SQL file
Attachment: utf8-test-database_2013-09-17.sql (application/octet-stream, text), 1.24 KiB.
[17 Sep 2013 19:45]
Chad Thomas
It looks like the bug tracker is affected by this bug as well. I pasted in steps to reproduce earlier in the ticket, and my report got cut off after it hit a Unicode character.
[17 Sep 2013 19:45]
Chad Thomas
I tested on: 5.5.31-0+wheezy1
[17 Sep 2013 21:22]
Peter Laursen
@Chad .. a tip: attach your SQL test case as a plain text file! HTML formatting sometimes corrupts it (if there is a conflict with HTML/XML characters that need to be encoded, or if it uses characters that define an HTML tag (> or <))! -- Peter (not a MySQL/Oracle person)
[17 Sep 2013 21:45]
MySQL Verification Team
Please try utf8mb4 instead of utf8: http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html . Thanks.
[17 Sep 2013 22:07]
Chad Thomas
Updated test case
Attachment: utf8-test-database_2013-09-17.sql (application/octet-stream, text), 1.24 KiB.
[17 Sep 2013 22:09]
Chad Thomas
utf8mb4 does in fact work. Should the default functionality of utf8 be to drop all characters after a utf8 character is found?
[17 Sep 2013 22:09]
Chad Thomas
Or rather, a utf8 character with 3 or more bytes.
[18 Sep 2013 15:14]
MySQL Verification Team
Thank you for the feedback. As mentioned in the manual, utf8 uses a maximum of three bytes per character and contains only BMP characters. If you need characters beyond that, use the utf8mb4 character set. Thanks.
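
To make the three-byte limit concrete, here is a minimal Python sketch (not part of the original report) that prints how many bytes a few characters take in UTF-8 and checks whether a string contains anything outside the BMP. The helper names `utf8_len`, `needs_utf8mb4`, and `simulate_utf8_store` are hypothetical, and the last one only models the truncation described in this thread; it is an assumption about the observed behavior, not MySQL's actual code path. With a strict SQL mode the server may reject the statement instead of truncating.

```python
def utf8_len(ch: str) -> int:
    """Return the number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(text: str) -> bool:
    """True if any character lies outside the BMP (code point above U+FFFF).

    Such characters need four bytes in UTF-8, which MySQL's 3-byte
    utf8 character set cannot store.
    """
    return any(ord(ch) > 0xFFFF for ch in text)


def simulate_utf8_store(text: str) -> str:
    """Hypothetical model of the truncation reported in this thread:
    keep only the text before the first 4-byte character."""
    for i, ch in enumerate(text):
        if ord(ch) > 0xFFFF:
            return text[:i]
    return text


# Sample text: 1-, 2-, and 3-byte characters, then a 4-byte emoji.
sample = "caf\u00e9 \u20ac \U0001F600 tail"

for ch in ("e", "\u00e9", "\u20ac", "\U0001F600"):
    print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s)")

print(needs_utf8mb4(sample))
print(repr(simulate_utf8_store(sample)))
```

Running a check like `needs_utf8mb4` over incoming data before an insert is one way to notice that a column needs utf8mb4 before any text is lost.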