Bug #113531 | the convert function seems not work as expected | ||
---|---|---|---|
Submitted: | 31 Dec 2023 6:52 | Modified: | 3 Jan 2024 7:41 |
Reporter: | z yz | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S3 (Non-critical) |
Version: | 8.0.35 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
[31 Dec 2023 6:52]
z yz
[3 Jan 2024 6:44]
MySQL Verification Team
Hello z yz, thank you for the report and feedback. Regards, Umesh
[3 Jan 2024 7:41]
Roy Lyseng
Posted by developer: This is how the MySQL CONVERT function is implemented, so it is not a bug. The UTF8 string '中文', which is 2 characters and 6 bytes wide, is interpreted as a LATIN1 string of 6 characters. This is possible because every one-byte value is a valid LATIN1 character. Each of those characters is then converted to UTF8, which gives a string of 6 characters that is 14 bytes wide. The same thing happens whether the connection character set is UTF8 or LATIN1; however, due to display issues, the result only appears reasonable with SET NAMES LATIN1.
Note also that what you are trying to do is impossible: only the lower 256 code points of the Unicode repertoire can be converted to LATIN1, and these code points, which each occupy 3 bytes in UTF8, are outside that range. If you really want to store such strings as LATIN1, consider converting them to hexadecimal notation.
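The byte-level effect described above can be reproduced outside the server. The following is only an illustrative sketch in Python, not MySQL's implementation: it assumes that Python's `cp1252` codec is a reasonable stand-in for MySQL's latin1 character set, and the helper names (`reinterpret_as_latin1`, `to_hex`, `from_hex`) are hypothetical, introduced for this example only.

```python
# Illustrative sketch only; this is not how MySQL is implemented.
# Assumption: Python's "cp1252" codec approximates MySQL's latin1 charset.


def reinterpret_as_latin1(text: str) -> str:
    """Read the UTF-8 bytes of `text` as if each byte were one latin1 character."""
    return text.encode("utf-8").decode("cp1252")


def to_hex(text: str) -> str:
    """Hexadecimal notation of the UTF-8 bytes; the result is plain ASCII."""
    return text.encode("utf-8").hex()


def from_hex(hex_text: str) -> str:
    """Inverse of to_hex: recover the original string from its hex notation."""
    return bytes.fromhex(hex_text).decode("utf-8")


original = "中文"
utf8_bytes = original.encode("utf-8")

# Step 1: the 6 UTF-8 bytes are read as 6 single-byte characters.
misread = reinterpret_as_latin1(original)

# Step 2: each of those 6 characters is encoded as UTF-8 again.
reencoded = misread.encode("utf-8")

print(len(original), len(utf8_bytes))  # 2 characters, 6 bytes
print(len(misread), len(reencoded))    # 6 characters, 14 bytes

# A direct conversion is not possible: both code points are above U+00FF.
try:
    original.encode("latin-1")
except UnicodeEncodeError:
    print("cannot be represented in latin1")

# Workaround suggested in the comment: store hexadecimal notation instead.
hex_text = to_hex(original)
hex_text.encode("latin-1")             # succeeds, hex digits are ASCII
print(hex_text)                        # e4b8ade69687
print(from_hex(hex_text) == original)  # True
```

The hexadecimal form doubles the storage size (12 characters for 6 bytes), but it survives a latin1 column unchanged and can be decoded back to the original string.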