Bug #39416 mysql CLI cannot handle multi-byte character correctly
Submitted: 12 Sep 2008 5:21
Modified: 12 Sep 2008 7:29
Reporter: Mikiya Okuno
Status: Not a Bug
Category: MySQL Server: Command-line Clients
Severity: S3 (Non-critical)
Version: Any
OS: Any
Assigned to:
CPU Architecture: Any

[12 Sep 2008 5:21] Mikiya Okuno
Description:
It seems that the "mysql" command-line client cannot handle multi-byte characters correctly. Even the "SET NAMES" command has no effect.

How to repeat:
mysql> CREATE TABLE t1 (a varchar(100)) CHARACTER SET utf8;

mysql> SET NAMES latin1;
mysql> INSERT INTO t1 VALUES('担保');
Query OK, 1 row affected (0.00 sec)

This inserts a new row without any error, but the characters may be garbled.

mysql> SET NAMES utf8;
mysql> INSERT INTO t1 VALUES('担保');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+---------+------+------------------------------------------------------------+
| Level   | Code | Message                                                    |
+---------+------+------------------------------------------------------------+
| Warning | 1366 | Incorrect string value: '\xCA\xDD' for column 'a' at row 1 | 
+---------+------+------------------------------------------------------------+
1 row in set (0.00 sec)

With utf8, a warning is raised: MySQL Server complains that the string value is not acceptable for the column. You'll find that the row inserted under latin1 is garbled, and the row inserted under utf8 is only partially stored.

mysql> SELECT * FROM t1;
+----------+
| a        |
+----------+
| 〓〓卒〓〓〓〓 | 
| 担       | 
+----------+
3 rows in set (0.00 sec)
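
The bytes in the warning above are consistent with a terminal that sends EUC-JP (MySQL charset "ujis"). The report does not state the terminal encoding, so treat that as an assumption. Under that assumption, the following Python sketch reproduces both the rejected bytes '\xCA\xDD' and the truncated row that displays as 担:

```python
# Assumption (not stated in the report): the terminal encodes input as EUC-JP.
raw = "担保".encode("euc_jp")          # the four bytes the client sends

# Under SET NAMES latin1, the server treats every byte as one latin1 character,
# so four unrelated Western European characters get stored.
as_latin1 = raw.decode("latin-1")

# Under SET NAMES utf8, the first two bytes happen to form a valid UTF-8 sequence...
first = raw[:2].decode("utf-8")

# ...but the last two bytes are not valid UTF-8, which matches warning 1366.
try:
    raw[2:].decode("utf-8")
    rest_is_valid_utf8 = True
except UnicodeDecodeError:
    rest_is_valid_utf8 = False

# On SELECT, the stored character is sent back as UTF-8 bytes, and an
# EUC-JP terminal renders those two bytes as a single kanji.
shown = first.encode("utf-8").decode("euc_jp")

print(raw, as_latin1, first, rest_is_valid_utf8, shown)
```

If the assumption holds, the session needs the character set that matches the terminal (for EUC-JP, "SET NAMES ujis") rather than latin1 or utf8.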

However, I can insert the identical value using Query Browser. In Query Browser, the two existing rows and a correctly inserted row look like this:

ô
ôÊÝ
担保
[12 Sep 2008 7:29] Susanne Ebrecht
Mikiya,

this is not a bug.

You have to tell the system which encoding your environment is using.

Therefore, you have to set the variables character_set_client, character_set_results, and character_set_connection to the encoding of your environment. You can do this with the SET NAMES command. If MySQL has no character set matching your environment's encoding, you have to change the environment to an encoding that MySQL does support and set the variables to that encoding.
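
As a rough illustration of that rule, the sketch below turns an environment encoding name into the matching SET NAMES statement. The helper name set_names_statement and the small lookup table are hypothetical and deliberately incomplete; only the MySQL character set names themselves (ujis, sjis, cp932, utf8) are real.

```python
import codecs

# Hypothetical, incomplete table: normalized Python codec name -> MySQL charset name.
MYSQL_CHARSETS = {
    "euc_jp": "ujis",
    "shift_jis": "sjis",
    "cp932": "cp932",
    "utf-8": "utf8",
}

def set_names_statement(environment_encoding: str) -> str:
    """Build a SET NAMES statement for the given environment encoding."""
    # codecs.lookup() normalizes aliases such as "EUC-JP" or "utf8".
    normalized = codecs.lookup(environment_encoding).name
    try:
        return "SET NAMES " + MYSQL_CHARSETS[normalized]
    except KeyError:
        # No entry in this sketch's table: extend the table, or switch the
        # environment to an encoding that has a matching character set.
        raise ValueError("no mapping in this sketch for " + normalized)

print(set_names_statement("EUC-JP"))
```

The resulting statement would be issued at the start of the session, before any INSERT or SELECT that carries non-ASCII text.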

Generally, MySQL only supports 3-byte UTF-8 at the moment. There is a worklog for implementing 4-byte UTF-8 in MySQL 6.0.
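
To make the 3-byte limit concrete, this small Python sketch checks how many UTF-8 bytes a character needs. The helper names are illustrative only; the point is that characters up to U+FFFF fit in three bytes, while characters beyond that need four.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))

def fits_three_byte_utf8(text: str) -> bool:
    """True if every character is at or below U+FFFF (at most 3 UTF-8 bytes)."""
    return all(ord(ch) <= 0xFFFF for ch in text)

bmp_char = "担"                     # U+62C5, used in this report
supplementary_char = "\U0002000B"   # a CJK Extension B character

print(utf8_length(bmp_char), utf8_length(supplementary_char))
print(fits_three_byte_utf8("担保"), fits_three_byte_utf8(supplementary_char))
```

The characters in this report need only three bytes each, so the 3-byte limit is not what causes the garbling here.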

Latin1 covers only Western European characters and does not support Asian characters at all.
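
A quick way to check this outside MySQL is to ask whether a string can be represented in a given encoding at all. The sketch below uses Python's own codecs as stand-ins (its "latin-1" codec is ISO 8859-1, which is close to but not identical to MySQL's latin1), and can_encode is a hypothetical helper name.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

for codec in ("latin-1", "euc_jp", "utf-8"):
    print(codec, can_encode("担保", codec))
```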