| Bug #4988 | Japanese text gets truncated when inserted into utf8 column | ||
|---|---|---|---|
| Submitted: | 11 Aug 2004 10:03 | Modified: | 25 Aug 2004 6:14 |
| Reporter: | Joseph Waller | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server | Severity: | S2 (Serious) |
| Version: | 4.1.3b-beta | OS: | Windows (Windows XP SP1) |
| Assigned to: | Alexander Barkov | CPU Architecture: | Any |
[11 Aug 2004 21:00]
Shuichi Tamagawa
I have the same problem on both Win XP SP1 and SuSE Linux 9.0.
[23 Aug 2004 2:50]
Joseph Waller
Any luck finding a work-around?
[24 Aug 2004 11:04]
Alexander Barkov
What client program do you use?
Can you please run this query:
SELECT HEX('the same string with Japanese characters you're trying to insert');
Thank you!
[25 Aug 2004 3:03]
Joseph Waller
The test case I provided was done using the mysql.exe binary that was packaged with the 4.1.3b-beta server distribution (Windows). I did, however, get the exact same results when using Connector/ODBC (3.51.07). Here are the results from the queries you requested:
mysql> SELECT HEX('日本語');
+---------------+
| HEX('日本語') |
+---------------+
| 93FA967B8CEA |
+---------------+
1 row in set (0.00 sec)
mysql> insert into w (a,b) values ('English', '日本語');
Query OK, 1 row affected, 1 warning (0.00 sec)
mysql> show warnings;
+---------+------+----------------------------------------+
| Level | Code | Message |
+---------+------+----------------------------------------+
| Warning | 1265 | Data truncated for column 'b' at row 1 |
+---------+------+----------------------------------------+
1 row in set (0.00 sec)
mysql> select * from w;
+---------+------+
| a | b |
+---------+------+
| English | |
+---------+------+
1 row in set (0.00 sec)
It seems to behave this way for all Japanese inserts, not just the string I attempted. Here is another example.
mysql> SELECT HEX('あいうえお');
+----------------------+
| HEX('あいうえお') |
+----------------------+
| 82A082A282A482A682A8 |
+----------------------+
1 row in set (0.00 sec)
mysql> insert into w (a,b) values ('abcde', 'aiueoあいうえお');
Query OK, 1 row affected, 1 warning (0.00 sec)
mysql> show warnings;
+---------+------+----------------------------------------+
| Level | Code | Message |
+---------+------+----------------------------------------+
| Warning | 1265 | Data truncated for column 'b' at row 1 |
+---------+------+----------------------------------------+
1 row in set (0.00 sec)
mysql> select * from w;
+--------------------+-------+
| a | b |
+--------------------+-------+
| English | |
| abcde | aiueo |
+--------------------+-------+
3 rows in set (0.00 sec)
[25 Aug 2004 6:14]
Alexander Barkov
Sorry, but the bug system is not the appropriate forum for asking support questions. Your problem is not the result of a bug. For a list of more appropriate places to ask for help using MySQL products, please visit http://www.mysql.com/support/ Thank you for your interest in MySQL. Additional info: This is not a bug. You use SJIS character set in the client programs, not UTF8. But the server doesn't know about it. To make this work you should run either "SET CHARACTER SET sjis;" or "SET NAMES sjis;" in the beginning of the sessions. They are a bit different, please refer to our manual to choose the best for you. But I think the first one should be better in your case.
[25 Aug 2004 12:59]
Joseph Waller
My apologies.

Description: Japanese texts are trucated and removed from character strings being inserted into a UTF8 column. This did not occur in previous developement with version 4.1.1a-alpha. How to repeat: Here is copy of the text from the client window i used to reproduce this error. mysql> create table w -> ( -> a char(50), -> b char(50) -> ); Query OK, 0 rows affected (0.11 sec) mysql> status; -------------- mysql Ver 14.5 Distrib 4.1.3b-beta, for Win95/Win98 (i32) Connection id: 2 Current database: jhotdata Current user: root@localhost SSL: Not in use Using delimiter: ; Server version: 4.1.3b-beta Protocol version: 10 Connection: localhost via TCP/IP Client characterset: utf8_general_ci Server characterset: utf8_general_ci TCP port: 3306 Uptime: 10 min 17 sec Threads: 2 Questions: 77 Slow queries: 0 Opens: 9 Flush tables: 1 Open tables: 2 Queries per second avg: 0.125 -------------- mysql> show variables like 'character%'; +--------------------------+--------------------------+ | Variable_name | Value | +--------------------------+--------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | utf8 | | character_set_results | utf8 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | C:\mysql\share\charsets/ | +--------------------------+--------------------------+ 7 rows in set (0.00 sec) mysql> insert into w (a,b) values ('English', '日本語'); Query OK, 1 row affected, 1 warning (0.00 sec) mysql> show warnings; +---------+------+----------------------------------------+ | Level | Code | Message | +---------+------+----------------------------------------+ | Warning | 1265 | Data truncated for column 'b' at row 1 | +---------+------+----------------------------------------+ 1 row in set (0.00 sec) mysql> select * from w; +---------+------+ | a | b | +---------+------+ | English | | +---------+------+ 1 row in set (0.00 sec) mysql> show full columns from w; +-------+----------+-----------------+------+-----+---------+-------+---------------------------------+---------+ | Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment | +-------+----------+-----------------+------+-----+---------+-------+---------------------------------+---------+ | a | char(50) | utf8_general_ci | YES | | NULL | | select,insert,update,references | | | b | char(50) | utf8_general_ci | YES | | NULL | | select,insert,update,references | | +-------+----------+-----------------+------+-----+---------+-------+---------------------------------+---------+ 2 rows in set (0.00 sec)