| Bug #61483 | connector/J not correctly handling Korean characters | ||
|---|---|---|---|
| Submitted: | 10 Jun 2011 16:59 | Modified: | 10 Jun 2011 18:01 |
| Reporter: | Peter Turk | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | Connector / J | Severity: | S2 (Serious) |
| Version: | MySQL 5.5.13, Connector/J 5.1.16 | OS: | Any (Windows XP, Mac OS X) |
| Assigned to: | | CPU Architecture: | Any |
| Tags: | connector, utf8 | ||
[10 Jun 2011 17:25]
Mark Matthews
Connector/J only sends characters in UTF-8 if either the characterEncoding property in your connection string has been set to "UTF-8", or if character_set_server on MySQLd is set to "UTF-8", which essentially triggers the "SET NAMES ..." call. You are using "SET NAMES" in the mysql client, and workbench sets the connection to UTF-8 by default. Are either of the above conditions true in your testcase? If not, does setting them as described fix this issue?
[10 Jun 2011 18:01]
Peter Turk
MySQL staff sent me this message: "Connector/J only sends characters in UTF-8 if either the characterEncoding property in your connection string has been set to "UTF-8", or if character_set_server on MySQLd is set to "UTF-8", which essentially triggers the "SET NAMES ..." call."

I added "characterEncoding=utf8" to my connection string, and the problem disappeared. Thanks.
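The fix reported in this comment can be sketched as a small helper that appends the characterEncoding property to a JDBC URL. This is only an illustration: the class and method names are hypothetical, the host and port are assumed, and only the database name utf8 and the characterEncoding property come from this report.

```java
// Hypothetical helper, not part of Connector/J: it only builds the URL string.
final class Utf8ConnectionSketch {

    // Appends characterEncoding=UTF-8 to a JDBC URL, using '?' for the first
    // property and '&' when the URL already carries other properties.
    static String withUtf8(String baseUrl) {
        String separator = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + separator + "characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        // Host and port are assumptions; "utf8" is the database from the report.
        String url = withUtf8("jdbc:mysql://localhost:3306/utf8");
        System.out.println(url);

        // With the Connector/J jar on the classpath, the URL would be used as:
        //   java.sql.Connection conn =
        //       java.sql.DriverManager.getConnection(url, user, password);
    }
}
```

With the property set in the URL, the driver negotiates the connection character set itself, so a separate SET NAMES statement issued by the application should no longer be needed.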

Description:
Korean characters inserted into a MySQL database using Connector/J turn into question marks. Korean characters retrieved from a MySQL database using Connector/J also turn into question marks.

How to repeat:
Create a database named utf8 with "character set=utf8". In MySQL Workbench, execute this:

use utf8;
create table utftest (id integer, name_last varchar(20));
insert into utftest (id,name_last) values(6,'서비스');
select * from utftest;

{ For convenience, the characters in the string above are
서 c11c
비 be44
스 c2a4
but the problem occurs with all Korean characters. These are just a random sample. }

The select statement, in either MySQL Workbench or the mysql command line, shows the correct string.

Now repeat the select through Connector/J 5.1.16 and the Korean characters have turned into question marks:

select * from utftest
6, ???

The attempts to debug this have been performed on Mac OS X 10.6.7, MySQL 5.5.13, Connector/J 5.1.16, but we first encountered the problem on a Windows XP machine and have observed it on several older combinations of MySQL and Connector/J.

When connecting with Connector/J, I execute SET NAMES 'utf8' to ensure that everything is in utf8. To confirm this, SHOW VARIABLES LIKE 'character_set%' produces this:

character_set_client=utf8
character_set_connection=utf8
character_set_database=utf8
character_set_filesystem=binary
character_set_results=utf8
character_set_server=latin1
character_set_system=utf8

Suggested fix:
Fully support Korean characters through Connector/J. What is MySQL Workbench doing that my program cannot do through MySQL Connector?
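One plausible way to see where the question marks come from is to encode the sample string in a single-byte charset in plain Java. The sketch below uses ISO-8859-1 as a stand-in for the latin1 server default shown above; the actual charset Connector/J selects internally is an assumption here, and the class and method names are hypothetical. The string is written with Unicode escapes for the three code points listed in the report (c11c, be44, c2a4).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows what happens to Korean text when it is
// encoded with a charset that cannot represent it.
final class QuestionMarkDemo {

    // Encodes the text to bytes and decodes it again with the same charset.
    // String.getBytes substitutes the charset's replacement byte ('?' for
    // ISO-8859-1) for every character it cannot map.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String sample = "\uC11C\uBE44\uC2A4"; // the three characters from the report

        // Single-byte charset: each Korean character is lost.
        System.out.println(roundTrip(sample, StandardCharsets.ISO_8859_1));

        // UTF-8: the text survives unchanged.
        System.out.println(roundTrip(sample, StandardCharsets.UTF_8).equals(sample));
    }
}
```

The substitution happens on encoding, so once the client has sent the bytes the original characters cannot be recovered, which matches the symptom of question marks on both insert and select.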