Bug #3114 | Unable to save utf8 data into MySql with JAVA Connector | ||
---|---|---|---|
Submitted: | 9 Mar 2004 6:18 | Modified: | 24 Nov 2004 20:28 |
Reporter: | nicolas nicolas | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | Connector / J | Severity: | S2 (Serious) |
Version: | 4.1.1a | OS: | Solaris (Solaris / Windows) |
Assigned to: | Mark Matthews | CPU Architecture: | Any |
[9 Mar 2004 6:18]
nicolas nicolas
[17 Mar 2004 7:54]
Mark Matthews
If you are going to store UTF-8 data in MySQL-4.1.1 or newer, your tables need to have a compatible collation as well (it appears that this was not checked in versions prior to 4.1.1). See http://www.mysql.com/doc/en/Charset-Unicode-sets.html and http://www.mysql.com/doc/en/Charset-examples.html for more information, where the last link explains that the database default collation (latin1_swedish_ci) will be used if you don't specify one for your table.
[25 Mar 2004 2:20]
nicolas nicolas
Thanks for your answer I tried to add COLLATION information to the table. CREATE TABLE Translate ( translationValue char(255), isMarked smallint(6) NOT NULL default '0', PRIMARY KEY (languageId,constrainId) ) DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci; When I tried a SELECT like select CONVERT(translationValue USING utf8) from translate select translationValue from translate I have always ??? instead of special characters ... I become out of more ideas . Thanks again. Help is welcome.
[24 Nov 2004 16:42]
Hontvari Jozsef Levente
I think this is indeed a bug in the Java connector, it affected me too. It cannot be easily fixed if different client apps access the database: one with useUnicode etc., the other is without that. The same connection string and database worked well in a previous mysql version now it doesn't - so this is at least an incompatibility. Now the question is which one is responsible: mysql server or java connector. I believe the change in the mysql server didn't necessarily cause any incompatibility. Previously the useUnicode=true&characterEncoding=UTF8 was a hack to store utf-8 in a latin-1 database, which was needed because mysql didn't support UTF-8. Mysql server never knew about this parameter! The driver did all the dirty work. It seems to me that the meaning of the very same parameters are changed, and now (instead of doing the same old hack) the driver tells the server that he wants to communicate in UTF-8! The two thing is completely different and unrelated! Of course it won't work. The connector should have used another parameter name for that functionality, and it should have left the old behaviour unchanged.
[24 Nov 2004 20:28]
Mark Matthews
The new behavior in the server doesn't let you have it both ways. Consider the 'old hack' deprecated. If you want to use the new server, you'll have to fix your application to not use a 'hack'.
[18 Feb 2005 17:07]
Javier Martinez
Hello all, I have the same problem as you (same version of Mysql and connector) Do you have any update on it? thanks in advance, Javier
[7 Dec 2006 4:11]
Adam SI
I have come to a solution for Chinese UTF-8, may it works for you :) The context is * ubuntu linux 6.10 * locale zh_cn.UTF-8 * mysql-server-5.0 * mysql connector/J 5.0.4 The essential part is adding "--character-set-server utf8 --collation-server=utf8_general_ci" to the startup command of mysqld_safe ( in my case it is in /etc/init.d/mysql ) and then it all works, no need to add characterEncoding to connection string, no need to "SET NAMES utf8", no need to do the "getBytes(xxx)" trick. But you still have to add CHARACTER SET and COLLATION clause to CREATE DATABASE and CREATE TABLE You can have a try and tell us if it works for your language. Adam
[16 Feb 2007 14:35]
Mike Salsbury
I've tried this until it worked. I got it to work on 5.0 on Windows XP, and 4.1.15 on Red Hat Enterprise 3. The 5.0 version was installed 'to support many languages'. I assume this makes utf8 the default character set and collation. The 4.1.15 has latin1 as the default character set and collation. This configuration worked on both. 1. The column in the database has to have a character set and collation for utf8. I.e. CHARACTER SET utf8 COLLATE utf8_general_ci 2. The request before storing to the database has to be handled as UTF-8. I.e. needs a request.setCharacterEncoding("UTF-8"); line before any submitted parameters are read. This was the vital missing piece of the jigsaw for me. 3. The response also needs to be handled as UTF-8. I.e. needs a response.setContentType("text/html; charset=UTF-8"); line in all files handling the response. Not totally sure if this is required as one file worked without it. 4. The character set and encoding need to be set on the database URL in the web.xml. E.g. <url>jdbc:mysql://localhost:3306/DATABASENAME?useUnicode=true&characterEncoding=UTF8</url>
[19 Jun 2007 6:00]
Peter L
Adam SI's comment worked for me: Context: CentOS 4.4 MySQL 4.1 MSql Connector 5.0.4 for Java Table and columns are all set to utf8 and utf8_unicode_ci. After inserting the command line parameters to myssqld_safe to the /etc/init.d/mysqld file's start() section, it is all peachy and I can store Japanese characters. The JSPs are set to SJIS, but JAva converts internally to Unicode and the driver now converts to utf8, all is well.