Bug #29853 Cannot use character encoding with connection parameters
Submitted: 17 Jul 2007 20:56 Modified: 31 Aug 2007 16:19
Reporter: Nathan Sharp
Status: Closed
Category: Connector/J Severity: S2 (Serious)
Version: 5.0.6 OS: Any (Tested on Ubuntu Feisty and Windows XP)
Assigned to: Target Version:
Tags: charset encoding unicode username user

[17 Jul 2007 20:56] Nathan Sharp
Description:
If your username, password, or database name contains any extended characters, such as
Japanese characters, you will not be able to make the connection using the Java-based
connector.  I have tried this against:
Ubuntu Feisty and Windows XP
MySQL Connector/J 5.0.6 and 3.1.12
MySQL Server v5.0.41 and v5.0.19

How to repeat:
Create a database and add a user with Japanese characters as the username (e.g. grant all
on mydb.* to 'ユーザ名'@'localhost' identified by 'bogus').  You can correctly use this
account from the command line or from the MySQL Query Browser.  From a Java program, issue
the following commands:

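// User name containing Japanese characters, as created by the GRANT above
String a = "ユーザ名";
java.sql.Driver d = new com.mysql.jdbc.Driver();
java.sql.Connection c = java.sql.DriverManager.getConnection("jdbc:mysql://localhost/mydb", a, "bogus");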

You will receive:

java.sql.SQLException: Access denied for user 'ユã'@'localhost' (using
password: YES)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:946)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2934)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:870)
        at com.mysql.jdbc.MysqlIO.secureAuth411(MysqlIO.java:3333)
        at com.mysql.jdbc.MysqlIO.doHandshake(MysqlIO.java:1232)
        at com.mysql.jdbc.Connection.createNewIO(Connection.java:2749)
        at com.mysql.jdbc.Connection.<init>(Connection.java:1553)
        at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:285)
        at java.sql.DriverManager.getConnection(DriverManager.java:525)
        at java.sql.DriverManager.getConnection(DriverManager.java:171)
...
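
The garbled user name in the message above looks like UTF-8 bytes that were read back in a single-byte Western encoding. The following is a minimal sketch of that effect, not code from Connector/J: the class and method names are hypothetical, and the choice of windows-1252 is an assumption based on the characters shown in the message. The report does not say where the misreading happens.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode the text as UTF-8, then decode those bytes as if they were windows-1252.
    // (Assumption: windows-1252 is the single-byte encoding involved.)
    static String utf8ReadAsCp1252(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        // \u30E6 is the katakana "ユ", the first character of the user name in the report.
        String garbled = utf8ReadAsCp1252("\u30E6");
        // Print code points rather than glyphs so the result does not depend on the console encoding.
        garbled.chars().forEach(c -> System.out.printf("U+%04X%n", c));
    }
}
```

The three code points printed (U+00E3, U+0192, U+00A6) are the characters "ユ", which match the start of the user name in the error message.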

Suggested fix:
The problem is about the same as bug #18086, although I doubt the fix for that issue
will work here.  The code as it stands in 5.0.6 just falls back on the default encoding of
the Java platform to send the username, password, and database names to the server,
regardless of how the server is configured.  The fix for 18086 hard-codes that to Cp1252,
which won't help with Japanese characters.  The code needs to query the server for the
proper encoding to use in the same fashion that it does after the connection has been
established.
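
To illustrate why a hard-coded Cp1252 cannot carry a Japanese user name while UTF-8 can, here is a small standalone sketch. It uses only the standard java.nio.charset API and does not touch Connector/J; the class and method names are hypothetical, and "windows-1252" is assumed to be the charset meant by "Cp1252".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CredentialEncodingDemo {
    // True if every character of the text can be represented in the given charset.
    static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String user = "\u30E6\u30FC\u30B6\u540D"; // the user name from the report
        Charset cp1252 = Charset.forName("windows-1252");

        // The platform default varies by machine, which is why relying on it is fragile.
        System.out.println("platform default:  " + Charset.defaultCharset());
        System.out.println("Cp1252 can encode: " + canEncode(user, cp1252));
        System.out.println("UTF-8 can encode:  " + canEncode(user, StandardCharsets.UTF_8));

        // String.getBytes silently replaces unmappable characters with '?' (0x3F),
        // so the Cp1252 bytes no longer identify the user at all.
        System.out.println("Cp1252 bytes: " + user.getBytes(cp1252).length);
        System.out.println("UTF-8 bytes:  " + user.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

With Cp1252 each of the four characters collapses to a single '?' byte, whereas UTF-8 produces three bytes per character.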
[17 Jul 2007 22:33] Mark Matthews
The question I have is what should be the correct behavior? At the point where we can set
the character set for authentication, we don't know what the server is using. We could
either use the "encoding" parameter passed in the JDBC URL (if there is one), or we could
use UTF-8, except that some characters outside the BMP aren't handled by MySQL's
implementation of UTF-8.

I don't have a preference for either one, but UTF-8 seems more seamless except for the
corner cases when the characters end up as 4-byte sequences.
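
The 4-byte corner case mentioned above can be seen with plain Java, independent of MySQL. This is only a sketch: the class and method names are hypothetical, and U+20BB7 is simply one example of a code point outside the BMP.

```java
import java.nio.charset.StandardCharsets;

class Utf8LengthDemo {
    // Number of bytes UTF-8 needs for a single Unicode code point.
    static int utf8Length(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length(0x30E6));  // katakana, inside the BMP
        System.out.println(utf8Length(0x20BB7)); // CJK ideograph, outside the BMP
    }
}
```

The katakana character takes three bytes and the supplementary ideograph takes four, so only names containing characters of the second kind would hit the limitation described here.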
[17 Jul 2007 23:18] Nathan Sharp
Preferably, I'd like to see it do whatever the command line tool and MySQL Query
Browser do.  Blanket use of utf-8 will likely solve it, though, and certainly is easier :-)

I'm not really familiar enough with utf-8 and the languages we are using to know if your
concern is a problem or not for me.
[18 Jul 2007 1:12] Miguel Solorzano
See bug: http://bugs.mysql.com/bug.php?id=29576 regarding the same issue
with C API.
[31 Jul 2007 7:36] Tonci Grgin
Nathan, Mark, I will set this report to "Verified" as we are aware of this problem, which
can be partially fixed in Connector/J. The problem remains with characters outside the BMP
and with 7-bit encodings. I think this should be documented properly too...
[31 Jul 2007 14:17] Nathan Sharp
Thank you Tonci!
[29 Aug 2007 21:20] Mark Matthews
This is fixed in the 5.1 source repository; it will be part of 5.1.3.
[31 Aug 2007 16:19] MC Brown
A note has been added to the 5.1.3 changelog: 

Connector/J now connects using an initial character set of utf-8 solely for the purpose of
authentication to allow user names or database names in any character set to be used in
the JDBC connection URL.