Bug #29853 | Cannot use character encoding with connection parameters | | |
---|---|---|---|
Submitted: | 17 Jul 2007 18:56 | Modified: | 31 Aug 2007 14:19 |
Reporter: | Nathan Sharp | | |
Status: | Closed | | |
Category: | Connector / J | Severity: | S2 (Serious) |
Version: | 5.0.6 | OS: | Any (Tested on Ubuntu Feisty and Windows XP) |
Assigned to: | | CPU Architecture: | Any |
Tags: | charset, encoding, unicode, username, user | | |
[17 Jul 2007 18:56]
Nathan Sharp
[17 Jul 2007 20:33]
Mark Matthews
The question I have is what should be the correct behavior? At the point where we can set the character set for authentication, we don't know what the server is using. We could either use the "encoding" parameter passed in the JDBC URL (if there is one), or we could use UTF-8, except that some characters outside the BMP aren't handled by MySQL's implementation of UTF-8. I don't have a preference for either one, but UTF-8 seems more seamless except for the corner cases when the characters end up as 4-byte sequences.
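To make the corner case above concrete, here is a minimal Java sketch of how many bytes UTF-8 needs per code point. The class and method names (`Utf8Width`, `utf8Length`, `hasSupplementary`) are hypothetical and not part of Connector/J; the point is only that code points outside the BMP encode to 4-byte sequences, which is the case described above as not handled by MySQL's UTF-8 implementation.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Width {
    // Number of bytes the UTF-8 encoding of a single code point occupies.
    static int utf8Length(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the string contains any code point outside the BMP,
    // i.e. one that UTF-8 encodes as a 4-byte sequence.
    static boolean hasSupplementary(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        System.out.println(utf8Length('a'));      // ASCII letter: 1
        System.out.println(utf8Length(0x00E9));   // U+00E9 (e with acute): 2
        System.out.println(utf8Length(0x4E2D));   // U+4E2D (CJK ideograph): 3
        System.out.println(utf8Length(0x1D11E));  // U+1D11E (outside the BMP): 4
    }
}
```

A check like `hasSupplementary(userName)` could, under these assumptions, flag the user names that fall into the 4-byte corner case before a connection attempt.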
[17 Jul 2007 21:18]
Nathan Sharp
Preferably, I'd like to see it do whatever the command line tool and MySQL Query Browser do. Blanket use of UTF-8 will likely solve it, though, and is certainly easier :-) I'm not really familiar enough with UTF-8 and the languages we are using to know whether your concern is a problem for me.
[17 Jul 2007 23:12]
MySQL Verification Team
See bug: http://bugs.mysql.com/bug.php?id=29576 regarding the same issue with C API.
[31 Jul 2007 5:36]
Tonci Grgin
Nathan, Mark, I will set this report to "Verified", as we are aware of this problem, which can be partially fixed in Connector/J. The problem remains with characters outside the BMP and with 7-bit encodings. I think this should be documented properly too...
[31 Jul 2007 12:17]
Nathan Sharp
Thank you Tonci!
[29 Aug 2007 19:20]
Mark Matthews
This is fixed in the 5.1 source repository and will be part of 5.1.3.
[31 Aug 2007 14:19]
MC Brown
A note has been added to the 5.1.3 changelog: Connector/J now connects using an initial character set of utf-8 solely for the purpose of authentication to allow user names or database names in any character set to be used in the JDBC connection URL.
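As a rough illustration of why the character set used during authentication matters, the following Java sketch encodes one user name in several charsets and compares the resulting bytes. The user name and the names `AuthEncodingSketch`, `encodeUser`, and `roundTrips` are hypothetical; this does not reproduce Connector/J's handshake code, only the general effect that the same name yields different bytes depending on the encoding chosen.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AuthEncodingSketch {
    // Bytes that would go on the wire for this user name in the given charset.
    static byte[] encodeUser(String user, Charset cs) {
        return user.getBytes(cs);
    }

    // True if the charset can represent the name without losing characters.
    static boolean roundTrips(String user, Charset cs) {
        return new String(user.getBytes(cs), cs).equals(user);
    }

    public static void main(String[] args) {
        String user = "j\u00FCrgen"; // hypothetical user name with a non-ASCII letter

        byte[] latin1 = encodeUser(user, StandardCharsets.ISO_8859_1);
        byte[] utf8 = encodeUser(user, StandardCharsets.UTF_8);

        // Same name, different byte sequences: a byte-wise comparison would not match.
        System.out.println(Arrays.toString(latin1));
        System.out.println(Arrays.toString(utf8));
        System.out.println(Arrays.equals(latin1, utf8));

        // A 7-bit charset cannot represent the name at all.
        System.out.println(roundTrips(user, StandardCharsets.US_ASCII));
        System.out.println(roundTrips(user, StandardCharsets.UTF_8));
    }
}
```

The 7-bit case mirrors the limitation noted earlier in the thread: a name containing characters the charset cannot express does not survive encoding, whichever side performs it.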