Bug #61139 | Introduce a method to handle tables storing data with incorrect encoding | ||
---|---|---|---|
Submitted: | 12 May 2011 0:02 | Modified: | 12 May 2011 1:19 |
Reporter: | Daniel Webster | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | Connector / J | Severity: | S3 (Non-critical) |
Version: | 5.1.15 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
Tags: | character_encoding |
[12 May 2011 0:02]
Daniel Webster
[12 May 2011 0:40]
Mark Matthews
This feature already exists connection-wide; see the property "useOldUTF8Behavior" in the docs at http://dev.mysql.com/doc/refman/5.5/en/connector-j-reference-configuration-properties.html. Doing this at any other scope is really tough, as the driver never fully parses *any* SQL, so it doesn't really know what database, table, or column is "in scope".
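As a rough sketch of what setting this property connection-wide might look like, the snippet below builds the properties a JDBC connection would be opened with. The property name "useOldUTF8Behavior" comes from the comment above; the class name, helper name, URL, user, and password are hypothetical placeholders, and the actual connect call is left as a comment because it needs Connector/J on the classpath and a reachable server.

```java
import java.util.Properties;

class OldUtf8Config {
    // Hypothetical placeholder URL; host, port, and schema are assumptions.
    static final String URL = "jdbc:mysql://localhost:3306/legacydb";

    // Hypothetical helper: collects the connection properties, including
    // the connection-wide switch named in the comment above.
    static Properties buildProps(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Applies to the whole connection, not to a single table or column.
        props.setProperty("useOldUTF8Behavior", "true");
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildProps("app", "secret");
        System.out.println(props.getProperty("useOldUTF8Behavior"));
        // With the driver available, the connection would be opened with:
        // java.sql.Connection c = java.sql.DriverManager.getConnection(URL, props);
    }
}
```

Because the setting lives on the connection, an application that needs it for only some tables would presumably have to keep a separate connection (or pool) for those tables.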
[12 May 2011 1:17]
Daniel Webster
useOldUTF8Behavior: "Use the UTF-8 behavior the driver did when communicating with 4.0 and older servers." That is what is in the documentation. Does this mean "override character encoding found in the database"?
[12 May 2011 1:19]
Daniel Webster
Also, the statement "the driver doesn't actually ever fully parse *any* SQL" makes me think you are talking about the encoding of values in the SQL statement, which would affect the values in SQL WHERE clauses. I was mostly referring to the character encoding of values returned in a result set: that is, when I call rs.getString(1), what character encoding is used to turn the bytes into Strings?
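To make the question concrete, here is a minimal, driver-independent sketch of how the same stored bytes become different Strings depending on the charset used to decode them. It does not show Connector/J internals; the class and method names are hypothetical, and ISO-8859-1 is used as a stand-in for MySQL's "latin1" (an assumption, since the driver's actual mapping may differ).

```java
import java.nio.charset.StandardCharsets;

class DecodeSketch {
    // Decode raw column bytes as UTF-8.
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Decode the same bytes as ISO-8859-1 (stand-in for "latin1").
    static String decodeAsLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the UTF-8 encoding of U+00E9 (e with acute accent).
        byte[] raw = {(byte) 0xC3, (byte) 0xA9};
        // One character when the bytes are treated as UTF-8.
        System.out.println(decodeAsUtf8(raw).length());
        // Two characters (U+00C3, U+00A9) when treated as ISO-8859-1.
        System.out.println(decodeAsLatin1(raw).length());
    }
}
```

If a column is declared latin1 but actually holds UTF-8 bytes, a decoder that trusts the declaration would take the second path and return the two-character form.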
[12 May 2011 13:28]
Mark Matthews
Daniel, the issue as I see it is that a solution that scopes by database and table name needs to be *complete*, i.e. for both reading and writing. In any case, the "old" way does exactly what you state: the driver tells the database that it's speaking "latin1", but it is really sending and reading UTF-8.
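The point about completeness can be illustrated with a small simulation that involves no driver at all: if the write path and the read path do not agree on the charset, the value does not survive the round trip. The class and method names are hypothetical, and ISO-8859-1 again stands in for "latin1" as an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingScopeSketch {
    // Simulated write path: the bytes that would be sent for a value.
    static byte[] write(String value, Charset charset) {
        return value.getBytes(charset);
    }

    // Simulated read path: the String built from stored bytes.
    static String read(byte[] stored, Charset charset) {
        return new String(stored, charset);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";

        // Both directions use UTF-8, as in the "old" behavior: round trip works.
        String consistent = read(write(original, StandardCharsets.UTF_8),
                                 StandardCharsets.UTF_8);

        // Only the read side is overridden: written as latin1, read as UTF-8.
        String mismatched = read(write(original, StandardCharsets.ISO_8859_1),
                                 StandardCharsets.UTF_8);

        System.out.println(consistent.equals(original));
        System.out.println(mismatched.equals(original));
    }
}
```

In the mismatched case the lone byte 0xE9 is not valid UTF-8, so Java substitutes the replacement character U+FFFD, which is why an override that covers only result sets would not be enough on its own.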