Bug #9039 characterEncoding no longer works for blob
Submitted: 8 Mar 2005 5:11
Modified: 17 Mar 2005 14:15
Reporter: Wai Wong
Status: Won't fix
Category: Connector / J
Severity: S2 (Serious)
Version: 3.1.7
CPU Architecture: Any

[8 Mar 2005 5:11] Wai Wong
Description:
With a MySQL 4.0.x server, setting characterEncoding=big5 correctly retrieves Big5 data stored in a BLOB column.  With 4.1.x, this no longer works.  This is a regression.

We cannot change the column type from BLOB to "text character set big5" because the data is also read from IIS via ASP, and MyODBC does not yet support character sets.

We are using both JSP and ASP to extract the data.  A workaround, if any, would also be welcome.

Wai Wong.

How to repeat:
Store Big5 data in a TEXT column on 4.1.x using any JSP page.  Read it back with JSP.  The data is OK.
Change the column to a binary type.
Set the characterEncoding parameter.
Retrieve the data with JSP again.  The data is corrupted.

Suggested fix:
For binary data, the characterEncoding parameter should still take effect.
[14 Mar 2005 17:15] Mark Matthews
You can't really have a character encoding on a BLOB in MySQL-4.1, and there's not much the JDBC driver can do about features that don't exist in the server.

A workaround would be to use ResultSet.getBytes(), and convert that to a string with new String(bytes, "Big5").
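
A minimal sketch of that workaround, for illustration only. The class name BlobText and its two helper methods are hypothetical and not part of Connector/J; the column name and charset name are whatever the application uses. It assumes the JVM ships the named charset (Big5 is present in typical full JDK builds), and it bypasses the driver's characterEncoding handling entirely by decoding on the client.

```java
import java.nio.charset.Charset;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper, not part of the driver: decodes BLOB bytes client-side.
final class BlobText {

    // Decode raw column bytes with an explicitly named charset.
    // A SQL NULL (null byte array) is passed through as null.
    static String decode(byte[] raw, String charsetName) {
        if (raw == null) {
            return null;
        }
        // Charset.forName throws if the JVM does not support the charset.
        return new String(raw, Charset.forName(charsetName));
    }

    // Fetch the column as bytes via ResultSet.getBytes() and decode it,
    // instead of calling getString() on a binary column.
    static String readText(ResultSet rs, String column, String charsetName)
            throws SQLException {
        return decode(rs.getBytes(column), charsetName);
    }
}
```

Inside a result loop this would be called as BlobText.readText(rs, "body", "Big5"), where "body" stands in for the actual BLOB column name.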
[17 Mar 2005 14:15] Mark Matthews
Why are you storing text data in a BLOB? 

This can't be fixed in the driver because 1) it's not JDBC compliant, and 2) it's not how everybody else expects things to work: binary data has no character set and shouldn't be treated as if it does.

Is there a reason you're not storing this text data in TEXT fields or VARCHAR fields?
[8 Apr 2005 11:28] Vladimir Stavrinov
Developers use BLOB because its size limit is much larger than that of TEXT.
We have the same problem with cp1251. With 3.1.6 it returns
question marks, and with 3.1.7 it returns something unreadable.
If this is a binary object, why do you assign it the ASCII charset?
That is exactly the source of the problem. Here is a patch for 3.1.6 that works
for us. As for 3.1.7, it needs some more fixing.

diff -ur mysql-connector-java-3.1.6-0/com/mysql/jdbc/CharsetMapping.java mysql-connector-java-3.1.6/com/mysql/jdbc/CharsetMapping.java
--- mysql-connector-java-3.1.6-0/com/mysql/jdbc/CharsetMapping.java     2004-12-23 22:37:40.000000000 +0300
+++ mysql-connector-java-3.1.6/com/mysql/jdbc/CharsetMapping.java       2005-04-07 15:37:34.936532302 +0400
@@ -101,7 +101,7 @@
         tempMap.put("utf8", "UTF-8");
         tempMap.put("ucs2", "UnicodeBig");
 
-        tempMap.put("binary", "US-ASCII"); // closest match
+        // tempMap.put("binary", "US-ASCII"); // closest match
 
         MYSQL_TO_JAVA_CHARSET_MAP = Collections.unmodifiableMap(tempMap);
         
diff -ur mysql-connector-java-3.1.6-0/com/mysql/jdbc/Messages.java mysql-connector-java-3.1.6/com/mysql/jdbc/Messages.java
--- mysql-connector-java-3.1.6-0/com/mysql/jdbc/Messages.java   2004-12-23 22:37:38.000000000 +0300
+++ mysql-connector-java-3.1.6/com/mysql/jdbc/Messages.java     2005-04-07 15:35:56.061732775 +0400
@@ -44,8 +44,7 @@
        static {
                try {
                        RESOURCE_BUNDLE = ResourceBundle.getBundle(BUNDLE_NAME, 
-                               Locale.getDefault(), 
-                               Messages.class.getClassLoader());
+                               Locale.getDefault()); 
                } catch (Throwable t) {
                        throw new RuntimeException("Can't load resource bundle due to underlying exception " + t.toString() + "\n\nStack Tace:\n\n" + Util.stackTraceToString(t));