Description:
Ok, this method has caused several problems before and is fixed again and again. But I think it is still broken, at least for GBK charset. I have no idea why it is necessary to escape multibyte character bytes in strings, the manual says that escape should consider charset, and points to the function mysql_real_escape_string(). After browsing through the source code, it seems that the special treatement for multibyte character is to skip them instead of escape them, so the very idea of escapeEasternUnicodeByteStream confuses me.
The only explanation I can think of is to somehow work around a bug which makes it impossible to insert blobs into GBK database (This bug seems to be reported multiple times, but ignored), although I don't understand how escaping strings (instead of blobs) could solve it.
How to repeat:
Test case is modeled after Connector/J's unit test, the test is run against a windows installation of mysql 4.0.20a, with default-character-set set to gbk. Result is:
java.sql.SQLException: Syntax error or access violation, message from server: "You have an error in your SQL syntax. Check the manual that corresponds to your MySQL server version for the right syntax to use near ''?\')' at line 1"
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:1977)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1163)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1272)
at com.mysql.jdbc.Connection.execSQL(Connection.java:2236)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1741)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1588)
at test.MySQLConnectorJavaTest.doGBKEscapeBugTest(MySQLConnectorJavaTest.java:58)
at test.MySQLConnectorJavaTest.testGBKEscapeBug(MySQLConnectorJavaTest.java:26)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at junit.framework.TestCase.runTest(TestCase.java:154)
at junit.framework.TestCase.runBare(TestCase.java:127)
at junit.framework.TestResult$1.protect(TestResult.java:106)
at junit.framework.TestResult.runProtected(TestResult.java:124)
at junit.framework.TestResult.run(TestResult.java:109)
at junit.framework.TestCase.run(TestCase.java:118)
at junit.framework.TestSuite.runTest(TestSuite.java:208)
at junit.framework.TestSuite.run(TestSuite.java:203)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:392)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:276)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:167)
******************************************************************************
/*
* Created on 2004-6-4
*/
package test;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;
import junit.framework.TestCase;
/**
* @author RJX
*/
public class MySQLConnectorJavaTest
extends TestCase
{
public void testGBKEscapeBug()
throws Exception
{
//doGBKEscapeBugTest("\u4e2d\u6587\u5904\u7406"); // sanity check
doGBKEscapeBugTest("\u603d"); // this is a valid Chinese char
//doGBKEscapeBugTest("?"); // it looks like this if you can read Chinese
}
public void doGBKEscapeBugTest(String testString)
throws Exception
{
String dbUrlBase = "jdbc:mysql:///";
String dbName = "gbktest";
String tableName = "gbkEscapeTest";
Properties connProps = new Properties();
connProps.put("useUnicode", "true");
connProps.put("characterEncoding", "GBK");
connProps.put("user", "root");
Connection gbkConn = DriverManager.getConnection(dbUrlBase + "test", connProps);
Statement gbkStmt = gbkConn.createStatement();
ResultSet rs = null;
try
{
gbkStmt.executeUpdate("DROP DATABASE IF EXISTS " + dbName);
gbkStmt.executeUpdate("CREATE DATABASE " + dbName);
gbkStmt.close();
gbkConn.close();
gbkConn = DriverManager.getConnection(dbUrlBase + dbName, connProps);
gbkStmt = gbkConn.createStatement();
gbkStmt.executeUpdate("DROP TABLE IF EXISTS " + tableName);
gbkStmt.executeUpdate("CREATE TABLE " + tableName + " (field1 varchar(10))");
PreparedStatement sjisPStmt = gbkConn.prepareStatement("INSERT INTO " + tableName + " VALUES (?)");
sjisPStmt.setString(1, testString);
sjisPStmt.executeUpdate();
rs = gbkStmt.executeQuery("SELECT * FROM " + tableName);
rs.next();
String retrString = rs.getString(1);
assertEquals(testString, testString, retrString);
}
finally
{
gbkStmt.executeUpdate("DROP TABLE IF EXISTS " + tableName);
if(rs != null) rs.close();
gbkStmt.close();
gbkConn.close();
}
}
/* (non-Javadoc)
* @see junit.framework.TestCase#setUp()
*/
protected void setUp() throws Exception
{
Class.forName("com.mysql.jdbc.Driver").newInstance();
}
}
******************************************************************************
Some other food for thought:
1. Set default-character-set to latin1, while keeping the useUnicode/characterEncoding setting will make the test case pass.
2. Set default-character-set to latin1, and remove the useUnicode/characterEncoding setting will make the test case fail due to assertion.
Suggested fix:
Depends on what exactly this method is supposed to do. What happened is that when calling setString on Chinese char \u603d, which is 0x905C under GBK encoding, the 0x5C part got escaped into 0x5C5C. After the escaped result 0x905C5C is shipped to the server, the server's ctype-gbk recognizes that 0x905C is a GBK char and skips it, leaves the last 0x5C which pairs up with the ending single quote and messed up the sql. So just as stated in the begining, the cause of the problem seems to be that this method is against mysql server's rule on handling multibyte char escapes.