Bug #879 PreparedStatement.setString() incorrectly processes GBK string
Submitted: 20 Jul 2003 17:57 Modified: 5 Mar 2004 14:25
Reporter: [ name withheld ]
Status: Closed
Category: Connector / J    Severity: S1 (Critical)
Version: 3.0.8-stable    OS: Linux (linux)
Assigned to: Mark Matthews    CPU Architecture: Any

[20 Jul 2003 17:57] [ name withheld ]
I'm running mysqld 4.0.12 on a Linux box. The default database charset is GBK. When I use PreparedStatement.setString() to insert a string containing a double quote or a single quote, the server responds with a syntax error. I traced into PreparedStatement.java and found what may be a bug in StringUtils.getBytes() when it processes GBK bytes. PreparedStatement.setString() escapes the string x before calling StringUtils.getBytes(), but if the string is being sent as SJIS, BIG5, or GBK, StringUtils.getBytes() escapes the already-escaped string a second time. For example, pstmt.setString(1, "It's a test") is escaped to "It\'s a test" before the call to StringUtils.getBytes() in PreparedStatement.setString(), and comes back from getBytes() as "It\\'s a test".

Charsets other than GBK/SJIS/BIG5 should not be affected by this error. I've tested UTF8 against a mysqld 4.1, and everything seems OK.
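
To make the double escaping concrete, here is a minimal sketch of the two passes as described in this report. It is not the Connector/J source: escapeQuotes() and doubleBackslashBytes() are hypothetical helpers that only mimic the reported behavior, and US-ASCII is used so the example does not depend on a GBK charset being installed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class DoubleEscapeSketch {

    // Hypothetical stand-in for the first pass (the escaping done in setString()):
    // put a backslash in front of quotes and backslashes.
    public static String escapeQuotes(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '\'' || c == '"' || c == '\\') {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    // Hypothetical stand-in for the second pass as reported:
    // every 0x5c byte is written twice, whether or not it was already an escape.
    public static byte[] doubleBackslashBytes(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : in) {
            out.write(b);
            if (b == 0x5c) {
                out.write(0x5c);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String once = escapeQuotes("It's a test");
        String twice = new String(
                doubleBackslashBytes(once.getBytes(StandardCharsets.US_ASCII)),
                StandardCharsets.US_ASCII);
        System.out.println(once);   // prints: It\'s a test
        System.out.println(twice);  // prints: It\\'s a test
    }
}
```

In the second result the server reads \\ as one literal backslash, so the quote that follows is no longer escaped and ends the string literal early, which is consistent with the syntax error reported above.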

How to repeat:
Create a database (we use test in this case) with GBK charset, and then create a table:
create table test(col varchar(200));

Write a Java program, testjdbc.java:
import java.sql.*;

public class testjdbc {
        public static void main(String[] args)
                throws Exception {
                try {
                        Connection con = DriverManager.getConnection("jdbc:mysql://localhost:3306/test?user=jdbc&password=jdbc");
                        PreparedStatement pstmt = con.prepareStatement("insert into test(col) values(?)");
                        // The single quote in this value is what triggers the problem.
                        pstmt.setString(1, "It's a test.");
                        // Send the insert to the server; the syntax error comes back here.
                        pstmt.executeUpdate();
                } catch (SQLException e) {
                        // Print the SQLSTATE and the vendor error code, then rethrow.
                        System.out.println(e.getSQLState() + ":" + e.getErrorCode());
                        throw e;
                }
        }
}

Compile and run it, and you'll get the error.

Suggested fix:
I commented out the code that writes the additional 0x5c in the method escapeSJISByteStream() (around lines 180 to 200) in StringUtils.java. I don't know whether this is the right way to solve the problem; I only know that it solves the problems I have already run into. I wonder whether this modification could cause other problems.
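
On the question of side effects: in SJIS, BIG5, and GBK the second byte of a double-byte character can itself be 0x5c, which is presumably why the driver adds an extra 0x5c at all. The sketch below shows one way to escape only those trail bytes while leaving single-byte backslashes alone. It is an illustration under assumptions, not the fix that went into the driver: escapeTrailBackslash() is a hypothetical helper, and the lead-byte range 0x81 to 0xFE is a simplification based on GBK.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class TrailByteEscapeSketch {

    // Hypothetical helper: doubles 0x5c only when it is the trail byte of a
    // double-byte character. Single-byte 0x5c (already an escape) is untouched.
    public static byte[] escapeTrailBackslash(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length + 8);
        int i = 0;
        while (i < in.length) {
            int b = in[i] & 0xff;
            // Assumption: lead bytes are 0x81..0xFE, as in GBK.
            boolean isLead = b >= 0x81 && b <= 0xfe;
            if (isLead && i + 1 < in.length) {
                int trail = in[i + 1] & 0xff;
                out.write(b);
                out.write(trail);
                if (trail == 0x5c) {
                    out.write(0x5c); // protect the trail byte from being read as an escape
                }
                i += 2;
            } else {
                out.write(b);
                i += 1;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // A double-byte character 0x81 0x5c, then 'a', then an escaped quote \'
        byte[] input = {(byte) 0x81, 0x5c, 'a', 0x5c, '\''};
        System.out.println(Arrays.toString(escapeTrailBackslash(input)));
    }
}
```

With this input only the 0x5c that follows 0x81 is doubled; the backslash in front of the quote stays single. Removing the extra 0x5c entirely, as in the workaround above, would leave such trail bytes unprotected.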
[29 Jul 2003 7:42] [ name withheld ]
I am having the same problem: escape characters are duplicated when using a PreparedStatement. My personal opinion is that it has something to do with line 180 of StringUtils.java. Consider a string sent with big5 encoding that contains only English characters. That line duplicates every escape character in the string, which I don't think is the preferred behavior. Below is my test case:
            Connection conn = DriverManager.getConnection("jdbc:mysql://localhost/testdb?useUnicode=true&characterEncoding=big5", "", "");
            //Statement stmt = conn.createStatement();
            PreparedStatement stmt = conn.prepareStatement("insert into TestTable (TestCol) values (?)");
            stmt.setString(1, "1234\\1234\\1234\n");

If I change characterEncoding to something else, such as gb2312, the result is correct.
[7 Aug 2003 15:08] Mark Matthews
This is fixed in the tree for 3.0.x and 3.1.x. If you can't wait for the release of 3.0.9 or 3.1.1, you can test a nightly snapshot of either after 00:00 GMT on August 9, 2003 from http://mmmysql.sourceforge.net/snapshots/

Thank you for your bug report, and for using MySQL!

[13 Jan 2004 13:59] Mark Matthews
I'm re-opening this. Fixing it completely correctly requires some heavy lifting that will have to wait until 3.0.11.
[5 Mar 2004 14:25] Mark Matthews
This should be fixed in the nightly builds of 3.0 and 3.1.