Bug #5490 PreparedStatement.setBytes() and setBinaryStream() is escaping data improperly
Submitted: 9 Sep 2004 12:03 Modified: 13 Sep 2004 16:22
Reporter: Chris Chen Email Updates:
Status: Can't repeat Impact on me:
Category:Connector / J Severity:S2 (Serious)
Version:3.0.15 OS:Linux (SuSE Linux SLES 9)
Assigned to: CPU Architecture:Any

[9 Sep 2004 12:03] Chris Chen
Unfortunately, I am unable to create a test case for you at the moment.  Here is a quick synopsis.

MySQL Server 4.1.3-beta running on Linux
Java JSDK 1.4.2_04
MySQL Connector/J 3.0.14 and 3.0.15
Database Table is in SJIS
MySQL JDBC Driver setting is useUnicode=true&characterEncoding=sjis


I am trying to read in an image attachment from the email and store it into the database.  My data field in the table is MEDIUMBLOB.  I am doing very simple insertion of the image data into the database.

PreparedStatement ps = conn.prepareStatement(...);
ps.setBinaryStream(1, imageInputStream, length);
ps.setBytes(1, imageData);

My sample image file that I try to insert is 871 bytes.  To determine that reading the data is not the issue, I read the entire stream into a byte array and the length is exactly 871.  I insert the data and execute the statement.  The data gets saved into the database just fine.

Then, I retrieve the data from the database.  I use ps.getBytes(1).   Herein lies the problem. Somehow the data retrieved got bloated.  The data retrieved from the server is 877 bytes long.  I do not know how that extra 6 bytes got in there.  

I saved the new data out into a file and compared it with the original gif file.  Somehow some bytes were escaped into something different.  

I checked the source code for mysql connection/j, and it appears that data are being escaped before being sent out to the server.  My possible guess that I can think of is that it is improperly escaping some characters.

This is causing a huge headache.  I am not sure exactly what is causing it.

How to repeat:
Simply read some binary image file into the database, store it, then retrieve it and see if the binary data length and contents are still the same.

Perhaps even test it for data under 1k and data above 1k.

Suggested fix:
None so far.  I am unable to determine the issue that is causing this.
[11 Sep 2004 1:27] Eric Herman
MySQL Server 4.1.4-beta running on Linux
Java JSDK 1.4.2_05 
MySQL Connector/J Latest 3.0.x from source control... It may be worth trying with 3.0.15

JDBC Conection: jdbc:mysql:///test

created table in "test", upload a file and then query for the contents and assert the sizes match. Seems to work fine:

            this.stmt.executeUpdate("DROP TABLE IF EXISTS testBug5490");
            this.stmt.executeUpdate("CREATE TABLE testBug5490"
                    + "(pk INT NOT NULL PRIMARY KEY, blobField BLOB)");
            String sql = "insert into testBug5490 values(?,?)";

            int blobFileSize = 871;
            File blobFile = newTempBinaryFile("Bug5490", blobFileSize);

            PreparedStatement pStmt = conn.prepareStatement(sql,
            pStmt.setInt(1, 2);
            FileInputStream str = new FileInputStream(blobFile);
            pStmt.setBinaryStream(2, str, blobFileSize);
            ResultSet newRs = this.stmt.executeQuery("SELECT blobField FROM testBug5490");
            byte[] returned = newRs.getBytes(1);

            assertEquals(blobFileSize, returned.length);
[11 Sep 2004 1:33] Chris Chen
I just read your comment after I posted the previous comment.  You are using 4.1.4-beta.  I will try upgrading to this version to see if it takes care of the problem.
[11 Sep 2004 1:43] Chris Chen
somehow my previous comment wasn't posted.  

I have narrowed down the bug to something much more specific now.  The gif file contains 00 hex bytes.  Looking in the mysql connector/j PreparedStatement code, the setBytes() method escapes null bytes into two characters "\0".  When this insert is submitted into the database, the binary stream stores it as "\0" in the database.  The server did not unescape it.  

Very funny thing also is the fact that there are plenty of 00 bytes in the gif file.  However, only certain 00 bytes were escaped but not unescaped at the server side.  This time, I will post two files along with the bug.  One file is the original gif, the other file is the retrieved data from the database.  You can do a quick visual diff to see the difference.
[11 Sep 2004 8:25] Chris Chen
Also, I forgot to mention that all my tables are using MyISAM with SJIS character sets.  Could this be related to some escaping of eastern unicode characters such as SJIS characters?
[12 Sep 2004 17:01] Chris Chen
Today, I finally narrowed it down to the specifics instance and reproducible problem.

This issue only occurs when my characterEncoding is set to sjis.  I tested the binary data storage when characterEncoding is set to utf8 and everything worked fine.  However, when the characterEncoding is set to sjis, I cannot save jpeg files or many binary-based files properly.  I even get character escape errors and problems.  

If the stream is binary, my guess is that the binary stream should not be going through any sort of character encoding translations.  I am not sure what the exact cause of the problem is, but I can see now that it's related to the characterEncoding setting in the JDBC connection URL.
[13 Sep 2004 16:22] Mark Matthews
This appears to be some issue with the parser in the server, as another engineer in the support team was able to repeat this behavior using libmysql and mysql_escape_string.

In the meantime, you might consider using Connector/J 3.1.4, as the testcase works there. Prepared Statements in Connector/J 3.1.x are server-based, and have a different way of sending binary data that doesn't need to be escaped at all, as there is protocol-level support to do so. Connector/J 3.0.x doesn't have support for this.
[13 Sep 2004 16:48] Chris Chen
Thank you for the comment.  I am glad that it's not my imagination.
I will take your advice and try out 3.1.x version and see how it goes