Bug #36680 Problems with binary data and gbk character set
Submitted: 13 May 2008 5:08 Modified: 17 Jun 13:20
Reporter: Todd Farmer
Status: Can't repeat
Category:Server: Charsets Severity:S2 (Serious)
Version:5.0.30 OS:Any
Assigned to: Sergey Gluhov Target Version:5.0+
Triage: Triaged: D3 (Medium)

[13 May 2008 5:08] Todd Farmer
Description:
When dealing with certain binary data, escaped single-quote data appears to generate
syntax errors when the gbk character set is used.  The single-quotes are properly
escaped, and the problem appears to not affect all instances (the error message indicates
a position that is beyond other instances, perhaps related to the character which
follows).

How to repeat:
Able to reproduce using the attached Java program.

Eliminating the characterEncoding=gbk connection parameter produces no error, and the
data is inserted correctly.

Suggested fix:
Make gbk recognize escaped single-quotes correctly.
[13 May 2008 5:09] Todd Farmer
Java test case

Attachment: Csc25412.java (application/octet-stream, text), 2.84 KiB.

[14 May 2008 15:36] Tonci Grgin
Bug#36706 was marked as duplicate of this report.
[25 Dec 2008 11:59] Alexander Barkov
Todd, 

Can you please try the same program with useServerPrepStatements=true
in connection string:

jdbc:mysql://localhost:3310/test?useServerPrepStatements=false&useUnicode=true&characterEncoding=gbk",
"root", null);

Does it work as expected?
[26 Jan 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[26 Jan 22:13] Todd Farmer
The error does not happen when useServerPrepStmts=true; the data is escaped.  Here is
output to general query log with useServerPrepStmts=true:

090126 13:10:04	      5 Prepare     [1] insert into
user_info_appeal(userid,createtime,logintime,otherinfo,personalid,personalidpic,email,status,submittime)
values(?,NOW(), NOW(),?,?,?,?,?,NOW())
		      5 Execute     [1] insert into
user_info_appeal(userid,createtime,logintime,otherinfo,personalid,personalidpic,email,status,submittime)
values(1,NOW(), NOW(),0x4E2F41,0x4E2F41,0xFFD8FFE000104A46494600...

The same statement run with useServerPrepStmts=false:

090126 13:04:40	      1 Query       insert into
user_info_appeal(userid,createtime,logintime,otherinfo,personalid,personalidpic,email,status,submittime)
values(1,NOW(),
NOW(),'N/A','N/A',_binary'ÿØÿà\0JFIF\0\0\0\0\0ÿá!Exif\0\0II*\0\0\0\0\0\0¤\0\0>\0\0\0
\0\0\0\0\0\0\0\0\0\0â\0\0œ\0\...
[2 Feb 14:01] Alexander Barkov
Todd,

The problem is on the server side.
MySQL does not distinguish between a character string
and a binary string at parse time - this is the problem.
Using prepared statements is the safest way to insert
BLOBs.
[17 Jun 11:51] Sergey Gluhov
We have three problems here:

1. Syntax error if multi-byte charsets and 'useServerPrepStatements=false'
is used. This is an old problem, see bug#8064.
A workaround is to use 'useServerPrepStatements=true'. Another question is
why it works(encoding to hex) only if 'useServerPrepStatements=true' is set.
It seems to me in case of multi-byte charsets binary data should always be
encoded as hex. It should be fixed in Connector/J.

2. Replication
I used  'At the Arch.jpg' file(attached to the bug report) and.
Csc25412.java test with 'useServerPrepStatements=true'.
Everything works fine and data is replicated on the slave without any
problems. So we don't need to fix anything here.

3. Broken binlog file.
There is 'binlogoutput.txt' which gives 'syntax error'.
Unfortunately I have no idea how to repeat this problem
without original data. 'At the Arch.jpg' file does not
correspond to 'binlogoutput.txt'(check the size of files)
so 'At the Arch.jpg' can not be used for testing.
We can set the bug to 'need feedback' status and
and wait for answer from Todd.