Bug #23619 Incorrectly escaped multibyte characters in binary log break replication
Submitted: 25 Oct 2006 10:28 Modified: 23 Jan 2007 18:14
Reporter: Domas Mituzas Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:5.0-bk, 5.1-bk (2006 10 24) OS:Any (Any)
Assigned to: Alexander Barkov CPU Architecture:Any
Tags: bfsm_2006_11_02, bfsm_2007_01_18, charsets, replication, sjis, stored procedure

[25 Oct 2006 10:28] Domas Mituzas
Description:
If a connection, which has 'binary' character set executes a stored procedure, which writes multibyte data, incorrectly escaped entries are written to binary log, therefore causing syntax errors and replication termination. 

Example character may be character (150,92) - ending with \. 

How to repeat:
Run attached MySQL testcase. 

Suggested fix:
Escape entries properly.
[25 Oct 2006 10:29] Domas Mituzas
testcase...

Attachment: rpl_binary_sjis.test (application/octet-stream, text), 670 bytes.

[9 Nov 2006 10:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/15077

ChangeSet@1.2297, 2006-11-09 14:27:34+04:00, bar@mysql.com +5 -0
  Bug#23619 Incorrectly escaped multibyte characters in binary log break replication
  Problem: when embedding a character string with introducer with charset X
  into a SQL query which is generally in character set Y, the string constants
  were escaped according to their own character set (i.e.X), then after reading
  such a "mixed" query from binlog, the string constants were unescaped
  using character set of the query (i.e. Y), instead of X, which gave wrong
  results or even syntax errors with tricky charsets (e.g. sjis)
  Fix: when embedding a string constant of charset X into a query of charset Y,
  the string constant is now escaped according to character Y, instead of
  its own character set X.
[14 Nov 2006 10:44] Sergei Glukhov
ok to push
[22 Nov 2006 20:29] Hugo Zhu
I had same issue with GBK characters.
Env:
Server: 5.0.24-max-log, ENGINE=MyISAM DEFAULT CHARSET=gbk;

JDBC Driver, tried both 3.x and 5.x

Using PreparedStatement
The master can execute the insert/update without problems, but when slave reads events from binlog, it get a wrong SQL statement without correct escaping.

Test codes:
        byte[] bytes = new byte[]{(byte)0xbf,(byte)0x5c};
        String test = new String(bytes,"GBK");
or 
        String test = "診";

When inserting such a single character to master, slave will be terminated because of invalid sql.

Hugo
[24 Nov 2006 11:50] Alexander Barkov
Hugo,

What does "SHOW VARIABLES LIKE 'character_set'" display for you on master
using the same JDBC connection?

Thanks!
[29 Nov 2006 10:21] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented fix for 5.0.30 and 5.1.14.
[18 Dec 2006 16:42] Daniel Fischer
This bug is not fixed in the versions mentioned above due to human error.
[23 Jan 2007 18:14] Paul DuBois
Noted in 5.0.36, 5.1.15 changelogs.
[28 Feb 2007 9:01] Sveta Smirnova
Bug #26711 was marked as duplicate of this one
[2 Mar 2007 2:02] Yuan WANG
Yes, I have tested the patch and found it fixed #26711 as well. Thank you
[5 Mar 2007 6:25] Yuan WANG
Hi, after a more thorough test I found that the patch doesn't fix bug #26711 in some situation. Here are instructions for repeating the bug.

First, create a database as follows.

mysql> create database test charset gbk;

Note that the charset of the database must be other than latin1 to show this bug.

Then create a table and insert some data.

mysql> use test;
mysql> CREATE TABLE `test_blob` (
   `a` blob
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
mysql> insert into test_blob values(0xEE00);

And then dump the data and reload it.

$ mysqldump -T . -u root test test_blob;
$ mysql -u root test
mysql> truncate test_blob;
mysql> load data local infile '/tmp/test_blob.txt' into table test_blob;  
mysql> select hex(a) from test_blob;
+--------+
| hex(a) |
+--------+
| EE5C30 | 
+--------+
1 row in set (0.00 sec)

The key point is to set the charset of the database to gbk. If it is latin1, all will be OK.
[5 Mar 2007 11:00] Sveta Smirnova
Thank you for the additional comment.

Will mark bug #26711 as verified to let developer who will fix it decide if these 2 are same or not.
[5 Mar 2007 11:01] Sveta Smirnova
Bug #26711 contains test case form last comment.