Bug #11338 logging of prepared statement w/ blob type
Submitted: 15 Jun 2005 4:55 Modified: 24 Aug 2005 17:59
Reporter: Timothy Smith Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:4.1.12, 5.0.9 OS:Any (any)
Assigned to: Elliot Murphy CPU Architecture:Any

[15 Jun 2005 4:55] Timothy Smith
Description:

When running the following test case, and then using mysqlbinlog, I see this:

[tsmith@devsrv-a t]$ mysqlbinlog --start-position=217 --stop-position=291 ../data/devsrv-a-bin.000002 | xxd
0000000: 2f2a 2134 3030 3139 2053 4554 2040 4073 /*!40019 SET @@s
0000010: 6573 7369 6f6e 2e6d 6178 5f69 6e73 6572 ession.max_inser
0000020: 745f 6465 6c61 7965 645f 7468 7265 6164 t_delayed_thread
0000030: 733d 302a 2f3b 0a23 2061 7420 3231 370a s=0*/;.# at 217.
0000040: 2330 3530 3631 3320 2034 3a32 393a 3036 #050613 4:29:06
0000050: 2073 6572 7665 7220 6964 2033 3334 3130 server id 33410
0000060: 2020 6c6f 675f 706f 7320 3231 3720 0951 log_pos 217 .Q
0000070: 7565 7279 0974 6872 6561 645f 6964 3d31 uery.thread_id=1
0000080: 3309 6578 6563 5f74 696d 653d 3009 6572 3.exec_time=0.er
0000090: 726f 725f 636f 6465 3d30 0a75 7365 2074 ror_code=0.use t
00000a0: 6573 743b 0a53 4554 2054 494d 4553 5441 est;.SET TIMESTA
00000b0: 4d50 3d31 3131 3836 3239 3734 363b 0a49 MP=1118629746;.I
00000c0: 4e53 4552 5420 494e 544f 2062 6c6f 6274 NSERT INTO blobt
00000d0: 6573 7420 2862 2920 5641 4c55 4553 2028 est (b) VALUES (
00000e0: 2783 5c30 2729 3b0a '.\0');.

So the binlog shows ... VALUES( ('?\0');

(where X == 0x83).

If I load this via 'mysql' with the latin1 character set, the binary
data works correctly. If I load it with --default-character-set=cp932
then the binary data becomes 0x835C30, which is plainly wrong.

Is this a problem with logging prepared statements when the character
set is cp932? Note that if I add SET NAMES cp932 before running the
insert statement, then the wrong value is loaded into the table no
matter what --default-character-set is used for the 'mysql' client.

How to repeat:
Here is the test client.  I used MySQL 4.1.12 to test.

#include <mysql.h>

#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

MYSQL *db_connect(const char *dbname);
void db_disconnect(MYSQL *db);
static void do_test(MYSQL *db);

int main(void) {
    MYSQL *db = db_connect("test");
    do_test(db);
    mysql_close(db);

    exit(EXIT_SUCCESS);
}

static void die(MYSQL *db, char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    (void)putc('\n', stderr);
    if (db)
        db_disconnect(db);
    exit(EXIT_FAILURE);
}

MYSQL * db_connect(const char *dbname) {
    MYSQL *db = mysql_init(NULL);
    if (!db)
        die(db, "mysql_init failed: no memory");

    mysql_options(db, MYSQL_READ_DEFAULT_GROUP, "client");
    if (!mysql_real_connect(db, NULL, NULL, NULL, dbname, 0, NULL, 0))
        die(db, "mysql_real_connect failed: %s", mysql_error(db));

    return db;
}

void db_disconnect(MYSQL *db) {
    mysql_close(db);
}

static void do_test(MYSQL *db) {
    int error = 0;
    MYSQL_STMT *stmt;
    MYSQL_BIND bind[1];
    unsigned char buffer[1024];
    unsigned long result_len = 2;
    my_bool is_null = 0;

    const char *query = "INSERT INTO blobtest (b) VALUES (?)";

    fprintf(stderr, "Setup stuff...\n");
    /*mysql_query(db, "set names cp932");*/
    mysql_query(db, "drop table if exists blobtest");
    mysql_query(db, "create table blobtest (b blob)");

    memset(bind, 0, sizeof bind);
    bind[0].buffer_type = MYSQL_TYPE_BLOB;
    bind[0].buffer = buffer;
    bind[0].buffer_length = sizeof buffer;
    bind[0].length = &result_len;
    bind[0].is_null = &is_null;

    buffer[0] = 0x83;
    buffer[1] = 0x00;

    fprintf(stderr, "init()...\n");
    if ((stmt = mysql_stmt_init(db)) == NULL)
        goto err;
    fprintf(stderr, "prepare()...\n");
    if ((error = mysql_stmt_prepare(stmt, query, strlen(query))) != 0)
        goto err;
    fprintf(stderr, "bind_param()...\n");
    if ((error = mysql_stmt_bind_param(stmt, bind)) != 0)
        goto err;
    fprintf(stderr, "execute()...\n");
    if ((error = mysql_stmt_execute(stmt)) != 0)
        goto err;

    fprintf(stderr, "Done.\n");

    mysql_stmt_close(stmt);

    return;

err:
    die(db, "do_test failed: %d/%s [%s]",
        (error), mysql_error(db), query);
}

Here is a test run:

[tsmith@devsrv-a t]$ cc -o insblob -I../include -g -Wall -W insblob.c -L../lib -lmysqlclient -lm -lz
[tsmith@devsrv-a t]$ mysql -e 'show master status';
+---------------------+----------+--------------+------------------+
| File                | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+---------------------+----------+--------------+------------------+
| devsrv-a-bin.000002 |     1982 |              |                  |
+---------------------+----------+--------------+------------------+
[tsmith@devsrv-a t]$ ./insblob
Setup stuff...
init()...
prepare()...
bind_param()...
execute()...
Done.
[tsmith@devsrv-a t]$ mysqlbinlog --start-position=1982 ../data/*-bin.000002 | xxd
0000000: 2f2a 2134 3030 3139 2053 4554 2040 4073  /*!40019 SET @@s
0000010: 6573 7369 6f6e 2e6d 6178 5f69 6e73 6572  ession.max_inser
0000020: 745f 6465 6c61 7965 645f 7468 7265 6164  t_delayed_thread
0000030: 733d 302a 2f3b 0a23 2061 7420 3139 3832  s=0*/;.# at 1982
0000040: 0a23 3035 3036 3133 2020 353a 3238 3a34  .#050613  5:28:4
0000050: 3820 7365 7276 6572 2069 6420 3333 3431  8 server id 3341
0000060: 3020 206c 6f67 5f70 6f73 2031 3938 3220  0  log_pos 1982
0000070: 0951 7565 7279 0974 6872 6561 645f 6964  .Query.thread_id
0000080: 3d32 3609 6578 6563 5f74 696d 653d 3009  =26.exec_time=0.
0000090: 6572 726f 725f 636f 6465 3d30 0a75 7365  error_code=0.use
00000a0: 2074 6573 743b 0a53 4554 2054 494d 4553   test;.SET TIMES
00000b0: 5441 4d50 3d31 3131 3836 3333 3332 383b  TAMP=1118633328;
00000c0: 0a64 726f 7020 7461 626c 6520 6966 2065  .drop table if e
00000d0: 7869 7374 7320 626c 6f62 7465 7374 3b0a  xists blobtest;.
00000e0: 2320 6174 2032 3034 360a 2330 3530 3631  # at 2046.#05061
00000f0: 3320 2035 3a32 383a 3438 2073 6572 7665  3  5:28:48 serve
0000100: 7220 6964 2033 3334 3130 2020 6c6f 675f  r id 33410  log_
0000110: 706f 7320 3230 3436 2009 5175 6572 7909  pos 2046 .Query.
0000120: 7468 7265 6164 5f69 643d 3236 0965 7865  thread_id=26.exe
0000130: 635f 7469 6d65 3d30 0965 7272 6f72 5f63  c_time=0.error_c
0000140: 6f64 653d 300a 5345 5420 5449 4d45 5354  ode=0.SET TIMEST
0000150: 414d 503d 3131 3138 3633 3333 3238 3b0a  AMP=1118633328;.
0000160: 6372 6561 7465 2074 6162 6c65 2062 6c6f  create table blo
0000170: 6274 6573 7420 2862 2062 6c6f 6229 3b0a  btest (b blob);.
0000180: 2320 6174 2032 3131 310a 2330 3530 3631  # at 2111.#05061
0000190: 3320 2035 3a32 383a 3438 2073 6572 7665  3  5:28:48 serve
00001a0: 7220 6964 2033 3334 3130 2020 6c6f 675f  r id 33410  log_
00001b0: 706f 7320 3231 3131 2009 5175 6572 7909  pos 2111 .Query.
00001c0: 7468 7265 6164 5f69 643d 3236 0965 7865  thread_id=26.exe
00001d0: 635f 7469 6d65 3d30 0965 7272 6f72 5f63  c_time=0.error_c
00001e0: 6f64 653d 300a 5345 5420 5449 4d45 5354  ode=0.SET TIMEST
00001f0: 414d 503d 3131 3138 3633 3333 3238 3b0a  AMP=1118633328;.
0000200: 494e 5345 5254 2049 4e54 4f20 626c 6f62  INSERT INTO blob
0000210: 7465 7374 2028 6229 2056 414c 5545 5320  test (b) VALUES
0000220: 2827 835c 3027 293b 0a                   ('.\0');.
[tsmith@devsrv-a t]$ mysqlbinlog --start-position=1982 ../data/*-bin.000002 | mysql test
[tsmith@devsrv-a t]$ mysql -e 'select hex(b) from blobtest' test
+--------+
| hex(b) |
+--------+
| 8300   |
+--------+
[tsmith@devsrv-a t]$ mysqlbinlog --start-position=1982 ../data/*-bin.000002 | mysql --default-character-set=cp932 test
[tsmith@devsrv-a t]$ mysql -e 'select hex(b) from blobtest' test
+--------+
| hex(b) |
+--------+
| 835C30 |
+--------+

Suggested fix:

Log using HEX notation, so that it is never affected by character set, if it is some binary data type.

Otherwise, the parser interprets the constant as a character string and tries to unescape it.  So, the result depends on the current client character set.  For blob values it should definitely not happen; the result should be character set independent.

HEX logging will fix the problem.
[29 Jun 2005 1:09] Timothy Smith
I just tested this against the latest MySQL 5.0 code from BK.  The bug happens there as well.  I used a java test program this time, to ensure that I am doing the exact same steps as the reporting user.

A summary of my results is:

The behavior in MySQL 5.0 is identical to that of MySQL 4.1.  In both cases, it is not enough to use the same character set with the server and the client.

The problem comes because, in cp932, characters between 0x81 and 0x9f introduce a multi-byte sequence.  So when the character string '?\0' (where ? == 0x83) is read in
the cp932 character set, ?\ becomes one character, and 0 becomes a second.  The '\0' escape sequence is never seen.

This problem could affect any multi-byte character set.

Now, here is my testing.

I have added this to my [mysqld] section:

character_set_server = cp932

Then I run the following Java program (it does the same basically as my C program, but it sets the character set in the URL):

=============================================
12:57 ~/m/50/m/t$ mysql --default-character-set=cp932 -e 'drop table if exists btable; create table btable (f1 blob); show create table btable\G' test
*************************** 1. row ***************************
       Table: btable
Create Table: CREATE TABLE `btable` (
  `f1` blob
) ENGINE=InnoDB DEFAULT CHARSET=cp932
12:57 ~/m/50/m/t$ mysql -e 'show master status;'
+-----------------+----------+--------------+------------------+
| File            | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-----------------+----------+--------------+------------------+
| siva-bin.000001 |     1144 |              |                  |
+-----------------+----------+--------------+------------------+
12:57 ~/m/50/m/t$ java -classpath $HOME/m/java/mysql-connector-java-3.1.10/mysql-connector-java-3.1.10-bin.jar:. JDBC2
12:57 ~/m/50/m/t$ mysql --default-character-set=cp932 -e 'select hex(f1) from btable' test                            
+---------+
| hex(f1) |
+---------+
| 8300    |
+---------+
12:58 ~/m/50/m/t$ mysqlbinlog --start-position=1144 ../data/siva-bin.000001   
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
# at 1144
#050629 12:57:57 server id 33500  end_log_pos 95        Query   thread_id=11    exec_time=0     error_code=0
use test;
SET TIMESTAMP=1120006677;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=1, @@session.unique_checks=1;
SET @@session.sql_mode=0;
SET @@session.character_set_client=95,@@session.collation_connection=95,@@session.collation_server=95;
INSERT INTO btable VALUES('�\0');
# at 1239
#050629 12:57:57 server id 33500  end_log_pos 1266      Xid = 61
COMMIT;
ROLLBACK;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
12:58 ~/m/50/m/t$ mysqlbinlog --start-position=1144 ../data/siva-bin.000001 | grep INSERT | xxd
0000000: 494e 5345 5254 2049 4e54 4f20 6274 6162  INSERT INTO btab
0000010: 6c65 2056 414c 5545 5328 2783 5c30 2729  le VALUES('.\0')
0000020: 3b0a                                     ;.
12:58 ~/m/50/m/t$ mysqlbinlog --start-position=1144 ../data/siva-bin.000001 | mysql --default-character-set=cp932 test
12:59 ~/m/50/m/t$ mysql --default-character-set=cp932 -e 'select hex(f1) from btable' test
+---------+
| hex(f1) |
+---------+
| 8300    |
| 835C30  |
+---------+
12:59 ~/m/50/m/t$  mysql --default-character-set=cp932 -e 'show variables like "version%"'
+-------------------------+----------------------+
| Variable_name           | Value                |
+-------------------------+----------------------+
| version                 | 5.0.9-beta-debug-log |
| version_comment         | Source distribution  |
| version_compile_machine | i386                 |
| version_compile_os      | unknown-freebsd5.4   |
+-------------------------+----------------------+

=============================================

Here is the Java test program, called JDBC2.java:

=============================================
import java.sql.*;

public class JDBC2 {

        public static void main(String[] args) throws Exception{
                Connection conn = null;
                PreparedStatement pstmt = null;
                ResultSet rs = null;
                Statement stmt;
        
                Class.forName("com.mysql.jdbc.Driver");

                conn = DriverManager.getConnection(
                        "jdbc:mysql://127.0.0.1:33500/test?" +
                        "user=root&password=&characterEncoding=CP932");
                stmt = conn.createStatement();

                        byte[] data = new byte[2];
                data[0] = (byte)0x83;
                data[1] = (byte)0x00;           
                
                pstmt = conn.prepareStatement("INSERT INTO btable VALUES(?)");
                pstmt.setBytes(1,data);
                pstmt.executeUpdate();
        }
}

=============================================
[25 Jul 2005 21:08] Elliot Murphy
Per discussion with Bar, unsafe bytes are codes less than 0x20, more than 0x7F, also 0x22 (quot), 0x27(apostroph), 0x5C(slash). Other bytes should be safe.
[17 Aug 2005 8:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/28372
[17 Aug 2005 8:31] Elliot Murphy
Reviewed by Konstantin, fix also discussed extensively with Bar.
[17 Aug 2005 10:20] Elliot Murphy
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

The fix was to encode parameters to prepared statements in hex before inserting into binlog when cp932 is used as the client charset.

This fix has been pushed for 4.1.14, will set to documenting when I have merged to 5.0
[18 Aug 2005 18:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/28472
[20 Aug 2005 4:23] Elliot Murphy
The fix was to hex encode the parameters to a prepared statement before writing them into the binary log if the client was connected using cp932.
This fix went into 4.1.14 and has been pushed for 5.0.12.
[24 Aug 2005 17:58] Mike Hillyer
Documented in 4.1.14 and 5.0.12 changelogs:

<listitem><para>Prepared statement parameters could cause errors in the binary log if the character set was <literal>cp932</literal>. (Bug #11338)</para></listitem>