Bug #51716 Char column with utf16 character set gives wrong padding on slave
Submitted: 4 Mar 2010 11:36 Modified: 24 Mar 2010 17:21
Reporter: Victor Kirkebo
Status: Closed
Category: MySQL Server: Replication Severity: S3 (Non-critical)
Version: 5.5.99-m3 (Celosia) OS: Any
Assigned to: Luis Soares CPU Architecture: Any
Tags: autoinc, character set, padding, utf16

[4 Mar 2010 11:36] Victor Kirkebo
Description:
This seems to happen only with row-based replication (RBR) and mixed-based replication (MBR).
RBR:
----
Create a table (MyISAM or InnoDB) with a CHAR column that uses the utf16 character set. Insert a row into the table. On the slave, the CHAR field of the replicated row has been padded with hex 0020 and 2020 values (instead of 0000).
MBR:
-----
Same as RBR, but only if an insert trigger is also present that inserts into a table containing an AUTO_INCREMENT column.

How to repeat:
RBR:
----------------------------------------------
master:
=======
create table t1(f1 char(10) character set utf16);
insert into t1 values('abc');
select hex(f1) from t1;
+--------------+
| hex(f1)      |
+--------------+
| 006100620063 |
+--------------+

slave:
======
select hex(f1) from t1;
+----------------------------------------------------------------------------------+
| hex(f1)                                                                          |
+----------------------------------------------------------------------------------+
| 00610062006300200020002000200020002000202020202020202020202020202020202020202020 |
+----------------------------------------------------------------------------------+

MBR:
----------------------------------------------
master:
=======
create table t1(f1 char(10) character set utf16);
create table log(i1 INT NOT NULL AUTO_INCREMENT, PRIMARY KEY (i1), f1 VARCHAR(100));
create trigger t1_ins after insert on t1 for each row insert into log (f1) values (new.f1);
insert into t1 values('abc');
select hex(f1) from t1;
+--------------+
| hex(f1)      |
+--------------+
| 006100620063 |
+--------------+

slave:
======
select hex(f1) from t1;
+----------------------------------------------------------------------------------+
| hex(f1)                                                                          |
+----------------------------------------------------------------------------------+
| 00610062006300200020002000200020002000202020202020202020202020202020202020202020 |
+----------------------------------------------------------------------------------+
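
The slave value can be taken apart to show where the padding goes wrong. The following is a minimal Python sketch (Python is not part of the original report, and `classify_units` is a hypothetical helper name). It splits the value into 2-byte big-endian UTF-16 code units; the value is written in parts for readability and is meant to match the slave output above.

```python
from collections import Counter

# Slave value from the report, written in parts for readability:
# 'abc', then 0x0020 seven times, then twenty 0x20 bytes.
SLAVE_HEX = "006100620063" + "0020" * 7 + "20" * 20


def classify_units(hex_value: str) -> Counter:
    """Count 2-byte big-endian code units by kind (hypothetical helper)."""
    raw = bytes.fromhex(hex_value)
    kinds = Counter()
    for i in range(0, len(raw), 2):
        unit = int.from_bytes(raw[i:i + 2], "big")
        if unit == 0x0020:
            kinds["U+0020 space"] += 1
        elif unit == 0x2020:
            kinds["U+2020 (two 0x20 bytes)"] += 1
        else:
            kinds["other"] += 1
    return kinds


kinds = classify_units(SLAVE_HEX)
text = bytes.fromhex(SLAVE_HEX).decode("utf-16-be")
print(len(bytes.fromhex(SLAVE_HEX)), "bytes:", dict(kinds))
print(repr(text.rstrip(" ")))
```

Only the first seven pad units are real UTF-16 spaces. The trailing pairs of 0x20 bytes read as U+2020 in UTF-16, so stripping trailing spaces no longer recovers 'abc'.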
[4 Mar 2010 17:56] MySQL Verification Team
mysql> use db1
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> select hex(f1) from t1;
+----------------------------------------------------------------------------------+
| hex(f1)                                                                          |
+----------------------------------------------------------------------------------+
| 00610062006300200020002000200020002000202020202020202020202020202020202020202020 |
+----------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> show variables like "%version%";
+-------------------------+---------------------+
| Variable_name           | Value               |
+-------------------------+---------------------+
| innodb_version          | 1.0.6               |
| protocol_version        | 10                  |
| slave_type_conversions  |                     |
| version                 | 5.5.99-m3-debug-log |
| version_comment         | Source distribution |
| version_compile_machine | x86_64              |
| version_compile_os      | unknown-linux-gnu   |
+-------------------------+---------------------+
7 rows in set (0.00 sec)

mysql>
[4 Mar 2010 17:58] MySQL Verification Team
Thank you for the bug report.
[10 Mar 2010 17:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/102922

3119 Luis Soares	2010-03-10
      Fix for BUG#51716 and BUG#51787.
      
      In BUG#51787 we were using the wrong charset to print out the
      data. We were using the field charset for the string that would
      hold the information. This caused the assertion, because the
      string length was not aligned with UTF32 bytes requirements for
      storage.
      
      We fix this by using &my_charset_latin1 in the string object
      instead of the field->charset(). As a side-effect, we needed to
      extend the show_sql_type interface so that the field charset is
      now passed as a parameter, so that one is able to calculate the
      correct field size.
      
      In BUG#51716 we had issues with Field_string::pack and
      Field_string::unpack. When packing, the length was incorrectly
      calculated. When unpacking, the string would be padded with the
      wrong bytes (a few bytes fewer than it should).
      
      We fix this by resorting to charset abstractions (functions) that
      calculate the correct length when packing and pad the string
      correctly when unpacking.
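
To illustrate why byte-wise trimming and padding break a UTF-16 value, here is a small Python model. It is not the server's code: the function names are hypothetical, and the 20-byte stored value and 40-byte field length are assumptions chosen for the example (utf16 in MySQL allows up to 4 bytes per character, so CHAR(10) can occupy 40 bytes).

```python
PAD_BYTE = b"\x20"


def pack_bytewise(stored: bytes) -> bytes:
    """Trim trailing 0x20 bytes one at a time, ignoring the charset."""
    return stored.rstrip(PAD_BYTE)


def unpack_bytewise(packed: bytes, field_len: int) -> bytes:
    """Fill the rest of the field with single 0x20 bytes."""
    return packed + PAD_BYTE * (field_len - len(packed))


def pack_charset_aware(stored: bytes, encoding: str) -> bytes:
    """Trim trailing pad characters as whole encoded units."""
    pad = " ".encode(encoding)
    end = len(stored)
    while end >= len(pad) and stored[end - len(pad):end] == pad:
        end -= len(pad)
    return stored[:end]


def unpack_charset_aware(packed: bytes, field_len: int, encoding: str) -> bytes:
    """Fill the rest of the field with the charset's encoded space."""
    pad = " ".encode(encoding)
    count, rest = divmod(field_len - len(packed), len(pad))
    if rest:
        raise ValueError("remaining space is not a whole number of pad units")
    return packed + pad * count


ENC = "utf-16-be"   # big-endian, matching the hex shown in the report
FIELD_LEN = 40      # assumed: 10 characters x 4 bytes maximum
stored = "abc".ljust(10).encode(ENC)  # assumed 20-byte input to pack

broken = unpack_bytewise(pack_bytewise(stored), FIELD_LEN)
fixed = unpack_charset_aware(pack_charset_aware(stored, ENC), FIELD_LEN, ENC)

print(broken.hex())
print(fixed.decode(ENC).rstrip(" "))
```

Under these assumptions the byte-wise round trip trims a single byte, leaving an odd-length value, and then fills the field with lone 0x20 bytes. The result is 'abc', seven 0x0020 units, then twenty 0x20 bytes, which is the pattern described in the report. The charset-aware round trip pads with 0x0020 only and still reads back as 'abc'.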
[10 Mar 2010 17:54] Alexander Barkov
The patch http://lists.mysql.com/commits/102922 looks fine to me.

I found a small problem though:

It seems rpl_row_charset.test is missing these lines:

-- source include/have_utf16.inc
-- source include/have_utf32.inc
[10 Mar 2010 22:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/102945

3120 Luis Soares	2010-03-10
      Fix for BUG#51716 and BUG#51787: test case improvements.
      
      Split rpl_row_charset into:
      
        - rpl_row_utf16.
        - rpl_row_utf32.
      
      This way these tests can run independently if the server
      supports either one of the charsets but not both.
      
      Cleaned up rpl_row_utf32 which had a spurious instruction:
      -- let $reset_slave_type_conversions= 0
[11 Mar 2010 7:59] Alexander Barkov
The post-fix http://lists.mysql.com/commits/102945 looks fine.
[11 Mar 2010 11:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/102982

3814 Luis Soares	2010-03-11 [merge]
      Fix for BUG#51716 and BUG#51787: merge to 6.0-codebase-bugfixing.
      
      Manually merged tests changes:
      
        - rpl_row_utf16
          Removed the empty INSERT! Because of BUG#49100, the test would
          fail.
      
        - rpl_row_utf32
          Removed the warnings from the result file. Maximum key size is
          1332 bytes in 6.0, hence we don't get warnings.
[12 Mar 2010 12:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/103082

3129 Luis Soares	2010-03-12
      BUG#51716 post push fix.
      
      There are two issues fixed here:
      
        1. We needed to update the result file for some of the
           mysqlbinlog_* tests, because some padding chars are no
           longer output.
      
        2. We needed to change the Field_string::pack so that
           for BINARY types the padding chars are not packed 
           (lengthsp will return full length for these types).
[12 Mar 2010 12:42] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/103085

3129 Luis Soares	2010-03-12
      BUG#51716 post push fix.
      
      There are two issues fixed here:
      
        1. We needed to update the result file for some of the
           mysqlbinlog_* tests, because some padding chars are no
           longer output.
      
        2. We needed to change the Field_string::pack so that
           for BINARY types the padding chars are not packed 
           (lengthsp will return full length for these types).
[12 Mar 2010 12:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/103086

3818 Luis Soares	2010-03-12 [merge]
      BUG#51716: automerge.
[12 Mar 2010 18:10] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100312180926-0emfjrj8e9xnvl8h) (version source revid:alik@sun.com-20100312180447-2r0ak22y13s05134) (merge vers: 6.0.14-alpha) (pib:16)
[12 Mar 2010 18:11] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100312180838-rk60kn38g0qwt78n) (version source revid:alik@sun.com-20100312180435-wk7nvsbfntfus5bu) (pib:16)
[12 Mar 2010 18:22] Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100312181131-0b7v8r2htpd9jz2a) (version source revid:alik@sun.com-20100312181131-0b7v8r2htpd9jz2a) (merge vers: 5.5.3-m3) (pib:16)
[15 Mar 2010 12:25] Jon Stephens
Documented bugfix in the 5.5.3 and 6.0.14 changelogs as follows:

        When using the row-based or mixed replication format, column
        values using the UTF16 character set on the master were padded
        incorrectly on the slave.

Closed.
[16 Mar 2010 13:37] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/103456

3821 Luis Soares	2010-03-16
      Patch for BUG#51716 had been merged into the 6.0 codebase with
      a bad result file.
      
      This patch fixes it.
[24 Mar 2010 8:14] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100324081249-yfwol7qtcek6dh7w) (version source revid:alik@sun.com-20100324081113-kc7x1iytnplww91u) (merge vers: 6.0.14-alpha) (pib:16)