Bug #37426 | RBR breaks for CHAR() UTF8 fields > 85 chars | ||
---|---|---|---|
Submitted: | 16 Jun 2008 10:08 | Modified: | 4 Nov 2008 13:25 |
Reporter: | Jan Kneschke | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Row Based Replication ( RBR ) | Severity: | S3 (Non-critical) |
Version: | 5.1, 6.0 | OS: | Any |
Assigned to: | Mats Kindahl | CPU Architecture: | Any |
[16 Jun 2008 10:08]
Jan Kneschke
[16 Jun 2008 10:09]
Jan Kneschke
This is 5.1 version of Bug#32462
[20 Jun 2008 12:59]
Susanne Ebrecht
Verified as described, using the above test and the MySQL 5.1 bzr tree from yesterday.

Master output:
mysql> SELECT * FROM char128_utf8;
+----+-----+----+
| i1 | c | i2 |
+----+-----+----+
| 1 | 123 | 1 |
+----+-----+----+

Slave output:
mysql> SELECT * FROM char128_utf8;
+------------+---------------------------------------------------------------------------------------------------+-----+
| i1 | c | i2 |
+------------+---------------------------------------------------------------------------------------------------+-----+
| 1 | 12 | 307 |
| 1751318528 | r128_utf8 MyISAM | 0 |
+------------+---------------------------------------------------------------------------------------------------+-----+

mysql> SELECT * FROM char128_utf8\G
*************************** 1. row ***************************
i1: 1
c:
i2: 307
*************************** 2. row ***************************
i1: 1751318528
c: r128_utf8
i2: 0
[20 Jun 2008 15:40]
Susanne Ebrecht
Slave output using the same test on the MySQL 6.0 bzr tree:

SELECT * FROM char128_utf8\G
*************************** 1. row ***************************
i1: 1
c:
i2: 307
*************************** 2. row ***************************
i1: 1751318528
c: r128_utf8
i2: 0
[24 Jun 2008 6:45]
Susanne Ebrecht
This behaviour occurs only with the data type 'char'. Other string data types are not affected.
[25 Jun 2008 21:04]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/48537

2667 Mats Kindahl 2008-06-25
BUG#37426: RBR breaks for CHAR() UTF-8 fields > 85 chars

In order to handle CHAR() fields, 8 bits were reserved for the size of the CHAR field. However, instead of denoting the number of characters in the field, field_length was used, which denotes the number of bytes in the field.

Since UTF-8 fields can have three bytes per character (extended to four bytes per character in 6.0), an extra two bits have been encoded in the field metadata word for fields of type Field_string (i.e., CHAR fields).

Since the metadata word is filled, the extra bits have been encoded in the upper 4 bits of the real type (the most significant byte of the metadata word) by computing the bitwise XOR of those bits with the extra two bits. Since the upper 4 bits of the real type are always 1111 for Field_string, this means that for fields of length <256, the encoding is identical to the encoding used in pre-5.1.26 servers, but for lengths of 256 or more, an unrecognized type is formed, causing an old slave (that does not handle lengths of 256 or more) to stop.
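The encoding described in the commit message can be sketched roughly as follows. This is an illustration only, not the patch itself: the function names are hypothetical, the type value 0xFE for a CHAR field and the exact position of the two extra bits (bits 4 and 5 of the type byte) are assumptions, and the 10-bit length limit simply follows from 8 + 2 bits.

```python
# Illustrative sketch of packing a CHAR byte length into a 16-bit metadata word.
# ASSUMPTIONS (not stated in the report): the real type of a CHAR field is 0xFE,
# and the two extra length bits are XORed into bits 4-5 of the type byte.

MYSQL_TYPE_STRING = 0xFE  # assumed real type; its upper 4 bits are 1111


def encode_char_metadata(field_length, real_type=MYSQL_TYPE_STRING):
    """Return a 16-bit metadata word: type byte (high) + low 8 length bits."""
    if not 0 <= field_length < 1024:
        raise ValueError("field_length must fit in 10 bits")
    if real_type & 0xF0 != 0xF0:
        raise ValueError("upper 4 bits of the real type must be 1111")
    extra_bits = (field_length & 0x300) >> 4  # length bits 8-9 moved to bits 4-5
    type_byte = real_type ^ extra_bits        # unchanged when length < 256
    return (type_byte << 8) | (field_length & 0xFF)


def decode_char_metadata(metadata):
    """Inverse of encode_char_metadata: return (real_type, field_length)."""
    type_byte = metadata >> 8
    extra_bits = (type_byte & 0x30) ^ 0x30    # recover the two XORed bits
    field_length = (extra_bits << 4) | (metadata & 0xFF)
    real_type = type_byte | 0x30              # restore the 1111 upper nibble
    return real_type, field_length


if __name__ == "__main__":
    for length in (255, 384):
        word = encode_char_metadata(length)
        print(length, hex(word), decode_char_metadata(word))
```

Under these assumptions, a length below 256 leaves the type byte untouched, so the word matches the older encoding, while a length such as 384 (128 characters at three bytes each) changes the type byte to a value an older slave would not recognize.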
[30 Jun 2008 9:38]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/48724 2667 Mats Kindahl 2008-06-30 BUG#37426: RBR breaks for CHAR() UTF-8 fields > 85 chars
[30 Jun 2008 19:06]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/48751 2667 Mats Kindahl 2008-06-30 BUG#37426: RBR breaks for CHAR() UTF-8 fields > 85 chars
[30 Jun 2008 20:11]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/48757 2662 Mats Kindahl 2008-06-30 BUG#37426: RBR breaks for CHAR() UTF-8 fields > 85 chars
[30 Jun 2008 20:31]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/48759 2662 Mats Kindahl 2008-06-30 BUG#37426: RBR breaks for CHAR() UTF-8 fields > 85 chars
[30 Jun 2008 20:32]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/48760 2662 Mats Kindahl 2008-06-30 BUG#37426: RBR breaks for CHAR() UTF-8 fields > 85 chars
[30 Jun 2008 20:53]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/48765 2663 Joerg Bruehe 2008-06-30 Version 5.1.26 is labeled "rc".
[10 Jul 2008 17:01]
Joerg Bruehe
This fix is included in 5.1.26-rc, and it is important enough (showstopper) so that we want it in the announcement. I do not know how soon it will be upmerged to 6.0, so please set it back to "patch queued" when it is documented for 5.1.26-rc.
[10 Jul 2008 17:27]
Paul DuBois
Noted in 5.1.26 changelog. Row-based replication broke for utf8 CHAR columns longer than 85 characters. Setting report to Patch queued pending push into other trees.
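The 85-character threshold follows from the arithmetic in the commit message: at three bytes per character, 85 characters need 255 bytes, the largest value 8 bits can hold. A small sketch, assuming the overflowing byte length was simply truncated to its low 8 bits (the report does not state this explicitly; the function name is hypothetical):

```python
# Sketch of why utf8 CHAR columns longer than 85 characters were affected.
# ASSUMPTION: the byte length was truncated to its low 8 bits when stored.

def stored_length_8bit(chars, bytes_per_char=3):
    """Byte length of a CHAR(chars) column as it would survive in 8 bits."""
    return (chars * bytes_per_char) & 0xFF


if __name__ == "__main__":
    for chars in (85, 86, 128):
        print(chars, chars * 3, stored_length_8bit(chars))
```

With these assumptions, CHAR(85) still stores its true byte length of 255, whereas CHAR(86) (258 bytes) and CHAR(128) (384 bytes) would be recorded with a much smaller length, which is consistent with the slave misreading row boundaries.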
[21 Jul 2008 4:24]
Bugs System
Pushed into 5.1.26
[21 Jul 2008 9:15]
Jon Stephens
Fix is already documented for 5.1.26 - reset to Patch Pending status, waiting for 6.0 merge.
[23 Jul 2008 13:08]
Bugs System
Pushed into 6.0.7-alpha (revid:serg@mysql.com-20080722121106-wy84j0yvceyu72zr) (pib:2)
[28 Jul 2008 16:47]
Bugs System
Pushed into 5.1.28 (revid:joerg@mysql.com-20080711185110-l3t04xwds0ac6o1v) (version source revid:joerg@mysql.com-20080711185110-l3t04xwds0ac6o1v) (pib:3)
[25 Aug 2008 17:40]
Paul DuBois
Noted in 6.0.7 changelog.
[14 Sep 2008 4:41]
Bugs System
Pushed into 6.0.7-alpha (revid:mats@mysql.com-20080630201118-wr133h32lvbkxyk0) (version source revid:john.embretsen@sun.com-20080724122511-9c0oudz1xrdrs6y6) (pib:3)
[4 Nov 2008 13:25]
Jon Stephens
Hi Lars, 'Documenting' is for bugfixes that need to be documented by Docs (me). The Binlog format docs are Internals docs and are maintained by developers. Please either re-open this bug and set the bug category to Documentation *and* assign it to the developer who's responsible for updating the Binary Log portion of the Internals docs (and set yourself as lead) or open a new Docs bug and do these things with it. Setting a fixed and closed Replication bug to Documenting merely puts it back into my queue, which is not useful if my work with it has already been done, which AFAIK is the case here, since the user-facing issue has been resolved. Thanks!
[29 Oct 2010 7:49]
Luis Soares
Related: BUG#53386.