| Bug #49481 | RBR: MyISAM and bit fields may cause slave to stop on delete: cant find record | ||
|---|---|---|---|
| Submitted: | 6 Dec 2009 2:03 | Modified: | 15 Mar 2010 4:58 | 
| Reporter: | Luis Soares | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Row Based Replication ( RBR ) | Severity: | S3 (Non-critical) | 
| Version: | 5.1, next-mr | OS: | Any | 
| Assigned to: | Luis Soares | CPU Architecture: | Any | 
| Tags: | myisam, rbr field null | ||
   [6 Dec 2009 3:10]
   Luis Soares        
  MyISAM seems to store part of the BIT fields in the null
bits. This can raise problems when we are logging in row mode and
a MyISAM table is used. Consider the scenario presented in the
initial report (S# means statement number #):
  S1> CREATE TABLE t1 (c1 BIT, c2 INT);
  S2> INSERT INTO t1 VALUES (1, 1);
  S3> UPDATE t1 SET c1=NULL where c2=1;
  S4> DELETE FROM t1 WHERE c2=1 LIMIT 1;
Some notes:
  1. S4 will be logged as a Delete_rows_log_event.
  2. The null byte(s) of a MyISAM table contain part of a BIT
     field.
  3. Given, 2., we expect the bit field to be among the null
     byte(s).
Given that the DELETE is executed as a row event, the slave will:
  1. find the row
     This is done resorting to the Rows_log_event::find_row
     method. In it, the before image (BI) is unpacked and rows
     from the storage engine (SE) are fetched and
     compared. Fetching a row is done either by position (if a PK
     exists), by a key [scan] (if a key exists) or by a range
     scan.
     In both, range and key scan, a loop is performed and
     everytime a row is fetched from the SE it is compared
     against the row got from the BI. When a match is found, the
     slave successfully found the row to update/delete.
     Comparison can be done by a regular memcmp for the full
     rows (if no varchar or blob fields exist in the
     table). Otherwise, null bits are first matched with memcmp
     and if they match, a field by field comparison is then
     performed. The result of such comparisons determines the
     slave success on finding the desired row in the SE. This is
     done in the function log_event.cc:record_compare.
  2. if finding a row succeeded, the delete is performed by
     calling the handler ha_delete_row implementation.
Inspecting the slave execution (through a gdb session) during the
find_row method, I find, for the scenario in question, that the
values of the null byte are the following:
  unpacked before image: 0xF3  [11110011]
  row fetched from SE:   0XF7  [11110111]
Interpreting the bits (starting from the right - least
significant bits):
  1. X bit
  2. One null bit for field 'c1'
  3. One bit value for field 'c1'
  4. One null bit for field 'c2'
  5,6,7,8: filler bits
Note that there is a discrepancy in the null bits for the BI row
and the SE row. Although the third bit becomes a don't
care (stripped of meaning) when the null bit is set for that
field, it makes the row comparison to fail in find_row,
ultimately causing the slave to stop stating that it cannot find
the row to delete.
 
   [6 Dec 2009 4:50]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/92989 3247 Luis Soares 2009-12-06 BUG#49481: RBR: MyISAM and bit fields may cause slave to stop on delete: cant find record BUG#49482: RBR: Replication may break on deletes when MyISAM tables + char field are used When using MyISAM tables, despite the fact that the null bit is set for some fields, their old value is still in the row. This can cause the comparison of records to fail when the slave is doing an index or range scan. We fix this by avoiding memcmp for MyISAM tables when comparing records. Additionally, when comparing field by field, we first check if both fields are not null and if so, then we compare them. If just one field is null we return failure immediately. If both fields are null, we move on to the next field.
   [6 Dec 2009 4:57]
   Luis Soares        
  See also: BUG#49482.
   [7 Dec 2009 13:16]
   Sveta Smirnova        
  Verified as described.
   [14 Jan 2010 14:27]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/96903 3307 Luis Soares 2010-01-14 Fix for BUG#49481 and BUG#49482. BUG#49481: RBR: MyISAM and bit fields may cause slave to stop on delete: cant find record BUG#49482: RBR: Replication may break on deletes when MyISAM tables + char field are used When using MyISAM tables, despite the fact that the null bit is set for some fields, their old value is still in the row. This can cause the comparison of records to fail when the slave is doing an index or range scan. We fix this by avoiding memcmp for MyISAM tables when comparing records. Additionally, when comparing field by field, we first check if both fields are not null and if so, then we compare them. If just one field is null we return failure immediately. If both fields are null, we move on to the next field.
   [14 Jan 2010 15:09]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/96921 3829 Luis Soares 2010-01-14 [merge] manual merge: Fix for BUG#49481 and BUG#49482 (5.1-bugteam --> mysql-pe).
   [20 Jan 2010 0:11]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/97472 3320 Luis Soares 2010-01-20 BUG#49481: RBR: MyISAM and bit fields may cause slave to stop on delete: cant find record Some engines return data for the record. Despite the fact that the null bit is set for some fields, their old value may still in the row. This can happen when unpacking an AI from the binlog on top of a previous record in which a field is set to NULL, which previously contained a value. Ultimately, this may cause the comparison of records to fail when the slave is doing an index or range scan. We fix this by deploying a call to reset() for each field that is set to null while unpacking a row from the binary log. Furthermore, we also add mixed mode test case to cover the scenario where updating and setting a field to null through a Query event and later searching it through a rows event will succeed. Finally, we also change the reset() method, from Field_bit class, so that it takes into account bits stored among the null bits and not only the ones stored in the record. @ sql/field.h Changed reset so that it also clears the bits among the null_bits for the Field_bit class. @ sql/rpl_record.cc Resetting field after setting it to null when unpacking row.
   [20 Jan 2010 22:14]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/97661 3320 Luis Soares 2010-01-20 BUG#49481: RBR: MyISAM and bit fields may cause slave to stop on delete: cant find record Some engines return data for the record. Despite the fact that the null bit is set for some fields, their old value may still in the row. This can happen when unpacking an AI from the binlog on top of a previous record in which a field is set to NULL, which previously contained a value. Ultimately, this may cause the comparison of records to fail when the slave is doing an index or range scan. We fix this by deploying a call to reset() for each field that is set to null while unpacking a row from the binary log. Furthermore, we also add mixed mode test case to cover the scenario where updating and setting a field to null through a Query event and later searching it through a rows event will succeed. Finally, we also change the reset() method, from Field_bit class, so that it takes into account bits stored among the null bits and not only the ones stored in the record. @ sql/field.h Changed reset so that it also clears the bits among the null_bits for the Field_bit class. @ sql/rpl_record.cc Resetting field after setting it to null when unpacking row.
   [21 Jan 2010 17:20]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/97762 3320 Luis Soares 2010-01-21 BUG#49481: RBR: MyISAM and bit fields may cause slave to stop on delete: cant find record Some engines return data for the record. Despite the fact that the null bit is set for some fields, their old value may still in the row. This can happen when unpacking an AI from the binlog on top of a previous record in which a field is set to NULL, which previously contained a value. Ultimately, this may cause the comparison of records to fail when the slave is doing an index or range scan. We fix this by deploying a call to reset() for each field that is set to null while unpacking a row from the binary log. Furthermore, we also add mixed mode test case to cover the scenario where updating and setting a field to null through a Query event and later searching it through a rows event will succeed. Finally, we also change the reset() method, from Field_bit class, so that it takes into account bits stored among the null bits and not only the ones stored in the record. @ mysql-test/suite/rpl/t/rpl_set_null_innodb.test InnoDB test. @ mysql-test/suite/rpl/t/rpl_set_null_myisam.test MyISAM test. @ mysql-test/suite/rpl_ndb/t/rpl_ndb_set_null.test NDB test. @ sql/field.h Changed reset so that it also clears the bits among the null_bits for the Field_bit class. @ sql/rpl_record.cc Resetting field after setting it to null when unpacking row.
   [4 Feb 2010 10:20]
   Bugs System        
  Pushed into 5.1.44 (revid:joro@sun.com-20100204101444-2j32mhqroo0iiio6) (version source revid:luis.soares@sun.com-20100126085522-0cbr9g33pf7x1bei) (merge vers: 5.1.43) (pib:16)
   [5 Feb 2010 11:49]
   Bugs System        
  Pushed into mysql-next-mr (revid:alik@sun.com-20100204063540-9czpdmpixi3iw2yb) (version source revid:alik@sun.com-20100130212638-ai8t5v3u6647p6d2) (pib:16)
   [5 Feb 2010 11:55]
   Bugs System        
  Pushed into 6.0.14-alpha (revid:alik@sun.com-20100205113942-oqovjy0eoqbarn7i) (version source revid:alik@sun.com-20100204064210-ljwanqvrjs83s1gq) (merge vers: 6.0.14-alpha) (pib:16)
   [5 Feb 2010 12:01]
   Bugs System        
  Pushed into 5.5.2-m2 (revid:alik@sun.com-20100203172258-1n5dsotny40yufxw) (version source revid:alik@sun.com-20100130191336-i53i9wx67n81ridm) (merge vers: 5.5.2-m2) (pib:16)
   [10 Feb 2010 6:31]
   Jon Stephens        
  Documented bugfix in the 5.1.44, 5.5.2, and 6.0.14 changelogs, as follows:
      When using row-based replication, setting a BIT column of a MyISAM 
      table to NULL then trying to delete from the table caused the slave 
      to fail with the error -Can't find record in [table]-.
Closed.
 
   [10 Feb 2010 6:51]
   Jon Stephens        
  Documented together with BUG#49482. Changelog entry updated: When using row-based replication, setting a BIT or CHAR column of a MyISAM table to NULL then trying to delete from the table caused the slave to fail with the error -Can't find record in [table]-.
   [12 Mar 2010 14:15]
   Bugs System        
  Pushed into 5.1.44-ndb-7.0.14 (revid:jonas@mysql.com-20100312135944-t0z8s1da2orvl66x) (version source revid:jonas@mysql.com-20100312115609-woou0te4a6s4ae9y) (merge vers: 5.1.44-ndb-7.0.14) (pib:16)
   [12 Mar 2010 14:31]
   Bugs System        
  Pushed into 5.1.44-ndb-6.2.19 (revid:jonas@mysql.com-20100312134846-tuqhd9w3tv4xgl3d) (version source revid:jonas@mysql.com-20100312060623-mx6407w2vx76h3by) (merge vers: 5.1.44-ndb-6.2.19) (pib:16)
   [12 Mar 2010 14:47]
   Bugs System        
  Pushed into 5.1.44-ndb-6.3.33 (revid:jonas@mysql.com-20100312135724-xcw8vw2lu3mijrhn) (version source revid:jonas@mysql.com-20100312103652-snkltsd197l7q2yg) (merge vers: 5.1.44-ndb-6.3.33) (pib:16)
   [15 Mar 2010 4:58]
   Jon Stephens        
  No additional changelog entries required. Returning to Closed state.

Description: Replication in mixed and RBR modes can sometimes break when MyISAM tables with bit fields are in use. I found this in the following scenario: S1> CREATE TABLE t1 (bit b, a int); S2> INSERT INTO t1 VALUES (1, 1); S3> UPDATE t1 SET b=NULL where a=1; In S3 we change the bit field to be NULL instead of being set. Then, if we later try to delete the row from the table (mind the LIMIT 1, which causes switch to ROW in mixed mode): S4> DELETE FROM t1 WHERE c2=1 LIMIT 1; The slave will stop, stating that it can't find the record to be deleted: Last_SQL_Error Could not execute Delete_rows event on table test.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 537 NOTES ===== 1. I found this in: - tree: mysql-5.1-bugteam - revid: davi.arnaut@sun.com-20091204153658-09w7l2edgszt8qe7 2. Running the test case (see how to repeat) on mtr, reports MySQL version: - MySQL Version 5.1.42 3. This seems to happen with MyISAM only. How to repeat: 1. bzr clone -r revid:davi.arnaut@sun.com-20091204153658-09w7l2edgszt8qe7 $YOUR_LOCAL_MYSQL_REPO/mysql-5.1-bugteam 2. cd mysql-5.1-bugteam 3. ./BUILD/compile-pentium64-debug-max 4. cd mysql-test 5. create a test case with the following content: -- source include/master-slave.inc -- source include/have_binlog_format_mixed_or_row.inc -- connection master CREATE TABLE t1 (c1 BIT, c2 INT); INSERT INTO `t1` VALUES ( 1, 1 ); UPDATE t1 SET c1=NULL where c2=1; -- sync_slave_with_master -- let $diff_table_1=master:test.t1 -- let $diff_table_2=slave:test.t1 -- source include/diff_tables.inc -- connection master DELETE FROM t1 WHERE c2=1 LIMIT 1; -- sync_slave_with_master -- let $diff_table_1=master:test.t1 -- let $diff_table_2=slave:test.t1 -- source include/diff_tables.inc -- connection master DROP TABLE t1; -- sync_slave_with_master -- exit 6. ./mtr --mysqld=--default-storage-engine=myisam <your_test_file_name> This will cause the test case to fail with the reported error. If I run (note innodb as default storage engine): ./mtr --mysqld=--default-storage-engine=innodb <your_test_file_name> The test succeeds. Suggested fix: n/a