Bug #39753 Replication failure on MIXED + bit + myisam + no PK
Submitted: 30 Sep 2008 10:31 Modified: 18 Mar 2009 15:19
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Row Based Replication ( RBR ) Severity:S2 (Serious)
Version:5.1, 6.0-bzr OS:Any
Assigned to: Luis Soares CPU Architecture:Any

[30 Sep 2008 10:31] Philip Stoev
Description:
Simple statements against a bit column cause failure in mixed-mode replication:

Could not execute Delete_rows event on table test.table0_myisam; Can't find record in 'table0_myisam', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 736

How to repeat:
This example uses two bit columns, however only one un-indexed bit column is required to reproduce the bug -- just the test case is a bit longer.

--source include/master-slave.inc

SET GLOBAL BINLOG_FORMAT=MIXED;
DROP TABLE IF EXISTS table0_myisam;
CREATE TABLE table0_myisam (`bit_key` bit, `bit` bit, key (`bit_key` )) ENGINE=myisam;
INSERT INTO `table0_myisam` ( `bit` ) VALUES ( 0 );
UPDATE `table0_myisam` SET `bit_key` = 0 WHERE `bit_key` = 1 LIMIT 3;
DELETE FROM `table0_myisam` WHERE `bit` < 2 LIMIT 4;

--save_master_pos
--connection slave
--sync_with_master
[1 Oct 2008 14:57] Valeriy Kravchuk
Verified also with latest 6.0.8 from bzr:

openxs@suse:/home2/openxs/dbs/6.0/mysql-test> ./mysql-test-run.pl --suite=rpl rpl_bug39753
Logging: ./mysql-test-run.pl --suite=rpl rpl_bug39753
MySQL Version 6.0.8
Using dynamic switching of binlog format
Using ndbcluster when necessary, mysqld supports it
Setting mysqld to support SSL connections
Binaries are debug compiled
Using MTR_BUILD_THREAD      = 0
Using MASTER_MYPORT         = 9306
Using MASTER_MYPORT1        = 9307
Using SLAVE_MYPORT          = 9308
Using SLAVE_MYPORT1         = 9309
Using SLAVE_MYPORT2         = 9310
Using NDBCLUSTER_PORT       = 9311
Using NDBCLUSTER_PORT_SLAVE = 9312
 - adding combinations
Killing Possible Leftover Processes
Removing Stale Files
Creating Directories
Installing Master Database
Installing Slave1 Database
=======================================================

TEST                           RESULT         TIME (ms)
-------------------------------------------------------

rpl.rpl_bug39753 'row'         [ fail ]

=== SHOW MASTER STATUS ===
---- 1. ----
File    slave-bin.000001
Position        687
Binlog_Do_DB
Binlog_Ignore_DB
==========================

=== SHOW SLAVE STATUS ===
---- 1. ----
Slave_IO_State  Waiting for master to send event
Master_Host     127.0.0.1
Master_User     root
Master_Port     9306
Connect_Retry   1
Master_Log_File master-bin.000001
Read_Master_Log_Pos     802
Relay_Log_File  slave-relay-bin.000002
Relay_Log_Pos   724
Relay_Master_Log_File   master-bin.000001
Slave_IO_Running        Yes
Slave_SQL_Running       No
Replicate_Do_DB
Replicate_Ignore_DB
Replicate_Do_Table
Replicate_Ignore_Table
Replicate_Wild_Do_Table
Replicate_Wild_Ignore_Table
Last_Errno      1032
Last_Error      Could not execute Delete_rows event on table test.table0_myisam; Can't find record in 'table0_myisam', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 733
Skip_Counter    0
Exec_Master_Log_Pos     577
Relay_Log_Space 1105
Until_Condition None
Until_Log_File
Until_Log_Pos   0
Master_SSL_Allowed      No
Master_SSL_CA_File
Master_SSL_CA_Path
Master_SSL_Cert
Master_SSL_Cipher
Master_SSL_Key
Seconds_Behind_Master
Master_SSL_Verify_Server_Cert   No
Last_IO_Errno   0
Last_IO_Error
Last_SQL_Errno  1032
Last_SQL_Error  Could not execute Delete_rows event on table test.table0_myisam; Can't find record in 'table0_myisam', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 733
=========================

mysqltest: At line 12: could not sync with master ('select master_pos_wait('master-bin.000001', 802)' returned NULL)
[1 Oct 2008 16:12] Valeriy Kravchuk
Does NOT fail on today's 5.1-bzr, sorry.
[16 Oct 2008 16:35] Mats Kindahl
This failure is caused for UPDATE and DELETE statements that searches for rows where part of the key has NULL fields. In this case, range search should be used, but since a normal index search is used, it will fail to find the row to update. Observe that the update statement in the use case does not update any rows and hence is not logged.

The solution is to add code to select an index that does not have any fields in the before image with NULL values, or use range search as a last resort if there is no such index available.
[19 Nov 2008 20:01] van pham
I'm receiving the following errors even after upgrading to 5.1.29:

Last_SQL_Errno: 1032
               Last_SQL_Error: Could not execute Delete_rows event on table openfire.jivePresence; Can't find record in 'jivePresence', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master.000034, end_log_pos 252606019
1 row in set (0.00 sec)

I have set the log formayt to statement but I'm still getting these errors ? What is the solution to this problem ?
[27 Nov 2008 11:34] Mats Kindahl
After further investigation, we have discovered that this bug is caused because the InnoDB storage engine is updating the contents of the field in the record even though the field is NULL. This causes record mismatch when searching for the correct row since the reference record (and the default record for the table) contain 0x00 and not 0x20.

No engine should update the field in the record unless:
* the read bit for the field is set, and
* the value of the field is non-NULL, in which case the field shall be set
  to the value of the field.
[27 Nov 2008 15:12] Mats Kindahl
Statement-based replication cannot trigger this bug. Please make sure that you are not using an old binary log containing row-based events.
[20 Dec 2008 18:30] Luis Soares
After further investigation, the bug seems to occur in latest 5.1 tree, when replicating with mixed mode and using myisam storage engine.

Taking into consideration the use case reported in initially, we have found that (in mixed mode replication):

  1. the insert into "table0_myisam" is replicated using a statement event;
  2. the delete from "table0_myisam" is replicated using row event;

This means that the two events reach the storage engine from two distinct execution paths and, more relevant, that the row event generated by the delete is handled at Rows_log_event::find_row. Inside this method the record is:

  1. prepared using the prepare_record function;
  2. and then, unpacked, from packed format into storage engine friendly format, using unpack_current_row method.

After discussing this with Mats, we have noticed that the prepare_record function seems to be messing with some don't care bits in the record header. This interferes with setting the record's header default value and feedbacks into the unpack_current_row function causing it to output an unexpected value.

As a consequence, searching for that specific record in the storage engine will always fail, no matter if using index or range search. In detail, for every record fetched from the storage engine, the record_compare function, in find_row, will never succeed. The mismatch between the stored record, inside myisam, and the expected record, wrongly computed from the delete row event, is the reason for the comparison to fail.

How does this affect STMT and RBR replication modes: (in 5.1)

  * Statement base replication is not affected.

  * Row based replication is not affected. The insert and delete row events are subject to the same prepare procedure, thence the don't care bits, in the row header, are stored and retrieved with the same value, causing the record_compare to succeed (See Rows_log_event::write_row and Rows_log_event::find_row).

A suggested fix would be to replace the empty_record for restore_record inside the prepare_record function.
[20 Dec 2008 18:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/62185

2765 Luis Soares	2008-12-20
      Bug #39753: Replication failure on MIXED + bit + myisam + no PK
      
      When using mixed mode the record values stored inside the storage
      engine differed from the ones computed from the row event. This
      happened because the prepare_record function was calling
      empty_record macro causing some don't care bits to be left set.
      
      Replacing the empty_record with restore_record to restore the
      record default values fixes this (as suggested by Mats).
[10 Feb 2009 11:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65725

2768 Luis Soares	2009-02-10
      Bug #39753: Replication failure on MIXED + bit + myisam + no PK
            
      When using mixed mode the record values stored inside the storage
      engine differed from the ones computed from the row event. This
      happened because the prepare_record function was calling
      empty_record macro causing some don't care bits to be left set.
            
      Replacing the empty_record with restore_record to restore the
      record default values fixes this.
[17 Feb 2009 9:06] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/66589

2768 Luis Soares	2009-02-17
      Bug #39753: Replication failure on MIXED + bit + myisam + no PK
                  
      When using mixed mode the record values stored inside the storage
      engine differed from the ones computed from the row event. This
      happened because the prepare_record function was calling
      empty_record macro causing some don't care bits to be left set.
                  
      Replacing the empty_record plus explicitly setting defaults with 
      restore_record to restore the record default values fixes this.
      added:
        mysql-test/suite/rpl/r/rpl_mixed_bit_pk.result
        mysql-test/suite/rpl/t/rpl_mixed_bit_pk.test
      modified:
        sql/rpl_record.cc

=== added file 'mysql-test/suite/rpl/r/rpl_mixed_bit_pk.result'
--- a/mysql-test/suite/rpl/r/rpl_mixed_bit_pk.result	1970-01-01 00:00:00 +0000
+++ b/mysql-test/suite/rpl/r/rpl_mixed_bit_pk.result	2009-02-17 09:05:41 +0000
@@ -0,0 +1,32 @@
+stop slave;
+drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
+reset master;
+reset slave;
+drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
+start slave;
+DROP TABLE IF EXISTS t1,t2,t3,t4,t5,t6,t7,t8;
+CREATE TABLE t1 (`bit_key` bit, `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t2 (`bit_key` bit(4), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t3 (`bit_key` bit(7), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t4 (`bit_key` bit(8), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t5 (`bit_key` bit(9), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t6 (`bit_key` bit(14), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t7 (`bit_key` bit(15), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t8 (`bit_key` bit(16), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+INSERT INTO `t1` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t1` WHERE `bit` < 2 LIMIT 4;
+INSERT INTO `t2` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t2` WHERE `bit` < 2 LIMIT 4;
+INSERT INTO `t3` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t3` WHERE `bit` < 2 LIMIT 4;
+INSERT INTO `t4` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t4` WHERE `bit` < 2 LIMIT 4;
+INSERT INTO `t5` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t5` WHERE `bit` < 2 LIMIT 4;
+INSERT INTO `t6` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t6` WHERE `bit` < 2 LIMIT 4;
+INSERT INTO `t7` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t7` WHERE `bit` < 2 LIMIT 4;
+INSERT INTO `t8` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t8` WHERE `bit` < 2 LIMIT 4;
+DROP TABLE t1, t2, t3, t4, t5, t6, t7, t8;

=== added file 'mysql-test/suite/rpl/t/rpl_mixed_bit_pk.test'
--- a/mysql-test/suite/rpl/t/rpl_mixed_bit_pk.test	1970-01-01 00:00:00 +0000
+++ b/mysql-test/suite/rpl/t/rpl_mixed_bit_pk.test	2009-02-17 09:05:41 +0000
@@ -0,0 +1,78 @@
+#
+# BUG
+# ---
+#  BUG#39753: Replication failure on MIXED + bit + myisam + no PK
+#
+# Description
+# -----------
+#  Simple statements against a bit column cause failure in mixed-mode
+#  replication.
+#
+#  Implementation is as follows:
+#     i) A table with two bit fields is created. One of them is a key.
+#    ii) A record is inserted without specifying the key value.
+#   iii) The record is deleted using a where clause that matches it.
+#    iv) repeat i-iii) for bit key that has different size, generating
+#        different extra bits values 
+#     v) The slave is synchronized with master 
+#    vi) The table is dropped on master and the slave is re-synchronized
+#        with master.
+#
+#  Step v) made the bug evident before the patch, as the slave would 
+#  fail to find the correspondent row in its database (although it did
+#  the insert in step ii) ).
+#
+# Obs
+# --- 
+#  This test is based on the "how to repeat" section from the bug report.
+#
+#
+
+--source include/master-slave.inc
+
+--disable_warnings
+# setup
+
+DROP TABLE IF EXISTS t1,t2,t3,t4,t5,t6,t7,t8;
+CREATE TABLE t1 (`bit_key` bit, `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t2 (`bit_key` bit(4), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t3 (`bit_key` bit(7), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t4 (`bit_key` bit(8), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t5 (`bit_key` bit(9), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t6 (`bit_key` bit(14), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t7 (`bit_key` bit(15), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+CREATE TABLE t8 (`bit_key` bit(16), `bit` bit, key (`bit_key` )) ENGINE=MyISAM;
+
+# insert and delete
+INSERT INTO `t1` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t1` WHERE `bit` < 2 LIMIT 4;
+
+INSERT INTO `t2` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t2` WHERE `bit` < 2 LIMIT 4;
+
+INSERT INTO `t3` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t3` WHERE `bit` < 2 LIMIT 4;
+
+INSERT INTO `t4` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t4` WHERE `bit` < 2 LIMIT 4;
+
+INSERT INTO `t5` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t5` WHERE `bit` < 2 LIMIT 4;
+
+INSERT INTO `t6` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t6` WHERE `bit` < 2 LIMIT 4;
+
+INSERT INTO `t7` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t7` WHERE `bit` < 2 LIMIT 4;
+
+INSERT INTO `t8` ( `bit` ) VALUES ( 0 );
+DELETE FROM `t8` WHERE `bit` < 2 LIMIT 4;
+
+
+--enable_warnings
+sync_slave_with_master;
+
+# clean up 
+connection master;
+DROP TABLE t1, t2, t3, t4, t5, t6, t7, t8;
+sync_slave_with_master;

=== modified file 'sql/rpl_record.cc'
--- a/sql/rpl_record.cc	2008-02-05 13:52:20 +0000
+++ b/sql/rpl_record.cc	2009-02-17 09:05:41 +0000
@@ -297,21 +297,16 @@ unpack_row(Relay_log_info const *rli,
 /**
   Fills @c table->record[0] with default values.
 
-  First @c empty_record() is called and then, additionally, fields are
-  initialized explicitly with a call to @c set_default().
-
-  For optimization reasons, the explicit initialization can be skipped for
-  first @c skip fields. This is useful if later we are going to fill these 
-  fields from other source (e.g. from a Rows replication event).
-
-  If @c check is true, fields are explicitly initialized only if they have
-  default value or can be NULL. Otherwise error is reported.
+  First @c restore_record() is called to restore the default values for
+  record concerning the given table. Then, if @c check is true, 
+  a check is performed to see if fields are have default value or can 
+  be NULL. Otherwise error is reported.
  
   @param table  Table whose record[0] buffer is prepared. 
-  @param skip   Number of columns for which default value initialization 
+  @param skip   Number of columns for which default/nullable check 
                 should be skipped.
-  @param check  Indicates if errors should be checked when setting default
-                values.
+  @param check  Indicates if errors should be raised when checking 
+                default/nullable field properties.
                 
   @returns 0 on success or a handler level error code
  */ 
@@ -321,25 +316,28 @@ int prepare_record(TABLE *const table, 
   DBUG_ENTER("prepare_record");
 
   int error= 0;
-  empty_record(table);
+  restore_record(table, s->default_values);
 
-  if (skip >= table->s->fields)  // nothing to do
+  /*
+     This skip should be revisited in 6.0, because in 6.0 RBR one 
+     can have holes in the row (as the grain of the writeset is 
+     the column and not the entire row).
+   */
+  if (skip >= table->s->fields || !check)
     DBUG_RETURN(0);
 
-  /* Explicit initialization of fields */
+  /* Checking if exists default/nullable fields in the default values. */
 
   for (Field **field_ptr= table->field+skip ; *field_ptr ; ++field_ptr)
   {
     uint32 const mask= NOT_NULL_FLAG | NO_DEFAULT_VALUE_FLAG;
     Field *const f= *field_ptr;
 
-    if (check && ((f->flags & mask) == mask))
+    if (((f->flags & mask) == mask))
     {
       my_error(ER_NO_DEFAULT_FOR_FIELD, MYF(0), f->field_name);
       error = HA_ERR_ROWS_EVENT_APPLY;
     }
-    else
-      f->set_default();
   }
 
   DBUG_RETURN(error);

-- 
MySQL Code Commits Mailing List
For list archives: http://lists.mysql.com/commits
To unsubscribe:    http://lists.mysql.com/commits?unsub=commits@bugs.mysql.com
[5 Mar 2009 21:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/68426

3110 Luis Soares	2009-03-05 [merge]
      BUG#39753: Replication failure on MIXED + bit + myisam + no PK
      
      up-merge: 5.1-bt --> 6.0-bt
[5 Mar 2009 21:20] Luis Soares
Pushed to bugteam trees.
[13 Mar 2009 19:05] Bugs System
Pushed into 5.1.33 (revid:joro@sun.com-20090313111355-7bsi1hgkvrg8pdds) (version source revid:zhou.li@sun.com-20090311061050-ihp0g77znonq1tuq) (merge vers: 5.1.33) (pib:6)
[15 Mar 2009 9:09] Jon Stephens
Documented bugfix in the 5.1.33 changelog as follows:

        When using the MIXED replication format, UPDATE and DELETE
        statements that searched for rows where part of the key had
        nullable BIT columns failed. This occurred because operations
        that inserted the data were replicated as statements, but UPDATE
        and DELETE statements affecting the same data were replicated
        using row-based format.

        This issue did not occur when using statement-based replication
        (only) or row-based replication (only).

        See also Bug #39648.

Set NDI status pending merge to 6.0 tree.
[18 Mar 2009 13:17] Bugs System
Pushed into 6.0.11-alpha (revid:joro@sun.com-20090318122208-1b5kvg6zeb4hxwp9) (version source revid:matthias.leich@sun.com-20090310140952-gwtoq87wykhji3zi) (merge vers: 6.0.11-alpha) (pib:6)
[18 Mar 2009 15:19] Jon Stephens
Fix also documented in the 6.0.11 changelog; closed.
[9 May 2009 16:44] Bugs System
Pushed into 5.1.34-ndb-6.2.18 (revid:jonas@mysql.com-20090508185236-p9b3as7qyauybefl) (version source revid:jonas@mysql.com-20090508100057-30ote4xggi4nq14v) (merge vers: 5.1.33-ndb-6.2.18) (pib:6)
[9 May 2009 17:41] Bugs System
Pushed into 5.1.34-ndb-6.3.25 (revid:jonas@mysql.com-20090509063138-1u3q3v09wnn2txyt) (version source revid:jonas@mysql.com-20090508175813-s6yele2z3oh6o99z) (merge vers: 5.1.33-ndb-6.3.25) (pib:6)
[9 May 2009 18:38] Bugs System
Pushed into 5.1.34-ndb-7.0.6 (revid:jonas@mysql.com-20090509154927-im9a7g846c6u1hzc) (version source revid:jonas@mysql.com-20090509073226-09bljakh9eppogec) (merge vers: 5.1.33-ndb-7.0.6) (pib:6)