Bug #47927 Replication fails when executing random queries in server
Submitted: 8 Oct 2009 23:49 Modified: 6 Dec 2009 4:36
Reporter: Hema Sridharan Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:mysql-5.1, mysql-6.0 OS:Any
Assigned to: Luis Soares CPU Architecture:Any

[8 Oct 2009 23:49] Hema Sridharan
Description:
When executing random queries against master and slave server using RQG load, replication fails by giving error HA_ERR_END_OF_FILE.
Execute test using grammar file replication.yy and data generator replication_single_engine.zz

How to repeat:
perl runall.pl --basedir=/export/home/tmp/mysql5.1/mysql-5.1/ --grammar=/export/home/tmp/RQG/mysql-test-extra-6.0/mysql-test/gentest/conf/replication.yy --gendata=/export/home/tmp/RQG/mysql-test-extra-6.0/mysql-test/gentest/conf/replication_single_engine.zz --rpl_mode=row

Gives the following result and fails

# 01:33:52 Slave has stopped with error: Could not execute Delete_rows event on table test.table1_int_autoinc; Can't find record in 'table1_int_autoinc', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master-bin.000023, end_log_pos 14886

Observed the failure in mysql-5.1 as well as mysql-6.0 tree
[8 Oct 2009 23:50] Hema Sridharan
Please see similar BUG#39753
[9 Oct 2009 2:07] MySQL Verification Team
I couldn't repeat with mysql-6.0 tree which exactly source tree have you tried?. Thanks in advance.
[9 Oct 2009 2:36] Hema Sridharan
I used mysql-6.0-backup and mysql-5.1 tree
[9 Oct 2009 11:33] MySQL Verification Team
Thank you for the feedback. I was able to repeat with mysql-5.1 ( I tried before with mysql-6.0):

1_int_autoinc', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master-bin.000018, end_log_pos 14974
# 08:29:02 Slave has stopped with error: Could not execute Delete_rows event on table test.table1_int_autoinc; Can't find record in 'table1_int_autoinc', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master-bin.000018, end_log_pos 14974
# 08:29:02 Slave has stopped with error: Could not execute Delete_rows event on table test.table1_int_autoinc; Can't find record in 'table1_int_autoinc', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master-bin.000018, end_log_pos 14974
# 08:29:02 Killing child process with pid 12598...
[9 Oct 2009 11:41] MySQL Verification Team
Thank you for the bug report. Verified as described on Suse 11.2 64-bit.
[15 Oct 2009 10:08] Luis Soares
See also: BUG#43667, BUG#40278.
[6 Dec 2009 4:36] Luis Soares
Analysis
========

  1. I tried to repeat the failure in mysql-5.1, and was able to do it:

# 16:12:54 Slave has stopped with error: Could not execute Delete_rows event on table test.table1_int_autoinc; Can't find record in 'table1_int_autoinc', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master-bin.000022, end_log_pos 14317

  2. table1_int_autoinc is created with default engine (MyISAM), therefore
     is a non-transactional table.

  3. Searching master binlog for position 14317, I find a delete
     rows event for table table1_int_autoinc, as expected on
     behalf of thread_id=13. 

  4. Searching the master query log file I find that there is only one
     delete on table1_auto_inc issued on behalf of thread 13:
  
     13 Query     DELETE FROM `table1_int_autoinc` WHERE `int` > 0 LIMIT 1

     This instruction is preceded by:

     13 Query     SET AUTOCOMMIT=OFF
     13 Query     START TRANSACTION

     Furthermore, while thread 13 transaction context, is valid,
     there are other threads/connections changing table
     table1_int_autoinc and with autocommit=ON.

Conclusion
==========

  1. This is the same behavior described in BUG#40278 in
     general and in particular well documented in BUG#43667.

  2. Given 1., this bug shall be marked as a duplicate of
     BUG#40278. WL#2687 is the fix for this, and is currently pushed
     into tree mysql-5.1-rep+3. 

  3. I have tested in mysql-5.1-rep+3 and this problem with table
     table1_int_autoinc does not show up anymore. However other
     issues were found (some similar to this one, but with different
     root cause). These are tracked and should be fixed in: BUG#49479,
     BUG#49481 and BUG#49482.