Bug #56470 Stop slave at middle of the transaction can cause fatal error
Submitted: 1 Sep 2010 19:28
Modified: 10 Oct 2010 21:55
Reporter: Serge Kozlov
Status: No Feedback
Category: MySQL Server: Row Based Replication (RBR)
Severity: S1 (Critical)
Version: 5.6.99-m5
OS: Any
Assigned to:
CPU Architecture: Any
Tags: nuts, replication, slave, start, stop

[1 Sep 2010 19:28] Serge Kozlov
Description:
The slave is asked to stop in the middle of a transaction that mixes transactional and non-transactional tables. In some cases it stops with a fatal error because it cannot finish the current group of events for the non-transactional tables.
After that, the slave SQL thread reports various errors, such as "Could not parse relay log event entry" and "Could not execute Delete_rows".

The following excerpt from the slave error log shows where the bug occurred:

100901 22:03:24 [Note] Slave I/O thread exiting, read up to log 'master-bin.000001', position 2007312
100901 22:03:24 [Note] Error reading relay log event: slave SQL thread was killed
100901 22:03:24 [Note] Slave SQL thread initialized, starting replication in log 'master-bin.000001' at position 2007312, relay log './slave-relay-bin.000141' position: 151
100901 22:03:24 [Note] Slave I/O thread: connected to master 'root@127.0.0.1:13000',replication started in log 'master-bin.000001' at position 2007312
100901 22:03:24 [Note] Slave I/O thread exiting, read up to log 'master-bin.000001', position 2191653
100901 22:03:24 [Warning] Slave SQL: slave SQL thread is being stopped in the middle of applying of a group having updated a non-transaction table; waiting for the group completion ... , Error_code: 0
100901 22:03:26 [Warning] Slave SQL: slave SQL thread is being stopped in the middle of applying of a group having updated a non-transaction table; waiting for the group completion ... , Error_code: 0

... skipped ... (same warning as above/below)

100901 22:04:15 [Warning] Slave SQL: slave SQL thread is being stopped in the middle of applying of a group having updated a non-transaction table; waiting for the group completion ... , Error_code: 0
100901 22:04:17 [Warning] Slave SQL: slave SQL thread is being stopped in the middle of applying of a group having updated a non-transaction table; waiting for the group completion ... , Error_code: 0
100901 22:04:19 [Warning] Slave SQL: slave SQL thread is being stopped in the middle of applying of a group having updated a non-transaction table; waiting for the group completion ... , Error_code: 0
100901 22:04:21 [Warning] Slave SQL: slave SQL thread is being stopped in the middle of applying of a group having updated a non-transaction table; waiting for the group completion ... , Error_code: 0
100901 22:04:23 [Warning] Slave SQL: slave SQL thread is being stopped in the middle of applying of a group having updated a non-transaction table; waiting for the group completion ... , Error_code: 0
100901 22:04:25 [ERROR] Slave SQL: Fatal error: ... The slave SQL is stopped, leaving the current group of events unfinished with a non-transaction table changed. If the group consists solely of Row-based events, you can try restarting the slave with --slave-exec-mode=IDEMPOTENT, which ignores duplicate key, key not found, and similar errors (see documentation for details). Error_code: 1593
100901 22:04:25 [Note] Error reading relay log event: slave SQL thread was killed 
100901 22:04:25 [Warning] Slave SQL: slave SQL thread is being stopped in the middle of applying of a group having updated a non-transaction table; waiting for the group completion ... , Error_code: 0
100901 22:04:25 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: 1594
100901 22:04:25 [Warning] Slave SQL: slave SQL thread is being stopped in the middle of applying of a group having updated a non-transaction table; waiting for the group completion ... , Error_code: 0
100901 22:04:25 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'master-bin.000001' position 2046364
100901 22:04:25 [Note] Slave SQL thread initialized, starting replication in log 'master-bin.000001' at position 2046364, relay log './slave-relay-bin.000142' position: 39306
100901 22:04:25 [ERROR] Slave I/O: error connecting to master 'root@127.0.0.1:13000' - retry-time: 1  retries: 10, Error_code: 2003
100901 22:04:26 [Note] Slave I/O thread killed while connecting to master
100901 22:04:26 [Note] Slave I/O thread exiting, read up to log 'master-bin.000001', position 2191653
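
To make an excerpt like the one above easier to scan, a small script can count the entries per level and list the non-zero error codes. This is only a sketch: the line layout (date, time, bracketed level, message, optional "Error_code: N") is assumed from the excerpt itself, and the names summarize, LINE_RE, CODE_RE and SAMPLE are made up for illustration.

```python
import re
from collections import Counter

# Layout assumed from the excerpt: "YYMMDD HH:MM:SS [Level] message"
LINE_RE = re.compile(r"^(\d{6}) (\d{2}:\d{2}:\d{2}) \[(\w+)\] (.*)$")
CODE_RE = re.compile(r"Error_code: (\d+)")

# Three lines copied from the error log excerpt in this report.
SAMPLE = [
    "100901 22:03:24 [Note] Error reading relay log event: slave SQL thread was killed",
    "100901 22:03:24 [Warning] Slave SQL: slave SQL thread is being stopped in the middle "
    "of applying of a group having updated a non-transaction table; waiting for the group "
    "completion ... , Error_code: 0",
    "100901 22:04:25 [ERROR] Slave I/O: error connecting to master 'root@127.0.0.1:13000' "
    "- retry-time: 1  retries: 10, Error_code: 2003",
]


def summarize(lines):
    """Return (entries per level, list of (time, code) for non-zero error codes)."""
    levels = Counter()
    codes = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything that is not a log entry
        _date, time, level, message = match.groups()
        levels[level.upper()] += 1
        code = CODE_RE.search(message)
        if code and int(code.group(1)) != 0:
            codes.append((time, int(code.group(1))))
    return levels, codes


if __name__ == "__main__":
    levels, codes = summarize(SAMPLE)
    print(dict(levels))
    print(codes)
```

Run against the full excerpt, the non-zero codes would be the ones already visible in the log (1593, 1594 and 2003); the repeated "waiting for the group completion" warnings carry Error_code: 0 and are only counted.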

How to repeat:
Use the attached MTR test case.
It was created from a NUTS test case. The idea is to stop the slave (if it is running) or start it (if it is stopped) after every statement executed on the master.
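
The toggle idea can be sketched as follows. This is not the attached test case, only an illustration of the schedule it describes: the function name interleave_toggle and the placeholder statements are hypothetical, and the only real SQL used is STOP SLAVE and START SLAVE.

```python
def interleave_toggle(statements, slave_running=True):
    """Yield (target, sql) steps: each master statement is followed by a
    slave toggle, STOP SLAVE if it is running, START SLAVE if it is stopped."""
    for statement in statements:
        yield ("master", statement)
        if slave_running:
            yield ("slave", "STOP SLAVE")
        else:
            yield ("slave", "START SLAVE")
        slave_running = not slave_running


if __name__ == "__main__":
    # Hypothetical workload mixing a transactional and a non-transactional
    # table; the table names are placeholders, not taken from the test case.
    workload = [
        "BEGIN",
        "INSERT INTO t_transactional VALUES (1)",
        "INSERT INTO t_non_transactional VALUES (1)",
        "COMMIT",
    ]
    for target, sql in interleave_toggle(workload):
        print(f"{target}: {sql}")
```

With a schedule like this, a stop request can reach the slave while a group that has already changed a non-transactional table is still being applied, which is the situation the warnings in the error log describe.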
[10 Oct 2010 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".