Bug #25613 Replication is interrupted if one data node (R=2) was stopped on slave cluster
Submitted: 14 Jan 2007 14:35
Modified: 26 Feb 2007 17:08
Reporter: Serge Kozlov
Status: Not a Bug
Category: MySQL Cluster: Replication
Severity: S2 (Serious)
Version: 5.1.15-bk
OS: Linux (Linux FC4)
Assigned to: Serge Kozlov
CPU Architecture: Any

[14 Jan 2007 14:35] Serge Kozlov
Description:
The slave cluster has two data nodes and two replicas (one node per replica). One data node was stopped before replication started, but the cluster can still work properly because it has two replicas. Replication was then started on the slave, and an external script created a few ndb_dd tables and inserted millions of rows.

Sometimes the following errors appear on the slave during replication:

070114 15:03:50 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log './ndb17-relay-bin.000001' position: 4
070114 15:03:50 [Note] Slave I/O thread: connected to master 'repl@ndb16:3306',replication started in log 'FIRST' at position 4
070114 15:11:05 [ERROR] Slave: Error in Write_rows event: error during transaction execution on table test.t2, Error_code: 410
070114 15:11:05 [ERROR] Slave: Error in Write_rows event: error during transaction execution on table test.t2, Error_code: 410
070114 15:11:06 [ERROR] Slave: Error in Write_rows event: error during transaction execution on table test.t2, Error_code: 410
070114 15:11:08 [ERROR] Slave: Error in Write_rows event: error during transaction execution on table test.t2, Error_code: 410
070114 15:11:11 [ERROR] Slave: Error in Write_rows event: error during transaction execution on table test.t2, Error_code: 410
070114 15:11:15 [ERROR] Slave: Error in Write_rows event: error during transaction execution on table test.t2, Error_code: 410
070114 15:11:20 [ERROR] Slave: Error in Write_rows event: error during transaction execution on table test.t2, Error_code: 410
070114 15:11:25 [ERROR] Slave: Error in Write_rows event: error during transaction execution on table test.t2, Error_code: 410
070114 15:11:30 [ERROR] Slave: Error in Write_rows event: error during transaction execution on table test.t2, Error_code: 410
070114 15:11:35 [ERROR] Slave: Error in Write_rows event: error during transaction execution on table test.t2, Error_code: 410
070114 15:11:40 [ERROR] Slave: Error in Write_rows event: error during transaction execution on table test.t2, Error_code: 410
070114 15:11:40 [ERROR] Slave SQL thread retried transaction 10 time(s) in vain, giving up. Consider raising the value of the slave_transaction_retries variable.
070114 15:11:40 [ERROR] Slave (additional info): Unknown error Error_code: 1105
070114 15:11:40 [Warning] Slave: Got temporary error 410 'REDO log files overloaded, consult online manual (decrease TimeBetweenLocalCheckpoints, and|or incre' from NDB Error_code: 1297
070114 15:11:40 [Warning] Slave: Unknown error Error_code: 1105
070114 15:11:40 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000001' position 44071145
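
To make the retry pattern in the log above easier to see, here is a minimal Python sketch that extracts the timestamps of the Error_code 410 lines and computes the gaps between consecutive attempts. The function name `retry_gaps` and the way the sample log is rebuilt from a list of times are illustrative assumptions, not part of any MySQL tool; the timestamps themselves are copied from the log. It only reports what the log shows and makes no claim about how the slave SQL thread schedules its retries internally.

```python
import re
from datetime import datetime

# Timestamps of the Error_code 410 lines, copied from the error log above.
TIMES = ["15:11:05", "15:11:05", "15:11:06", "15:11:08", "15:11:11", "15:11:15",
         "15:11:20", "15:11:25", "15:11:30", "15:11:35", "15:11:40"]

# Rebuild the (unwrapped) log lines so the example is self-contained.
LOG = "\n".join(
    f"070114 {t} [ERROR] Slave: Error in Write_rows event: error during "
    "transaction execution on table test.t2, Error_code: 410"
    for t in TIMES
)

# Matches only the Write_rows failures with NDB error 410; the leading
# "YYMMDD HH:MM:SS" prefix is the timestamp format seen in this report.
PATTERN = re.compile(
    r"^(\d{6} \d{2}:\d{2}:\d{2}) \[ERROR\] Slave: Error in Write_rows event: "
    r".*Error_code: 410$"
)

def retry_gaps(log_text):
    """Return the seconds between consecutive Error_code 410 lines."""
    stamps = []
    for line in log_text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            stamps.append(datetime.strptime(match.group(1), "%y%m%d %H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]

gaps = retry_gaps(LOG)
print(len(gaps), gaps)  # 10 [0, 1, 2, 3, 4, 5, 5, 5, 5, 5]
```

The eleven error lines correspond to one initial failure plus ten further attempts, which agrees with the "retried transaction 10 time(s) in vain" message. The whole retry window spans about 35 seconds, so the temporary REDO log overload did not clear in that time.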

How to repeat:
1. Start master cluster.
2. Start slave cluster.
3. Stop one data node on slave.
4. Start replication.
5. Run external script from attached file:
./sqe.pl -q aa.txt -p=ndb16:3306:root::test
6. Observe the error log of mysqld server on slave.