Bug #25650 Slave is stopped because of errors if one data node is restarted on master
Submitted: 16 Jan 2007 14:52
Modified: 23 Jan 2007 7:10
Reporter: Serge Kozlov
Status: Not a Bug
Category: MySQL Cluster: Replication
Severity: S2 (Serious)
Version: 5.1.15-bk
OS:
Assigned to:
CPU Architecture: Any

[16 Jan 2007 14:52] Serge Kozlov
Description:
The slave and master clusters each have two data nodes and two replicas (one node = one replica). An external Perl script runs in a loop, inserting many rows and then changing the table structure. While the script is running, one data node (non-master) is stopped and then started again.

The following errors appear on the slave (mysqld error log), and then the slave stops:

070116 15:26:52 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log './ndb17-relay-bin.000001' position: 4
070116 15:26:52 [Note] Slave I/O thread: connected to master 'repl@ndb16:3306',replication started in log 'FIRST' at position 4
070116 15:30:58 [ERROR] Slave: Error in Delete_rows event: row application failed, Error_code: 233
070116 15:30:58 [ERROR] Slave: Error in Delete_rows event: error during transaction execution on table test.t2, Error_code: 233
070116 15:31:19 [ERROR] Slave: Error in Delete_rows event: row application failed, Error_code: 233
070116 15:31:19 [ERROR] Slave: Error in Delete_rows event: error during transaction execution on table test.t2, Error_code: 233
070116 15:31:43 [ERROR] Slave: Error in Delete_rows event: row application failed, Error_code: 233
070116 15:31:43 [ERROR] Slave: Error in Delete_rows event: error during transaction execution on table test.t2, Error_code: 233
070116 15:32:08 [ERROR] Slave: Error in Delete_rows event: row application failed, Error_code: 233
070116 15:32:08 [ERROR] Slave: Error in Delete_rows event: error during transaction execution on table test.t2, Error_code: 233
070116 15:32:33 [ERROR] Slave: Error in Delete_rows event: row application failed, Error_code: 233
070116 15:32:33 [ERROR] Slave: Error in Delete_rows event: error during transaction execution on table test.t2, Error_code: 233
070116 15:32:59 [ERROR] Slave: Error in Delete_rows event: row application failed, Error_code: 233
070116 15:32:59 [ERROR] Slave: Error in Delete_rows event: error during transaction execution on table test.t2, Error_code: 233
070116 15:33:26 [ERROR] Slave: Error in Delete_rows event: row application failed, Error_code: 233
070116 15:33:26 [ERROR] Slave: Error in Delete_rows event: error during transaction execution on table test.t2, Error_code: 233
070116 15:33:54 [ERROR] Slave: Error in Delete_rows event: row application failed, Error_code: 233
070116 15:33:54 [ERROR] Slave: Error in Delete_rows event: error during transaction execution on table test.t2, Error_code: 233
070116 15:34:21 [ERROR] Slave: Error in Delete_rows event: row application failed, Error_code: 233
070116 15:34:21 [ERROR] Slave: Error in Delete_rows event: error during transaction execution on table test.t2, Error_code: 233
070116 15:34:48 [ERROR] Slave: Error in Delete_rows event: row application failed, Error_code: 233
070116 15:34:48 [ERROR] Slave: Error in Delete_rows event: error during transaction execution on table test.t2, Error_code: 233
070116 15:35:15 [ERROR] Slave: Error in Delete_rows event: row application failed, Error_code: 233
070116 15:35:15 [ERROR] Slave: Error in Delete_rows event: error during transaction execution on table test.t2, Error_code: 233
070116 15:35:15 [ERROR] Slave SQL thread retried transaction 10 time(s) in vain, giving up. Consider raising the value of the slave_transaction_retries variable.
070116 15:35:15 [ERROR] Slave (additional info): Unknown error Error_code: 1105
070116 15:35:15 [Warning] Slave: Unknown error Error_code: 1105
070116 15:35:15 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000001' position 8978607

How to repeat:
1. Both clusters have the same config (see attached file).
2. Start the clusters.
3. Configure and start replication between the clusters.
4. Run the external script:
./sqe.pl -q aa6.ttx -p -p=127.0.0.1:3306:root::test
5. Wait until the script is changing table t2 (add column / copy rows from t1 / drop column / delete rows).
6. Stop a data node on the master cluster.
7. Start the data node on the master cluster.
8. Look at error.log on the slave mysqld.
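
For step 8, a small script can make the check on the slave's error log less manual. The following is only a sketch and not part of the attached test tooling: the function names `join_wrapped` and `summarize` are hypothetical, and it assumes each log entry starts with a `YYMMDD HH:MM:SS` timestamp, as in the excerpt above. It rejoins entries that were hard-wrapped by a terminal, counts the `Error_code` values reported in `[ERROR]` entries, and notes whether the SQL thread gave up after exhausting its retries.

```python
import re
from collections import Counter

# Every entry in the excerpt starts with a timestamp such as "070116 15:30:58 ".
ENTRY_START = re.compile(r"^\d{6} \d{2}:\d{2}:\d{2} ")
ERROR_CODE = re.compile(r"Error_code: (\d+)")


def join_wrapped(lines):
    """Rejoin log entries that were hard-wrapped (for example at 80 columns)."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            # Continuation of the previous entry: append as-is, since the
            # wrap happened mid-line and may have split a word.
            entries[-1] += line
    return entries


def summarize(lines):
    """Return (Counter of error codes in [ERROR] entries, gave_up flag)."""
    codes = Counter()
    gave_up = False
    for entry in join_wrapped(lines):
        if "[ERROR]" not in entry:
            continue
        match = ERROR_CODE.search(entry)
        if match:
            codes[int(match.group(1))] += 1
        if "in vain" in entry:
            gave_up = True  # SQL thread exhausted its transaction retries
    return codes, gave_up
```

To use it, pass the lines of the slave's error log, for example `summarize(open(path))`, where `path` is wherever that mysqld instance writes its error log (an assumption; the location depends on the server configuration).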
[16 Jan 2007 14:58] Serge Kozlov
trace, log files, config.ini, perl script

Attachment: bug25650.tar.gz (application/x-gzip, text), 19.17 KiB.