Description:
Testing cluster replication. The slave cluster ran out of disk space, causing the cluster to crash.
Stack trace from MySQLD:
0x816cee8 handle_segfault + 392
0x4005e5cd _end + 934600377
0x837b96c _ZN14NdbTransaction15getNdbOperationEPK12NdbTableImplP12NdbOperation + 12
0x837ba4b _ZN14NdbTransaction15getNdbOperationEPKN13NdbDictionary5TableE + 27
0x82157bb _ZN13ha_ndbcluster9write_rowEPc + 211
0x82047c5 _ZN7handler12ha_write_rowEPc + 25
0x81dee4f _ZN14Rows_log_event10exec_eventEP17st_relay_log_info + 627
0x824ac8a _Z20exec_relay_log_eventP3THDP17st_relay_log_info + 578
0x8248e33 handle_slave_sql + 1015
0x400586de _end + 934576074
0x401d86c7 _end + 936148915
From the cluster error log:
2005-05-18 08:48:36 [MgmSrvr] ALERT -- Node 3: Node 2 Disconnected
2005-05-18 08:48:36 [MgmSrvr] ALERT -- Node 5: Node 2 Disconnected
2005-05-18 08:48:36 [MgmSrvr] ALERT -- Node 4: Node 2 Disconnected
2005-05-18 08:48:36 [MgmSrvr] INFO -- Node 3: Communication to Node 2 closed
2005-05-18 08:48:36 [MgmSrvr] INFO -- Node 4: Communication to Node 2 closed
2005-05-18 08:48:36 [MgmSrvr] INFO -- Node 5: Communication to Node 2 closed
2005-05-18 08:48:36 [MgmSrvr] ALERT -- Node 1: Node 2 Disconnected
2005-05-18 08:48:37 [MgmSrvr] ALERT -- Node 5: Node 4 Disconnected
2005-05-18 08:48:37 [MgmSrvr] ALERT -- Node 3: Node 4 Disconnected
2005-05-18 08:48:37 [MgmSrvr] INFO -- Node 3: Possible bug in Dbdih::execBLOCK_COMMIT_ORD c_blockCommit = 1 c_blockCommitNo = 2 sig->failNo =
2005-05-18 08:48:37 [MgmSrvr] INFO -- Node 3: Communication to Node 2 closed
2005-05-18 08:48:37 [MgmSrvr] INFO -- Node 3: Communication to Node 4 closed
2005-05-18 08:48:37 [MgmSrvr] INFO -- Node 5: Possible bug in Dbdih::execBLOCK_COMMIT_ORD c_blockCommit = 1 c_blockCommitNo = 2 sig->failNo =
2005-05-18 08:48:37 [MgmSrvr] INFO -- Node 5: Communication to Node 2 closed
2005-05-18 08:48:37 [MgmSrvr] INFO -- Node 5: Communication to Node 4 closed
2005-05-18 08:48:37 [MgmSrvr] ALERT -- Node 1: Node 4 Disconnected
2005-05-18 08:48:38 [MgmSrvr] ALERT -- Node 1: Node 3 Disconnected
2005-05-18 08:48:39 [MgmSrvr] ALERT -- Node 1: Node 5 Disconnected
How to repeat:
Set up two clusters, with one replicating to the other. Using the bank test, run the slave out of disk space.
Suggested fix:
The cluster should remain up. It should abort or roll back any uncommitted transactions and refuse new transactions until the disk space issue is corrected.
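To illustrate the behavior requested above, here is a minimal sketch in Python. It is not NDB or mysqld code; the class `DiskSpaceGuard`, the exception `DiskFullError`, and the `min_free_bytes` threshold are all hypothetical names invented for this example. It only shows the intended control flow: stay up, roll back whatever is uncommitted, and refuse work until space is available again.

```python
import shutil


class DiskFullError(Exception):
    """Raised when a transaction is refused because free disk space is low."""


class DiskSpaceGuard:
    """Hypothetical helper (not part of MySQL Cluster) that models the
    suggested behavior: never crash, roll back, refuse until space returns."""

    def __init__(self, path, min_free_bytes, free_bytes_fn=None):
        self.min_free_bytes = min_free_bytes
        # Injectable for testing; defaults to the real free space on `path`.
        self._free_bytes = free_bytes_fn or (lambda: shutil.disk_usage(path).free)
        self.pending = []  # ids of uncommitted transactions

    def has_space(self):
        return self._free_bytes() >= self.min_free_bytes

    def begin(self, txn_id):
        # Refuse new transactions while disk space is below the threshold.
        if not self.has_space():
            raise DiskFullError("refusing transaction: disk space too low")
        self.pending.append(txn_id)

    def commit(self, txn_id):
        if not self.has_space():
            # Roll back every uncommitted transaction instead of crashing.
            self.pending.clear()
            raise DiskFullError("rolled back uncommitted transactions")
        self.pending.remove(txn_id)
```

Once free space rises above the threshold again, `begin` and `commit` succeed without a restart, which is the "remain up" part of the suggestion.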