MySQL Bugs: #10709: Cluster crashed if disks are out of space

Bug #10709	Cluster crashed if disks are out of space
Submitted:	18 May 2005 14:09	Modified:	20 Jul 2005 6:35
Reporter:	Jonathan Miller
Status:	Won't fix
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	4.1,5.0,5.1-Alpha	OS:	Linux (Linux)
Assigned to:	Assigned Account	CPU Architecture:	Any

Description:
Testing cluster replication. The slave cluster ran out of disk space, causing the cluster to crash.

Stack trace from MySQLD:
0x816cee8 handle_segfault + 392
0x4005e5cd _end + 934600377
0x837b96c _ZN14NdbTransaction15getNdbOperationEPK12NdbTableImplP12NdbOperation + 12
0x837ba4b _ZN14NdbTransaction15getNdbOperationEPKN13NdbDictionary5TableE + 27
0x82157bb _ZN13ha_ndbcluster9write_rowEPc + 211
0x82047c5 _ZN7handler12ha_write_rowEPc + 25
0x81dee4f _ZN14Rows_log_event10exec_eventEP17st_relay_log_info + 627
0x824ac8a _Z20exec_relay_log_eventP3THDP17st_relay_log_info + 578
0x8248e33 handle_slave_sql + 1015
0x400586de _end + 934576074
0x401d86c7 _end + 936148915

From the cluster error log:
2005-05-18 08:48:36 [MgmSrvr] ALERT    -- Node 3: Node 2 Disconnected
2005-05-18 08:48:36 [MgmSrvr] ALERT    -- Node 5: Node 2 Disconnected
2005-05-18 08:48:36 [MgmSrvr] ALERT    -- Node 4: Node 2 Disconnected
2005-05-18 08:48:36 [MgmSrvr] INFO     -- Node 3: Communication to Node 2 closed
2005-05-18 08:48:36 [MgmSrvr] INFO     -- Node 4: Communication to Node 2 closed
2005-05-18 08:48:36 [MgmSrvr] INFO     -- Node 5: Communication to Node 2 closed
2005-05-18 08:48:36 [MgmSrvr] ALERT    -- Node 1: Node 2 Disconnected
2005-05-18 08:48:37 [MgmSrvr] ALERT    -- Node 5: Node 4 Disconnected
2005-05-18 08:48:37 [MgmSrvr] ALERT    -- Node 3: Node 4 Disconnected
2005-05-18 08:48:37 [MgmSrvr] INFO     -- Node 3: Possible bug in Dbdih::execBLOCK_COMMIT_ORD c_blockCommit = 1 c_blockCommitNo = 2 sig->failNo =
2005-05-18 08:48:37 [MgmSrvr] INFO     -- Node 3: Communication to Node 2 closed
2005-05-18 08:48:37 [MgmSrvr] INFO     -- Node 3: Communication to Node 4 closed
2005-05-18 08:48:37 [MgmSrvr] INFO     -- Node 5: Possible bug in Dbdih::execBLOCK_COMMIT_ORD c_blockCommit = 1 c_blockCommitNo = 2 sig->failNo =
2005-05-18 08:48:37 [MgmSrvr] INFO     -- Node 5: Communication to Node 2 closed
2005-05-18 08:48:37 [MgmSrvr] INFO     -- Node 5: Communication to Node 4 closed
2005-05-18 08:48:37 [MgmSrvr] ALERT    -- Node 1: Node 4 Disconnected
2005-05-18 08:48:38 [MgmSrvr] ALERT    -- Node 1: Node 3 Disconnected
2005-05-18 08:48:39 [MgmSrvr] ALERT    -- Node 1: Node 5 Disconnected

How to repeat:
Set up two clusters, with one replicating to the other. Using the bank test, run the slave out of disk space.

Suggested fix:
The cluster should remain up. It should abort or roll back any uncommitted transactions and refuse new transactions until the disk space issue is corrected.
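
The behaviour requested above can be illustrated with a small, self-contained sketch. This is not NDB code and does not reflect how the data nodes are implemented; `GuardedStore`, `DiskFullError`, and the `min_free_bytes` threshold are hypothetical names invented for this illustration. It only shows the intended policy: check free space before accepting work, discard uncommitted work when space runs out, and accept work again once space is back.

```python
import shutil
from types import SimpleNamespace


class DiskFullError(Exception):
    """Raised when work is refused because free disk space is too low."""


class GuardedStore:
    """Toy in-memory store (hypothetical, not NDB) that stays up on low disk."""

    def __init__(self, path, min_free_bytes, usage_fn=shutil.disk_usage):
        self.path = path
        self.min_free_bytes = min_free_bytes
        # usage_fn is injectable so a full disk can be simulated safely.
        self.usage_fn = usage_fn
        self.committed = []
        self.pending = []

    def has_space(self):
        # shutil.disk_usage returns a tuple with total, used and free bytes.
        return self.usage_fn(self.path).free >= self.min_free_bytes

    def _refuse(self):
        # Roll back uncommitted work instead of crashing, then report the error.
        self.pending.clear()
        raise DiskFullError(f"less than {self.min_free_bytes} bytes free on {self.path}")

    def write(self, row):
        if not self.has_space():
            self._refuse()
        self.pending.append(row)

    def commit(self):
        if not self.has_space():
            self._refuse()
        self.committed.extend(self.pending)
        self.pending.clear()


if __name__ == "__main__":
    free = {"bytes": 10_000}  # simulated free space
    store = GuardedStore("/", 1_000, usage_fn=lambda _p: SimpleNamespace(free=free["bytes"]))
    store.write("row 1")
    store.commit()
    store.write("row 2")
    free["bytes"] = 10  # the disk fills up before the commit
    try:
        store.commit()
    except DiskFullError as exc:
        print("refused:", exc)
    free["bytes"] = 10_000  # space is freed, so work is accepted again
    store.write("row 3")
    store.commit()
    print(store.committed)
```

In this sketch the store keeps running after the refusal, and only the rows committed while space was available remain.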

I was able to reproduce this using 5.0.3 & FC2

Another bug report has been opened on this issue:
MySQL Bugs: #11130: crash of datanodes on disk full

This bug is not scheduled to be fixed at this time.