Bug #10709 Cluster crashed if disks are out of space
Submitted: 18 May 2005 14:09
Modified: 20 Jul 2005 6:35
Reporter: Jonathan Miller
Status: Won't fix
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 4.1, 5.0, 5.1-Alpha
OS: Linux (Linux)
Assigned to: Assigned Account
CPU Architecture: Any

[18 May 2005 14:09] Jonathan Miller
Description:
While testing cluster replication, the slave cluster ran out of disk space, causing the cluster to crash.

Stack trace from MySQLD:
0x816cee8 handle_segfault + 392
0x4005e5cd _end + 934600377
0x837b96c _ZN14NdbTransaction15getNdbOperationEPK12NdbTableImplP12NdbOperation + 12
0x837ba4b _ZN14NdbTransaction15getNdbOperationEPKN13NdbDictionary5TableE + 27
0x82157bb _ZN13ha_ndbcluster9write_rowEPc + 211
0x82047c5 _ZN7handler12ha_write_rowEPc + 25
0x81dee4f _ZN14Rows_log_event10exec_eventEP17st_relay_log_info + 627
0x824ac8a _Z20exec_relay_log_eventP3THDP17st_relay_log_info + 578
0x8248e33 handle_slave_sql + 1015
0x400586de _end + 934576074
0x401d86c7 _end + 936148915

From the cluster error log:
2005-05-18 08:48:36 [MgmSrvr] ALERT    -- Node 3: Node 2 Disconnected
2005-05-18 08:48:36 [MgmSrvr] ALERT    -- Node 5: Node 2 Disconnected
2005-05-18 08:48:36 [MgmSrvr] ALERT    -- Node 4: Node 2 Disconnected
2005-05-18 08:48:36 [MgmSrvr] INFO     -- Node 3: Communication to Node 2 closed
2005-05-18 08:48:36 [MgmSrvr] INFO     -- Node 4: Communication to Node 2 closed
2005-05-18 08:48:36 [MgmSrvr] INFO     -- Node 5: Communication to Node 2 closed
2005-05-18 08:48:36 [MgmSrvr] ALERT    -- Node 1: Node 2 Disconnected
2005-05-18 08:48:37 [MgmSrvr] ALERT    -- Node 5: Node 4 Disconnected
2005-05-18 08:48:37 [MgmSrvr] ALERT    -- Node 3: Node 4 Disconnected
2005-05-18 08:48:37 [MgmSrvr] INFO     -- Node 3: Possible bug in Dbdih::execBLOCK_COMMIT_ORD c_blockCommit = 1 c_blockCommitNo = 2 sig->failNo =
2005-05-18 08:48:37 [MgmSrvr] INFO     -- Node 3: Communication to Node 2 closed
2005-05-18 08:48:37 [MgmSrvr] INFO     -- Node 3: Communication to Node 4 closed
2005-05-18 08:48:37 [MgmSrvr] INFO     -- Node 5: Possible bug in Dbdih::execBLOCK_COMMIT_ORD c_blockCommit = 1 c_blockCommitNo = 2 sig->failNo =
2005-05-18 08:48:37 [MgmSrvr] INFO     -- Node 5: Communication to Node 2 closed
2005-05-18 08:48:37 [MgmSrvr] INFO     -- Node 5: Communication to Node 4 closed
2005-05-18 08:48:37 [MgmSrvr] ALERT    -- Node 1: Node 4 Disconnected
2005-05-18 08:48:38 [MgmSrvr] ALERT    -- Node 1: Node 3 Disconnected
2005-05-18 08:48:39 [MgmSrvr] ALERT    -- Node 1: Node 5 Disconnected

How to repeat:
Set up two clusters, with one replicating to the other. Using the bank test, run the slave out of disk space.

Suggested fix:
Cluster should remain up. Cluster should abort or roll back any uncommitted transaction and refuse to do any transactions until the disk space issue is corrected.
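To illustrate the behaviour requested above, here is a minimal sketch in Python. It is not NDB code and says nothing about how the data nodes are implemented; GuardedStore, OutOfDiskSpace, min_free_bytes and usage_fn are hypothetical names made up for this example. The only real API used is shutil.disk_usage from the Python standard library.

```python
import shutil


class OutOfDiskSpace(Exception):
    """Raised instead of crashing when the data directory is nearly full."""


class GuardedStore:
    """Toy row store (hypothetical) that rejects work while disk space is low."""

    def __init__(self, data_dir, min_free_bytes, usage_fn=shutil.disk_usage):
        self.data_dir = data_dir
        self.min_free_bytes = min_free_bytes
        self.usage_fn = usage_fn   # injectable so the check can be faked in tests
        self.committed = []        # rows that survived a commit
        self.pending = []          # rows of the current, uncommitted transaction

    def _check_space(self):
        free = self.usage_fn(self.data_dir).free
        if free < self.min_free_bytes:
            # Roll back the uncommitted transaction and refuse the request,
            # but keep the store itself running.
            self.pending.clear()
            raise OutOfDiskSpace(f"only {free} bytes free in {self.data_dir}")

    def write_row(self, row):
        self._check_space()
        self.pending.append(row)

    def commit(self):
        self._check_space()
        self.committed.extend(self.pending)
        self.pending.clear()
```

With this shape, a write arriving while the disk is full raises OutOfDiskSpace and discards the pending rows, committed rows stay intact, and writes are accepted again once the free space is back above the threshold.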
[2 Jun 2005 15:24] Jorge del Conde
I was able to reproduce this using 5.0.3 & FC2
[8 Jun 2005 12:33] Jonathan Miller
Another bug report has been opened on this:
MySQL Bugs: #11130: crash of datanodes on disk full
[13 Mar 2014 13:33] Omer Barnir
This bug is not scheduled to be fixed at this time.