API node error log:

2014-03-23 05:49:00 19322 [Note] NDB Binlog: Node: 10, down, Subscriber bitmask 00
2014-03-23 05:49:00 19322 [Note] NDB Binlog: cluster failure for ./mysql/ndb_schema at epoch 346797/0.
2014-03-23 05:49:00 19322 [Note] NDB Binlog: ndb tables initially read only on reconnect.
2014-03-23 05:49:03 19322 [Note] NDB Binlog: cluster failure for ./mysql/ndb_apply_status at epoch 346797/0.
2014-03-23 05:49:03 19322 [Note] Restarting Cluster Binlog
2014-03-23 05:49:12 19322 [Note] NDB Binlog: DISCOVER TABLE Event: REPL$mysql/ndb_schema
2014-03-23 05:49:12 19322 [Note] NDB Binlog: logging ./mysql/ndb_schema (UPDATED,USE_WRITE)
2014-03-23 05:49:12 19322 [Note] NDB Binlog: DISCOVER TABLE Event: REPL$mysql/ndb_apply_status
2014-03-23 05:49:12 19322 [Note] NDB Binlog: logging ./mysql/ndb_apply_status (UPDATED,USE_WRITE)
2014-03-23 05:49:13 19322 [Note] NDB: Cleaning stray tables from database 'information_schema'
2014-03-23 05:49:13 19322 [Note] NDB: Cleaning stray tables from database 'dov'
2014-03-23 05:49:13 19322 [Note] NDB: Cleaning stray tables from database 'ndb_10_fs'
2014-03-23 05:49:13 19322 [Note] NDB: Cleaning stray tables from database 'ndbinfo'
2014-03-23 05:49:13 19322 [Note] NDB: Cleaning stray tables from database 'performance_schema'
2014-03-23 05:49:13 19322 [Note] NDB: Cleaning stray tables from database 'test'
2014-03-23 05:49:13 19322 [Note] NDB: Cleaning stray tables from database 'world'
2014-03-23 05:49:13 19322 [Note] NDB Binlog: DISCOVER TABLE Event: REPL$mysql/ndb_index_stat_sample
2014-03-23 05:49:13 19322 [Note] NDB Binlog: DISCOVER TABLE Event: REPL$world/CountryLanguage
2014-03-23 05:49:13 19322 [Note] NDB Binlog: DISCOVER TABLE Event: REPL$mysql/ndb_index_stat_head
2014-03-23 05:49:13 19322 [Note] NDB Binlog: DISCOVER TABLE Event: REPL$world/City
2014-03-23 05:49:13 19322 [Note] NDB Binlog: DISCOVER TABLE Event: REPL$world/Country
2014-03-23 05:49:13 [NdbApi] INFO     -- Flushing incomplete GCI:s < 346801/0
2014-03-23 05:49:13 [NdbApi] INFO     -- Flushing incomplete GCI:s < 346801/0
2014-03-23 05:49:13 19322 [Note] NDB Binlog: starting log at epoch 346801/0
2014-03-23 05:49:13 19322 [Note] NDB Binlog: ndb tables writable
2014-03-23 05:49:13 19322 [Note] NDB Binlog: Node: 10, subscribe from node 200, Subscriber bitmask 00
2014-03-23 05:56:06 19322 [Note] NDB Binlog: Node: 10, unsubscribe from node 200, Subscriber bitmask 00
2014-03-23 05:56:11 19322 [Note] NDB Binlog: Node: 10, subscribe from node 200, Subscriber bitmask 00
2014-03-23 07:36:10 19322 [Note] NDB Binlog: Node: 10, down, Subscriber bitmask 00
2014-03-23 07:36:13 19322 [Note] NDB Binlog: cluster failure for ./mysql/ndb_schema at epoch 349789/0.
2014-03-23 07:36:13 19322 [Note] NDB Binlog: ndb tables initially read only on reconnect.
2014-03-23 07:36:28 19322 [Note] NDB Binlog: cluster failure for ./mysql/ndb_apply_status at epoch 349789/0.
2014-03-23 07:36:30 19322 [Note] Restarting Cluster Binlog


NDB node error log:

2014-03-23 03:45:59 [MgmtSrvr] INFO     -- Node 10: Local checkpoint 194 started. Keep GCI = 341711 oldest restorable GCI = 248433
2014-03-23 03:46:03 [MgmtSrvr] INFO     -- Node 10: Local checkpoint 194 completed
2014-03-23 04:41:25 [MgmtSrvr] INFO     -- Node 10: Local checkpoint 195 started. Keep GCI = 343352 oldest restorable GCI = 248433
2014-03-23 04:41:29 [MgmtSrvr] INFO     -- Node 10: Local checkpoint 195 completed
2014-03-23 04:48:30 [MgmtSrvr] WARNING  -- Node 10: Node 100 missed heartbeat 2
2014-03-23 04:50:16 [MgmtSrvr] WARNING  -- Node 10: Node 1 missed heartbeat 2
2014-03-23 05:38:02 [MgmtSrvr] INFO     -- Node 10: Local checkpoint 196 started. Keep GCI = 344992 oldest restorable GCI = 248433
2014-03-23 05:38:06 [MgmtSrvr] INFO     -- Node 10: Local checkpoint 196 completed
2014-03-23 05:48:54 [MgmtSrvr] ALERT    -- Node 1: Node 3 Disconnected
2014-03-23 05:49:00 [MgmtSrvr] ALERT    -- Node 1: Node 10 Disconnected
2014-03-23 05:49:02 [MgmtSrvr] INFO     -- Node 1: Node 3 Connected
2014-03-23 05:49:08 [MgmtSrvr] INFO     -- Node 1: Node 10 Connected
2014-03-23 05:49:09 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 2 seconds (no max lag)
2014-03-23 05:49:09 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 12 seconds (no max lag)
2014-03-23 05:49:09 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 22 seconds (no max lag)
2014-03-23 05:49:09 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_SAVE lag 180 seconds (no max lag)
2014-03-23 05:49:09 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 32 seconds (no max lag)
2014-03-23 05:49:09 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 42 seconds (no max lag)
2014-03-23 05:49:09 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 52 seconds (no max lag)
2014-03-23 05:49:09 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 62 seconds (no max lag)
2014-03-23 05:49:09 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 72 seconds (no max lag)
2014-03-23 05:49:09 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 82 seconds (no max lag)
2014-03-23 05:49:10 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 3 seconds (no max lag)
2014-03-23 05:49:10 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 13 seconds (no max lag)
2014-03-23 05:49:10 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 23 seconds (no max lag)
2014-03-23 05:49:10 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 33 seconds (no max lag)
2014-03-23 05:49:10 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 4 seconds (no max lag)
2014-03-23 05:49:10 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_SAVE lag 60 seconds (no max lag)
2014-03-23 05:49:10 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 14 seconds (no max lag)
2014-03-23 05:49:10 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 6 seconds (no max lag)
2014-03-23 05:49:10 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 16 seconds (no max lag)
2014-03-23 05:56:06 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 8 seconds (no max lag)
2014-03-23 05:56:06 [MgmtSrvr] ALERT    -- Node 10: Node 3 Disconnected
2014-03-23 05:56:06 [MgmtSrvr] INFO     -- Node 10: Lost arbitrator node 3 - process failure [state=6]
2014-03-23 05:56:06 [MgmtSrvr] INFO     -- Node 10: President restarts arbitration thread [state=1]
2014-03-23 05:56:06 [MgmtSrvr] INFO     -- Node 10: Communication to Node 3 closed
2014-03-23 05:56:06 [MgmtSrvr] ALERT    -- Node 10: Node 200 Disconnected
2014-03-23 05:56:06 [MgmtSrvr] INFO     -- Node 10: Communication to Node 200 closed
2014-03-23 05:56:06 [MgmtSrvr] ALERT    -- Node 1: Node 3 Disconnected
2014-03-23 05:56:06 [MgmtSrvr] INFO     -- Node 1: Node 3 Connected
2014-03-23 05:56:06 [MgmtSrvr] INFO     -- Node 10: Started arbitrator node 1 [ticket=341e000f30cc5b30]
2014-03-23 05:56:07 [MgmtSrvr] INFO     -- Node 10: Communication to Node 3 opened
2014-03-23 05:56:07 [MgmtSrvr] INFO     -- Node 10: Node 3 Connected
2014-03-23 05:56:07 [MgmtSrvr] INFO     -- Node 10: Node 3: API mysql-5.6.14 ndb-7.3.3
2014-03-23 05:56:10 [MgmtSrvr] INFO     -- Node 10: Communication to Node 200 opened
2014-03-23 05:56:10 [MgmtSrvr] INFO     -- Node 10: Node 200 Connected
2014-03-23 05:56:10 [MgmtSrvr] INFO     -- Node 10: Node 200: API mysql-5.6.14 ndb-7.3.3
2014-03-23 05:56:25 [MgmtSrvr] ALERT    -- Node 10: Node 3 Disconnected
2014-03-23 05:56:25 [MgmtSrvr] INFO     -- Node 10: Communication to Node 3 closed
2014-03-23 05:56:26 [MgmtSrvr] INFO     -- Node 10: Communication to Node 3 opened
2014-03-23 05:56:26 [MgmtSrvr] INFO     -- Node 10: Node 3 Connected
2014-03-23 05:56:26 [MgmtSrvr] INFO     -- Node 10: Node 3: API mysql-5.6.14 ndb-7.3.3
2014-03-23 06:35:22 [MgmtSrvr] INFO     -- Node 10: Local checkpoint 197 started. Keep GCI = 346626 oldest restorable GCI = 248433
2014-03-23 06:35:27 [MgmtSrvr] INFO     -- Node 10: Local checkpoint 197 completed
2014-03-23 07:36:27 [MgmtSrvr] ALERT    -- Node 1: Node 10 Disconnected
2014-03-23 07:36:28 [MgmtSrvr] ALERT    -- Node 1: Node 3 Disconnected
2014-03-23 07:38:08 [MgmtSrvr] ALERT    -- Node 10: Forced node shutdown completed. Caused by error 6050: 'WatchDog terminate, internal error or massive overload on the machine running this node(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.


This was consistent across all 3 management (MGM) nodes, both data nodes, and both API nodes.
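To line up the disconnects with the final watchdog shutdown, the management log can be reduced to a short summary. The following is only a minimal Python sketch, assuming the `[MgmtSrvr]` line layout shown above; the `summarize` function, the regular expressions, and the `SAMPLE` excerpt (a handful of lines copied from the log above) are illustrative names, not part of any MySQL Cluster tool.

```python
import re
from collections import Counter

# Assumed layout of a management-server log line, e.g.
# 2014-03-23 05:48:54 [MgmtSrvr] ALERT    -- Node 1: Node 3 Disconnected
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[MgmtSrvr\] (?P<level>\w+)\s+-- "
    r"Node (?P<reporter>\d+): (?P<msg>.*)$"
)
DISCONNECT_RE = re.compile(r"^Node (\d+) Disconnected$")
SHUTDOWN_RE = re.compile(r"^Forced node shutdown completed\. Caused by error (\d+)")


def summarize(lines):
    """Count 'Disconnected' reports per node and collect forced shutdowns."""
    disconnects = Counter()
    shutdowns = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a [MgmtSrvr] line in the assumed layout
        msg = m["msg"]
        d = DISCONNECT_RE.match(msg)
        if d:
            disconnects[int(d.group(1))] += 1
            continue
        s = SHUTDOWN_RE.match(msg)
        if s:
            # (timestamp, reporting node id, NDB error code)
            shutdowns.append((m["ts"], int(m["reporter"]), int(s.group(1))))
    return disconnects, shutdowns


# Excerpt copied from the log above.
SAMPLE = """\
2014-03-23 05:48:54 [MgmtSrvr] ALERT    -- Node 1: Node 3 Disconnected
2014-03-23 05:49:00 [MgmtSrvr] ALERT    -- Node 1: Node 10 Disconnected
2014-03-23 05:49:09 [MgmtSrvr] WARNING  -- Node 10: GCP Monitor: GCP_COMMIT lag 2 seconds (no max lag)
2014-03-23 05:56:06 [MgmtSrvr] ALERT    -- Node 10: Node 3 Disconnected
2014-03-23 05:56:06 [MgmtSrvr] ALERT    -- Node 10: Node 200 Disconnected
2014-03-23 07:36:27 [MgmtSrvr] ALERT    -- Node 1: Node 10 Disconnected
2014-03-23 07:38:08 [MgmtSrvr] ALERT    -- Node 10: Forced node shutdown completed. Caused by error 6050: 'WatchDog terminate, internal error or massive overload on the machine running this node(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
"""

disconnects, shutdowns = summarize(SAMPLE.splitlines())
for node_id, count in sorted(disconnects.items()):
    print(f"node {node_id}: {count} disconnect report(s)")
for ts, reporter, code in shutdowns:
    print(f"{ts}: node {reporter} forced shutdown, error {code}")
```

Running the same function over the full log from each management node would make it easier to check whether the disconnect pattern really matches on every host.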

Cluster configuration:

[ndbd default]
# Options affecting ndbd processes on all data nodes:
NoOfReplicas=2    # Number of replicas
DataMemory=80M    # How much memory to allocate for data storage
IndexMemory=18M   # How much memory to allocate for index storage
                  # For DataMemory and IndexMemory, we have used the
                  # default values. Since the "world" database takes up
                  # only about 500KB, this should be more than enough for
                  # this example Cluster setup.
LockPagesInMainMemory=1
ODirect=1


[tcp default]
# TCP/IP options:
portnumber=2202   # This is the default; however, you can use any
                  # port that is free for all the hosts in the cluster
                  # Note: It is recommended that you do not specify the port
                  # number at all and simply allow the default value to be used
                  # instead


[ndb_mgmd]
## Cluster MANAGEMENT NODE NASHVILLE
NodeId=1
hostname=nashville.server.com
datadir=/var/lib/mysql-cluster

[ndb_mgmd]
## Cluster MANAGEMENT NODE ATLANTA
NodeId=2
hostname=atlanta.server.com
datadir=/var/lib/mysql-cluster

[ndb_mgmd]
## Cluster MANAGEMENT NODE CHARLOTTE
NodeId=3
hostname=charlotte.server.com
datadir=/var/lib/mysql-cluster

[ndbd]
## Cluster DATA NODE 1 - NASHVILLE
NodeId=10
hostname=nashville.server.com
datadir=/var/lib/mysql/data

[ndbd]
## Cluster DATA NODE 2 - ATLANTA
NodeId=20
hostname=atlanta.server.com
datadir=/var/lib/mysql/data

[mysqld]
## Cluster SQL NODE 1 NASHVILLE
NodeId=100
hostname=nashville.server.com

[mysqld]
## Cluster SQL NODE 2 - ATLANTA
NodeId=200
hostname=atlanta.server.com

#[mysqld]
## Cluster SQL NODE CHARLOTTE
#NodeId=255
#hostname=charlotte.server.com
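
The node layout in this configuration is easier to see when grouped by host. Below is a minimal Python sketch, assuming the file format shown above; `parse_sections`, `nodes_by_host`, and `SAMPLE_CONFIG` (a trimmed copy of the node sections above) are hypothetical names used only for illustration. A small hand-written parser is used because the file repeats section names such as `[ndb_mgmd]`, which Python's `configparser` either rejects or merges.

```python
from collections import defaultdict


def parse_sections(text):
    """Return a list of (section_name, params) pairs, keeping repeated sections apart."""
    sections = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and commented-out entries
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1].lower(), {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            # Keys are lowercased here purely for convenient lookup.
            sections[-1][1][key.strip().lower()] = value.strip()
    return sections


def nodes_by_host(text):
    """Group (node type, NodeId) pairs by hostname."""
    hosts = defaultdict(list)
    for name, params in parse_sections(text):
        if "hostname" in params and "nodeid" in params:
            hosts[params["hostname"]].append((name, int(params["nodeid"])))
    return dict(hosts)


# Trimmed copy of the node sections above (defaults and datadir lines omitted).
SAMPLE_CONFIG = """\
[ndb_mgmd]
NodeId=1
hostname=nashville.server.com
[ndb_mgmd]
NodeId=2
hostname=atlanta.server.com
[ndb_mgmd]
NodeId=3
hostname=charlotte.server.com
[ndbd]
NodeId=10
hostname=nashville.server.com
[ndbd]
NodeId=20
hostname=atlanta.server.com
[mysqld]
NodeId=100
hostname=nashville.server.com
[mysqld]
NodeId=200
hostname=atlanta.server.com
#[mysqld]
#NodeId=255
#hostname=charlotte.server.com
"""

for host, nodes in sorted(nodes_by_host(SAMPLE_CONFIG).items()):
    print(host, nodes)
```

With this configuration the grouping shows that the Nashville and Atlanta hosts each run a management node, a data node, and an SQL node, while the Charlotte host runs only a management node.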