Description:
Data Node crashed, and can not startup.
How to repeat:
----------------------------------------------------------------------
Crash: Data Node Log:
node: 3 tab: 2642 frag: 1 no lcp to restore
node: 3 tab: 2706 frag: 0 no lcp to restore
node: 3 tab: 2706 frag: 1 no lcp to restore
node: 3 tab: 2735 frag: 0 no lcp to restore
node: 3 tab: 2735 frag: 1 no lcp to restore
execSTART_RECREQ chaning srnodes from 000000000000000c to 0000000000000008
alloc_chunk(15894 16) -
2014-10-18 11:24:49 [ndbd] INFO -- Start phase 5 completed
2014-10-18 11:24:49 [ndbd] INFO -- Start phase 6 completed
2014-10-18 11:24:49 [ndbd] INFO -- Start phase 7 completed
2014-10-18 11:24:54 [ndbd] INFO -- Start phase 8 completed
2014-10-18 11:24:54 [ndbd] INFO -- Start phase 9 completed
2014-10-18 11:24:54 [ndbd] INFO -- Start phase 100 completed
2014-10-18 11:24:55 [ndbd] INFO -- Suma: handover from node 2 gci: 10199688 buckets: 00000002 (2)
10199688/0 (10199687/4294967295) switchover complete bucket 1 state: 1starting
2014-10-18 11:24:58 [ndbd] INFO -- Start phase 101 completed
2014-10-18 11:24:58 [ndbd] INFO -- Node started
2014-10-18 11:27:47 [ndbd] INFO -- findNeighbours from: 4907 old (left: 2 right: 2) new (65535 65535)
2014-10-18 11:27:47 [ndbd] ALERT -- Network partitioning - arbitration required
2014-10-18 11:27:47 [ndbd] INFO -- President restarts arbitration thread [state=7]
2014-10-18 11:27:47 [ndbd] ALERT -- Arbitration won - positive reply from node 1
start_resend(0, empty bucket (10199852/4 10199852/3) -> active
Finished with handling node-failure
execGCP_NOMORETRANS(10199852/4) c_ongoing_take_over_cnt -> seize
completing gcp 10199852/4 in execTAKE_OVERTCCONF
2014-10-18 11:27:47 [ndbd] INFO -- Started arbitrator node 1 [ticket=2a380005341d6797]
--------------------------------------------------------------
Crash: mgm Node log:
2014-10-18 11:24:48 [MgmtSrvr] INFO -- Node 3: DICT: enable FK 2740 done (466/2735/fk_acr_counpon_account_AcrStoreId)
2014-10-18 11:24:48 [MgmtSrvr] INFO -- Node 3: DICT: enable FK 2742 done (2642/2585/fk_acr_order_detail_AcrOrderId)
2014-10-18 11:24:48 [MgmtSrvr] INFO -- Node 3: DICT: enable FK 2744 done (2642/2594/fk_acr_order_coupon_AcrOrderId)
2014-10-18 11:24:48 [MgmtSrvr] INFO -- Node 3: Start phase 8 completed (initial node restart)
2014-10-18 11:24:48 [MgmtSrvr] INFO -- Node 3: Start phase 9 completed (initial node restart)
2014-10-18 11:24:48 [MgmtSrvr] INFO -- Node 3: Start phase 100 completed (initial node restart)
2014-10-18 11:24:48 [MgmtSrvr] INFO -- Node 3: Node 1: API mysql-5.6.15 ndb-7.3.4
2014-10-18 11:24:49 [MgmtSrvr] INFO -- Node 3: Communication to Node 4 opened
2014-10-18 11:24:49 [MgmtSrvr] INFO -- Node 3: Communication to Node 5 opened
2014-10-18 11:24:49 [MgmtSrvr] INFO -- Node 3: Communication to Node 6 opened
2014-10-18 11:24:49 [MgmtSrvr] INFO -- Node 3: Communication to Node 7 opened
2014-10-18 11:24:49 [MgmtSrvr] INFO -- Node 3: Communication to Node 8 opened
2014-10-18 11:24:49 [MgmtSrvr] INFO -- Node 3: Node 5 Connected
2014-10-18 11:24:49 [MgmtSrvr] INFO -- Node 3: Node 6 Connected
2014-10-18 11:24:50 [MgmtSrvr] INFO -- Node 3: Node 5: API mysql-5.6.15 ndb-7.3.4
2014-10-18 11:24:50 [MgmtSrvr] INFO -- Node 3: Node 6: API mysql-5.6.15 ndb-7.3.4
2014-10-18 11:24:50 [MgmtSrvr] INFO -- Node 3: Suma: initiate handover for startup with nodes 0000000000000004 GCI: 10199688
2014-10-18 11:24:50 [MgmtSrvr] INFO -- Node 3: Suma: handover from node 2 gci: 10199688 buckets: 00000002 (2)
2014-10-18 11:24:52 [MgmtSrvr] INFO -- Node 3: Start phase 101 completed (initial node restart)
2014-10-18 11:24:52 [MgmtSrvr] INFO -- Node 3: Started (mysql-5.6.15 ndb-7.3.4)
2014-10-18 11:24:52 [MgmtSrvr] INFO -- Node 2: DICT: unlocked by node 3 for NodeRestart
2014-10-18 11:27:41 [MgmtSrvr] ALERT -- Node 3: Node 2 Disconnected
2014-10-18 11:27:41 [MgmtSrvr] INFO -- Node 3: Communication to Node 2 closed
2014-10-18 11:27:41 [MgmtSrvr] ALERT -- Node 3: Network partitioning - arbitration required
2014-10-18 11:27:41 [MgmtSrvr] INFO -- Node 3: President restarts arbitration thread [state=7]
2014-10-18 11:27:42 [MgmtSrvr] ALERT -- Node 2: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
2014-10-18 11:27:42 [MgmtSrvr] ALERT -- Node 1: Node 2 Disconnected
2014-10-18 11:27:42 [MgmtSrvr] ALERT -- Node 3: Arbitration won - positive reply from node 1
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: GCP Take over started
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: Node 3 taking over as DICT master
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: GCP Monitor: unlimited lags allowed
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: GCP Take over completed
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: kk: 10199852/4 0 0
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: LCP Take over started
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: ParticipatingDIH = 0000000000000000
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: ParticipatingLQH = 0000000000000000
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_DIH = [SignalCounter: m_count=0 0000000000000000]
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=0 0000000000000000]
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=0 0000000000000000]
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_From_Master_Received = 1
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: LCP Take over completed (state = 4)
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: ParticipatingDIH = 0000000000000000
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: ParticipatingLQH = 0000000000000000
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_DIH = [SignalCounter: m_count=0 0000000000000000]
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=0 0000000000000000]
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=0 0000000000000000]
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_From_Master_Received = 1
2014-10-18 11:27:42 [MgmtSrvr] INFO -- Node 3: Started arbitrator node 1 [ticket=2a380005341d6797]
2014-10-18 11:28:40 [MgmtSrvr] WARNING -- Node 3: Failure handling of node 2 has not completed in 1 min - state = 6
2014-10-18 11:29:41 [MgmtSrvr] WARNING -- Node 3: Failure handling of node 2 has not completed in 2 min - state = 6
2014-10-18 11:30:11 [MgmtSrvr] INFO -- Node 3: Communication to Node 2 opened
----------------------------------------------------------------
start error: data node:
[root@centos01 ~]# ndbd
2014-10-18 11:43:49 [ndbd] INFO -- Angel connected to '192.168.1.10:1186'
2014-10-18 11:43:49 [ndbd] INFO -- Angel allocated nodeid: 2
[root@centos01 ~]# tail -f /var/lib/mysql-cluster/ndb_1_cluster.log
2014-10-18 11:43:51 [MgmtSrvr] INFO -- Node 3: Node 2 Connected
2014-10-18 11:43:51 [MgmtSrvr] INFO -- Node 2: CM_REGCONF president = 3, own Node = 2, our dynamic id = 0/6
2014-10-18 11:43:51 [MgmtSrvr] INFO -- Node 3: Node 2: API mysql-5.6.15 ndb-7.3.4
2014-10-18 11:43:51 [MgmtSrvr] INFO -- Node 2: Node 3: API mysql-5.6.15 ndb-7.3.4
2014-10-18 11:43:51 [MgmtSrvr] INFO -- Node 2: Start phase 1 completed
2014-10-18 11:43:52 [MgmtSrvr] INFO -- Node 2: Start phase 2 completed (node restart)
2014-10-18 11:43:52 [MgmtSrvr] INFO -- Node 3: DICT: locked by node 2 for NodeRestart
2014-10-18 11:43:52 [MgmtSrvr] INFO -- Node 2: Start phase 3 completed (node restart)
2014-10-18 11:43:52 [MgmtSrvr] INFO -- Node 2: Start phase 4 completed (node restart)
2014-10-18 11:43:52 [MgmtSrvr] INFO -- Node 2: Receive arbitrator node 1 [ticket=2a380005341d6797]
2014-10-18 11:45:59 [MgmtSrvr] INFO -- Node 2: Starting to restore schema
2014-10-18 11:46:05 [MgmtSrvr] ALERT -- Node 2: Forced node shutdown completed. Occured during startphase 5. Initiated by signal 11. Caused by error 6000: 'Error OS signal received(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
2014-10-18 11:46:05 [MgmtSrvr] ALERT -- Node 3: Node 2 Disconnected
2014-10-18 11:46:05 [MgmtSrvr] INFO -- Node 3: Communication to Node 2 closed
2014-10-18 11:46:05 [MgmtSrvr] ALERT -- Node 3: Network partitioning - arbitration required
2014-10-18 11:46:05 [MgmtSrvr] INFO -- Node 3: President restarts arbitration thread [state=7]
2014-10-18 11:46:05 [MgmtSrvr] ALERT -- Node 1: Node 2 Disconnected
2014-10-18 11:46:05 [MgmtSrvr] ALERT -- Node 3: Arbitration won - positive reply from node 1
2014-10-18 11:46:05 [MgmtSrvr] INFO -- Node 3: Removed lock for node 2
2014-10-18 11:46:05 [MgmtSrvr] INFO -- Node 3: DICT: remove lock by failed node 2 for NodeRestart
2014-10-18 11:46:05 [MgmtSrvr] INFO -- Node 3: DICT: unlocked by node 2 for NodeRestart
2014-10-18 11:46:05 [MgmtSrvr] INFO -- Node 3: Started arbitrator node 1 [ticket=2a380006342e3c9d]
2014-10-18 11:46:08 [MgmtSrvr] INFO -- Node 3: Communication to Node 2 opened
--------------------------
[root@centos01 ~]# tail -n 50 /var/lib/mysql-cluster/ndb_2_error.log
Pid: 6557
Version: mysql-5.6.15 ndb-7.3.4
Trace: /var/lib/mysql-cluster/ndb_2_trace.log.7 [t1..t1]
***EOM*
Time: Wednesday 30 July 2014 - 16:31:56
Status: Temporary error, restart node
Message: Error OS signal received (Internal error, programming error or missing error message, please report a bug)
Error: 6000
Error data: Signal 11 received; Segmentation fault
Error object: /export/home/pb2/build/sb_0-11331885-1390486620.31/mysql-cluster-gpl-7.3.4/storage/ndb/src/kernel/ndbd.cpp
Program: ndbd
Pid: 6655
Version: mysql-5.6.15 ndb-7.3.4
Trace: /var/lib/mysql-cluster/ndb_2_trace.log.8 [t1..t1]
***EOM*
Time: Monday 29 September 2014 - 17:31:28
Status: Temporary error, restart node
Message: Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s) (Arbitration error)
Error: 2305
Error data: Arbitrator decided to shutdown this node
Error object: QMGR (Line: 6068) 0x00000002
Program: ndbd
Pid: 1408
Version: mysql-5.6.15 ndb-7.3.4
Trace: /var/lib/mysql-cluster/ndb_2_trace.log.9 [t1..t1]
***EOM***
Time: Saturday 18 October 2014 - 11:27:41
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbtcMain.cpp
Error object: DBTC (Line: 18290) 0x00000002
Program: ndbd
Pid: 1413
Version: mysql-5.6.15 ndb-7.3.4
Trace: /var/lib/mysql-cluster/ndb_2_trace.log.10 [t1..t1]
***EOM***
Time: Saturday 18 October 2014 - 11:46:04
Status: Temporary error, restart node
Message: Error OS signal received (Internal error, programming error or missing error message, please report a bug)
Error: 6000
Error data: Signal 11 received; Segmentation fault
Error object: /export/home/pb2/build/sb_0-11331885-1390486620.31/mysql-cluster-gpl-7.3.4/storage/ndb/src/kernel/ndbd.cpp
Program: ndbd
Pid: 10972
Version: mysql-5.6.15 ndb-7.3.4
Trace: /var/lib/mysql-cluster/ndb_2_trace.log.11 [t1..t1]
***