Description:
I shutdown cluster to restart for new set of Performance tests. Upon restart recieved:
Cluster Log
2006-01-28 20:07:20 [MgmSrvr] ALERT -- Node 2: Forced node shutdown completed. Occured during startphase 4. Initiated by signal 0. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error
2006-01-28 20:07:20 [MgmSrvr] ALERT -- Node 3: Forced node shutdown completed. Occured during startphase 4. Initiated by signal 0. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error
2006-01-28 20:10:06 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 reserved for ip 10.100.1.160, m_reserved_nodes 0000000000000006.
2006-01-28 20:10:07 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2006-01-28 20:10:08 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 freed, m_reserved_nodes 0000000000000002.
2006-01-28 20:10:13 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 reserved for ip 10.100.1.161, m_reserved_nodes 000000000000000a.
2006-01-28 20:10:15 [MgmSrvr] INFO -- Node 1: Node 3 Connected
2006-01-28 20:10:15 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 freed, m_reserved_nodes 0000000000000002.
NDB Error logs:
Time: Saturday 28 January 2006 - 20:07:20
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbtupDiskAlloc.cpp
Error object: DBTUP (Line: 943) 0x0000000a
Program: /home/ndbdev/jmiller/builds/libexec/ndbd
Pid: 28725
Trace: /space/run/ndb_2_trace.log.9
Version: Version 5.1.6 (alpha)
***EOM***
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 2, r.sigId: 76580318 gsn: 164 "CONTINUEB" prio: 1
s.bn: 247 "DBLQH", s.proc: 2, s.sigId: 76580315 length: 2 trace: 0 #sec: 0 fragInf: 0
H'00000006 H'00000000
--------------- Signal ----------------
r.bn: 260 "LGMAN", r.proc: 2, r.sigId: 76580317 gsn: 164 "CONTINUEB" prio: 1
s.bn: 260 "LGMAN", s.proc: 2, s.sigId: 76580313 length: 2 trace: 0 #sec: 0 fragInf: 0
H'00000002 H'00000000
--------------- Signal ----------------
r.bn: 253 "NDBFS", r.proc: 2, r.sigId: 76580316 gsn: 164 "CONTINUEB" prio: 1
s.bn: 253 "NDBFS", s.proc: 2, s.sigId: 76580311 length: 1 trace: 0 #sec: 0 fragInf: 0
Scanning the memory channel again with no delay
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 2, r.sigId: 76580315 gsn: 262 "FSREADCONF" prio: 1
s.bn: 253 "NDBFS", s.proc: 2, s.sigId: 76580311 length: 1 trace: 0 #sec: 0 fragInf: 0
UserPointer: 1
--------------- Signal ----------------
r.bn: 253 "NDBFS", r.proc: 2, r.sigId: 76580314 gsn: 272 "FSWRITEREQ" prio: 0
s.bn: 260 "LGMAN", s.proc: 2, s.sigId: 76580313 length: 7 trace: 0 #sec: 0 fragInf: 0
UserPointer: 0
FilePointer: 22
UserReference: H'01040002 Operation flag: H'00000003 (No sync, Format=List of global pages)
varIndex: 3861
numberOfPages: 11
Time: Saturday 28 January 2006 - 20:12:10
Status: Temporary error, restart node
Message: Arbitrator shutdown, please investigate error(s) on other node(s) (Arbitration error)
Error: 2305
Error data: Arbitrator decided to shutdown this node
Error object: QMGR (Line: 3826) 0x0000000a
Program: /home/ndbdev/jmiller/builds/libexec/ndbd
Pid: 27466
Trace: /space/run/ndb_3_trace.log.8
Version: Version 5.1.6 (alpha)
***EOM***
--------------- Signal ----------------
r.bn: 252 "QMGR", r.proc: 3, r.sigId: 9497631 gsn: 343 "PREP_FAILCONF" prio: 0
s.bn: 252 "QMGR", s.proc: 3, s.sigId: 9497630 length: 2 trace: 8 #sec: 0 fragInf: 0
H'00000003 H'00000002
--------------- Signal ----------------
r.bn: 252 "QMGR", r.proc: 3, r.sigId: 9497630 gsn: 126 "CLOSE_COMCONF" prio: 0
s.bn: 254 "CMVMI", s.proc: 3, s.sigId: 9497629 length: 19 trace: 8 #sec: 0 fragInf: 0
xxxBlockRef = (252, 3) failNo = 2 noOfNodes = 1
Nodes: 2
--------------- Signal ----------------
r.bn: 254 "CMVMI", r.proc: 3, r.sigId: 9497629 gsn: 127 "CLOSE_COMREQ" prio: 0
s.bn: 252 "QMGR", s.proc: 3, s.sigId: 9497628 length: 5 trace: 8 #sec: 0 fragInf: 0
xxxBlockRef = (252, 3) failNo = 2 noOfNodes = 1
Nodes: 2
--------------- Signal ----------------
r.bn: 252 "QMGR", r.proc: 3, r.sigId: 9497628 gsn: 345 "PREP_FAILREQ" prio: 0
s.bn: 252 "QMGR", s.proc: 3, s.sigId: 9497627 length: 5 trace: 8 #sec: 0 fragInf: 0
xxxBlockRef = (252, 3) failNo = 2 noOfNodes = 1
Nodes: 2
--------------- Signal ----------------
r.bn: 252 "QMGR", r.proc: 3, r.sigId: 9497627 gsn: 153 "COMMIT_FAILCONF" prio: 0
s.bn: 252 "QMGR", s.proc: 3, s.sigId: 9497626 length: 1 trace: 8 #sec: 0 fragInf: 0
H'00000003
Second Try:
Time: Saturday 28 January 2006 - 20:12:09
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbtupDiskAlloc.cpp
Error object: DBTUP (Line: 943) 0x0000000a
Program: /home/ndbdev/jmiller/builds/libexec/ndbd
Pid: 28814
Trace: /space/run/ndb_2_trace.log.10
Version: Version 5.1.6 (alpha)
--------------- Signal ----------------
r.bn: 261 "PGMAN", r.proc: 2, r.sigId: 9489435 gsn: 164 "CONTINUEB" prio: 1
s.bn: 261 "PGMAN", s.proc: 2, s.sigId: 9489433 length: 1 trace: 0 #sec: 0 fragInf: 0
H'00000001
--------------- Signal ----------------
r.bn: 253 "NDBFS", r.proc: 2, r.sigId: 9489434 gsn: 164 "CONTINUEB" prio: 1
s.bn: 253 "NDBFS", s.proc: 2, s.sigId: 9489430 length: 1 trace: 0 #sec: 0 fragInf: 0
Scanning the memory channel again with no delay
--------------- Signal ----------------
r.bn: 261 "PGMAN", r.proc: 2, r.sigId: 9489433 gsn: 262 "FSREADCONF" prio: 1
s.bn: 253 "NDBFS", s.proc: 2, s.sigId: 9489430 length: 1 trace: 0 #sec: 0 fragInf: 0
UserPointer: 3455
--------------- Signal ----------------
r.bn: 261 "PGMAN", r.proc: 2, r.sigId: 9489432 gsn: 164 "CONTINUEB" prio: 0
s.bn: 261 "PGMAN", s.proc: 2, s.sigId: 9489429 length: 1 trace: 0 #sec: 0 fragInf: 0
H'00000002
--------------- Signal ----------------
r.bn: 252 "QMGR", r.proc: 2, r.sigId: 9489431 gsn: 164 "CONTINUEB" prio: 0
s.bn: 252 "QMGR", s.proc: 2, s.sigId: 9489428 length: 1 trace: 0 #sec: 0 fragInf: 0
H'00000004
--------------- Signal ----------------
How to repeat:
Create 6 million row database, run performance test, restart cluster.