Bug #25023 error 2341, data node failure in start phase 8
Submitted: 13 Dec 2006 1:43
Modified: 20 Dec 2006 17:27
Reporter: Sean Pringle
Status: Duplicate
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 5.0.27
OS: Any
Assigned to: Assigned Account
CPU Architecture: Any

[13 Dec 2006 1:43] Sean Pringle
Description:
Data node 2 fails in start phase 8 during a full cluster restart. The restart followed missed heartbeats (due to system/network load) that caused too many nodes to be declared dead.

A subsequent partial cluster restart without node 2 was successful.

How to repeat:
Cluster Log

2006-12-12 23:46:29 [MgmSrvr] INFO -- Node 4: Start phase 7 completed (system restart)
2006-12-12 23:46:29 [MgmSrvr] INFO -- Node 1: DICT: index 25 rebuild done
2006-12-12 23:46:29 [MgmSrvr] INFO -- Node 1: DICT: index 26 rebuild done
2006-12-12 23:46:29 [MgmSrvr] INFO -- Node 1: DICT: index 32 rebuild done
2006-12-12 23:46:29 [MgmSrvr] INFO -- Node 1: DICT: index 33 rebuild done
2006-12-12 23:46:29 [MgmSrvr] INFO -- Node 1: DICT: index 34 rebuild done
2006-12-12 23:46:29 [MgmSrvr] INFO -- Node 1: DICT: index 46 rebuild done
2006-12-12 23:46:29 [MgmSrvr] INFO -- Node 1: DICT: index 56 rebuild done
2006-12-12 23:46:30 [MgmSrvr] INFO -- Node 1: DICT: index 61 rebuild done
2006-12-12 23:46:30 [MgmSrvr] INFO -- Node 1: DICT: index 64 rebuild done
2006-12-12 23:46:30 [MgmSrvr] INFO -- Node 1: DICT: index 65 rebuild done
2006-12-12 23:46:30 [MgmSrvr] INFO -- Node 1: Local checkpoint 13335 started. Keep GCI = 2032363 oldest restorable GCI = 2032363
2006-12-12 23:49:49 [MgmSrvr] INFO -- Node 1: DICT: index 66 rebuild done
2006-12-12 23:49:50 [MgmSrvr] INFO -- Node 17: Node 2 Connected
2006-12-12 23:49:50 [MgmSrvr] ALERT -- Node 2: Forced node shutdown completed. Occured during startphase 8. Initiated by signal 0. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error
2006-12-12 23:49:51 [MgmSrvr] INFO -- Node 17: Node 4 Connected
2006-12-12 23:49:51 [MgmSrvr] ALERT -- Node 4: Forced node shutdown completed. Occured during startphase 8. Initiated by signal 0. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'.
2006-12-12 23:49:53 [MgmSrvr] INFO -- Node 17: Node 1 Connected
2006-12-12 23:49:54 [MgmSrvr] ALERT -- Node 1: Forced node shutdown completed. Occured during startphase 8. Initiated by signal 0. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'.
2006-12-12 23:49:55 [MgmSrvr] INFO -- Node 17: Node 3 Connected
2006-12-12 23:49:56 [MgmSrvr] ALERT -- Node 3: Forced node shutdown completed. Occured during startphase 8. Initiated by signal 0. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'.
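
In the log above, only node 2 reports its own error (2341). Nodes 4, 1 and 3 report 2308, which by its own message points at a failure on another node. A rough sketch of how the originating node could be picked out of a cluster log automatically is given here. The function names and the set of "follow-on" error codes are assumptions made for illustration, not part of any MySQL tool, and the pattern is only matched against the ALERT line format shown in this report.

```python
import re

# Matches "Forced node shutdown completed" ALERT lines in the format shown
# in this report. Other NDB versions may word these lines differently.
ALERT_RE = re.compile(
    r"ALERT\s+-- Node (?P<node>\d+): Forced node shutdown completed\."
    r".*?startphase (?P<phase>\d+)\."
    r".*?Caused by error (?P<error>\d+):"
)

# Assumption: 2308 ("Another node failed during system restart") is treated
# as a follow-on failure rather than a root cause.
FOLLOW_ON_ERRORS = {2308}


def find_shutdowns(log_text):
    """Return (node id, start phase, error code) for every forced shutdown."""
    return [
        (int(m["node"]), int(m["phase"]), int(m["error"]))
        for m in ALERT_RE.finditer(log_text)
    ]


def primary_failures(log_text):
    """Keep only shutdowns whose error code is not a known follow-on error."""
    return [s for s in find_shutdowns(log_text) if s[2] not in FOLLOW_ON_ERRORS]
```

Applied to the four ALERT lines above, `primary_failures` would leave only the node 2 entry, which is the node whose error log and trace file are worth reading first.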

NDB error log Node 2

Time: Tuesday 12 December 2006 - 23:49:32
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbtupIndex.cpp
Error object: DBTUP (Line: 514) 0x0000000a
Program: /usr/sbin/ndbd
Pid: 27654
Trace: /server/mysql/ndb_2_trace.log.11
Version: Version 5.0.27
***EOM***
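
For comparing this failure against other reports, the record above can be reduced to its source file, kernel block and line number. The following is a minimal sketch that assumes the "Key: value" layout and the `***EOM***` terminator seen in this record. Both helper names are hypothetical.

```python
import re


def parse_error_record(text):
    """Parse one 'Key: value' error log record, stopping at ***EOM***."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if line == "***EOM***":
            break
        # Split on the first ": " only, so values may contain colons.
        key, sep, value = line.partition(": ")
        if sep:
            record[key] = value
    return record


def failure_location(record):
    """Return (source file, block name, line number), or None if not found."""
    m = re.match(r"(?P<block>\w+) \(Line: (?P<line>\d+)\)",
                 record.get("Error object", ""))
    if m is None:
        return None
    return record.get("Error data"), m["block"], int(m["line"])
```

For the record above this would give `DbtupIndex.cpp`, block `DBTUP`, line 514, which is the location of the failed ndbrequire in the 5.0.27 source.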

See the trace file ndb_2_trace.log.11, compressed and attached.
[13 Dec 2006 1:44] Sean Pringle
trace log

Attachment: ndb_2_trace.log.11.zip (application/zip, text), 121.52 KiB.

[20 Dec 2006 17:27] Jonas Oreland
Hi,

I've not seen the complete cluster log, but looking at the code and the trace makes me 99% sure
this is http://bugs.mysql.com/bug.php?id=15303, which was fixed in 5.0.29.

Closing this as a duplicate.

/Jonas