Bug #11218 | Error message Internal program error (failed ndbrequire) Fault ID: 2341 | ||
---|---|---|---|
Submitted: | 9 Jun 2005 19:14 | Modified: | 1 Sep 2005 14:38 |
Reporter: | Jonathan Miller | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S3 (Non-critical) |
Version: | 4.1.12, 5.0, 5.1.0 | OS: | Linux (Linux) |
Assigned to: | Tomas Ulin | CPU Architecture: | Any |
[9 Jun 2005 19:14]
Jonathan Miller
[24 Jul 2005 21:50]
Jonathan Miller
Has also been reported by a customer:

What is the correct procedure for recovering a failed NDBD node? It seems that in my experience, every time a server running an NDBD node is shut down uncleanly (due to crash or power failure or ...) NDBD refuses to start on reboot and I have to restore the database from backup. Not good!

This morning I found one of our servers (just a development box, fortunately) had hanged itself overnight. On reboot, NDBD would not start (could not alloc node id) until I did a PURGE STALE SESSIONS with ndb_mgm. Now it starts and gets to phase 5 for about 20 seconds and then exits. The error_log shows only the following cryptic message:

Date/Time: x 24 July 2005 - 11:14:54
Type of error: error
Message: Internal program error (failed ndbrequire)
Fault ID: 2341
Problem data: Dbdict.cpp
Object of reference: DBDICT (Line: 11636) 0x0000000a
ProgramName: /usr/sbin/ndbd
ProcessID: 5042
TraceFile: /var/lib/mysql-cluster/ndb_3_trace.log.3
Version 4.1.12
***EOM***

Trace log: http://www.expio.co.nz/~sgarner/misc/ndb_3_trace.log.3.gz

The node is part of a 2-server, 2-node cluster with 2 replicas (+ a 3rd machine as mgm). The other node, and the cluster, is still operational. Why can't the failed node repair itself from the working node?

Should I be using --initial? I think I've tried that before in similar circumstances and just ended up losing the whole cluster. So before I do that, I'd like to know if there's anything else I can try.

Unless I missed something, the manual is a little sparse on the topic of recovering a failed NDB, so I'd appreciate any help.

thanks
-Simon
[31 Aug 2005 17:11]
Tomas Ulin
Error messages on filesystem issues have been cleaned up as of 4.1.15 and 5.0.12.
[1 Sep 2005 14:38]
Paul DuBois
Noted in 4.1.15, 5.0.12 changelogs.
[12 Oct 2005 15:48]
Jeff Schachter
This happens to me with 4.1.14 - I get:

Date/Time: Wednesday 12 October 2005 - 11:46:14
Type of error: error
Message: Internal program error (failed ndbrequire)
Fault ID: 2341
Problem data: Dbdict.cpp
Object of reference: DBDICT (Line: 11762) 0x00000002
ProgramName: /apps/mysql41/bin/ndbd
ProcessID: 31682
TraceFile: /apps/mysql41/cluster/ndb_2_trace.log.5
Version 4.1.14
***EOM***

Is there somewhere I can find an explanation of this problem?
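
When comparing reports like the two above, it can help to pull the error-log entry apart into named fields (fault ID, source file, block and line, version). The following is only a rough sketch: the field labels are taken from the entries quoted in this thread, while the function name `parse_ndb_error_entry` and the assumption that every entry uses exactly these labels are illustrative and not something the NDB documentation guarantees.

```python
import re

# Field labels as they appear in the entries quoted in this thread.
# Assumption: other NDB versions may use different or additional labels.
LABELS = [
    "Date/Time", "Type of error", "Message", "Fault ID", "Problem data",
    "Object of reference", "ProgramName", "ProcessID", "TraceFile",
]

# Matches any known label followed by a colon; the label itself is captured.
LABEL_RE = re.compile(
    "(" + "|".join(re.escape(label) for label in LABELS) + r"):\s*"
)


def parse_ndb_error_entry(text):
    """Split one error-log entry into a {label: value} dict (hypothetical helper)."""
    # Drop the end-of-message marker and anything after it.
    body = text.split("***EOM***", 1)[0]

    # re.split with one capture group yields
    # [preamble, label1, value1, label2, value2, ...].
    parts = LABEL_RE.split(body)
    fields = {}
    for label, value in zip(parts[1::2], parts[2::2]):
        fields[label] = value.strip()

    # "Version" has no colon in the quoted entries, so it ends up
    # trailing the TraceFile value; split it off here.
    trace = fields.get("TraceFile", "")
    match = re.search(r"\s+Version\s+(\S+)\s*$", trace)
    if match:
        fields["Version"] = match.group(1)
        fields["TraceFile"] = trace[:match.start()].strip()
    return fields


# The 4.1.14 entry from the last comment, as one flattened string.
entry = (
    "Date/Time: Wednesday 12 October 2005 - 11:46:14 Type of error: error "
    "Message: Internal program error (failed ndbrequire) Fault ID: 2341 "
    "Problem data: Dbdict.cpp Object of reference: DBDICT (Line: 11762) "
    "0x00000002 ProgramName: /apps/mysql41/bin/ndbd ProcessID: 31682 "
    "TraceFile: /apps/mysql41/cluster/ndb_2_trace.log.5 Version 4.1.14 ***EOM***"
)

fields = parse_ndb_error_entry(entry)
print(fields["Fault ID"], "|", fields["Object of reference"], "|", fields["Version"])
```

Parsed this way, the two reports share the same fault ID and source file (`Dbdict.cpp`) but differ in the reported line (11636 in 4.1.12, 11762 in 4.1.14), which is the kind of difference worth including when adding to a report like this one.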