Bug #33900 | One NDB node is down and can not restart | ||
---|---|---|---|
Submitted: | 17 Jan 2008 16:44 | Modified: | 25 May 2012 9:00 |
Reporter: | Yann Le Rouzic | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S2 (Serious) |
Version: | 5.0.41 | OS: | Linux (RHEL 4 Update 5) |
Assigned to: | Jonas Oreland | CPU Architecture: | Any |
Tags: | Cluster NDB 2341 |
[17 Jan 2008 16:44]
Yann Le Rouzic
[17 Jan 2008 16:54]
Jonas Oreland
It looks like a corrupted table file. Most likely "ndbd --initial" will do the trick, but I would save the filesystem *first*. Exactly what is wrong is impossible to tell without the trace file. /jonas
[23 Jan 2008 9:01]
Yann Le Rouzic
Any idea about the cause of this issue, using the log files?
[23 Jan 2008 10:14]
Jonas Oreland
Hi, yes. The starting node fails because it cannot allocate an Attribute (SQL column); the relevant config parameter is MaxNoOfAttributes. So my guess would be that you have done a config change of this variable "recently", where you updated ndb_mgmd but have not tried to restart the cluster. I recommend increasing this value. If you have *not* modified this value "recently", then it's a bug somewhere. Let me know if this helps. /jonas
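If more than one copy of config.ini exists (for example, one per server), a mismatch in a parameter such as MaxNoOfAttributes could be spotted with a short script along the following lines. This is only a minimal sketch: the function names `parse_cluster_config` and `diff_configs` and all parameter values are hypothetical and not taken from this report, and the parser only handles `key=value` lines and `#` comments.

```python
from collections import defaultdict


def parse_cluster_config(text):
    """Parse config.ini-style text into {(section, occurrence, key): value}.

    Sections such as [ndbd] may appear several times, so each one is
    numbered by its order of occurrence. Section and parameter names are
    lower-cased so that differences in case alone are not reported.
    """
    params = {}
    seen = defaultdict(int)
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop '#' comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1].strip().lower()
            seen[name] += 1
            current = (name, seen[name])
            continue
        if "=" in line and current is not None:
            key, value = line.split("=", 1)
            params[current + (key.strip().lower(),)] = value.strip()
    return params


def diff_configs(text_a, text_b):
    """Return {key: (value_in_a, value_in_b)} for every parameter that differs."""
    a = parse_cluster_config(text_a)
    b = parse_cluster_config(text_b)
    return {
        k: (a.get(k), b.get(k))
        for k in sorted(set(a) | set(b))
        if a.get(k) != b.get(k)
    }


# Hypothetical example contents; the real files would be read from each server.
config_a = """
[ndbd default]
NoOfReplicas=2
DataMemory=512M
MaxNoOfAttributes=5000
"""
config_b = """
[ndbd default]
NoOfReplicas=2
DataMemory=256M
MaxNoOfAttributes=1000
"""

for (section, n, key), (va, vb) in diff_configs(config_a, config_b).items():
    print(f"[{section}] #{n} {key}: {va!r} vs {vb!r}")
```

A parameter that is present in only one copy shows up with `None` on the other side, which makes a missing setting as visible as a changed one.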
[23 Jan 2008 13:54]
Yann Le Rouzic
Thanks for your answer, Jonas. Indeed, we have discovered that the config.ini file was not the same on both servers. We corrected it, but we encounter another error when restarting ndbd:

Time: Wednesday 23 January 2008 - 14:48:23
Status: Temporary error, restart node
Message: System error, node killed during node restart by other node (Internal error, programming error or missing error message, please report a bug)
Error: 2303
Error data: Killed by node 3 as copyfrag failed, error: 827
Error object: NDBCNTR (Line: 196) 0x0000000a
Program: /usr/local/mysql/5.0.41/bin/ndbd
Pid: 9787
Trace: /data/mysql/5.0.41/data/ndb_4_trace.log.15
Version: Version 5.0.41
***EOM***

I attached the ndb_4_trace.log.15 file to this ticket.
[23 Jan 2008 13:55]
Yann Le Rouzic
Trace file for error 2303
Attachment: ndb_4_trace.log.15.gz (application/gzip, text), 125.65 KiB.
[23 Jan 2008 14:01]
Jonas Oreland
sh> perror --ndb 827
NDB error code 827: Out of memory in Ndb Kernel, table data (increase DataMemory): Permanent error: Insufficient space

I.e. you have also changed DataMemory somehow in an incompatible way... /jonas
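The code passed to perror above is the one embedded in the "Error data" field of the error report, not the top-level "Error" number. As a rough sketch of how the two could be pulled out of a report mechanically (the helper name `extract_error_codes` is hypothetical, and the regular expressions assume the field labels look exactly like those in the report quoted in this thread):

```python
import re


def extract_error_codes(report):
    """Return (top_level_code, nested_code) found in an ndbd error report.

    top_level_code comes from the "Error:" field; nested_code is the number
    after the lower-case "error:" inside the "Error data:" field. Either
    value is None when the corresponding field is not found.
    """
    top = re.search(r"\bError:\s*(\d+)", report)
    nested = re.search(r"Error data:.*?\berror:\s*(\d+)", report, re.DOTALL)
    return (
        int(top.group(1)) if top else None,
        int(nested.group(1)) if nested else None,
    )


# Fields copied from the error report posted in this thread.
report = (
    "Status: Temporary error, restart node\n"
    "Message: System error, node killed during node restart by other node\n"
    "Error: 2303\n"
    "Error data: Killed by node 3 as copyfrag failed, error: 827\n"
    "Error object: NDBCNTR (Line: 196) 0x0000000a\n"
)

top_code, nested_code = extract_error_codes(report)
print(top_code, nested_code)
```

The second value is the one to look up with "perror --ndb".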
[23 Jan 2008 14:21]
Yann Le Rouzic
Since I changed the config.ini file, I guess that I have to run "ndbd --initial" on the node that failed. Is there a risk of corrupting the data on the other node?
[23 Jan 2008 16:30]
Jonas Oreland
no, that should be fine /jonas
[24 Jan 2008 14:12]
Yann Le Rouzic
Trace file after "ndbd --initial"
Attachment: ndb_4_trace.log.17.gz (application/gzip, text), 125.28 KiB.
[24 Jan 2008 14:13]
Yann Le Rouzic
Configs are now exactly the same on both servers, but after running "ndbd --initial" I still get the same error:

Time: Thursday 24 January 2008 - 15:07:56
Status: Temporary error, restart node
Message: System error, node killed during node restart by other node (Internal error, programming error or missing error message, please report a bug)
Error: 2303
Error data: Killed by node 3 as copyfrag failed, error: 827
Error object: NDBCNTR (Line: 196) 0x0000000e
Program: /usr/local/mysql/5.0.41/bin/ndbd
Pid: 8515
Trace: /data/mysql/5.0.41/data/ndb_4_trace.log.17
Version: Version 5.0.41
***EOM***

Trace file is provided.
[3 Feb 2008 21:03]
Jonas Oreland
Hi, hmm... I would recommend increasing DataMemory and MaxNoOfTables. /Jonas
[15 Nov 2008 0:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[25 May 2012 9:00]
Gustaf Thorslund
Looks like a configuration error, so !bug. Also, 5.0 is kind of history and unsupported now (but it wasn't when the bug was opened).