Bug #18043 | Reboot cause node failure on other server in cluster | ||
---|---|---|---|
Submitted: | 7 Mar 2006 17:22 | Modified: | 19 Jun 2006 9:52 |
Reporter: | Andrew Harrison | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S2 (Serious) |
Version: | 5.0.18 | OS: | Linux (SLES 9 SP1) |
Assigned to: | | CPU Architecture: | Any |
[7 Mar 2006 17:22]
Andrew Harrison
[7 Mar 2006 17:26]
Andrew Harrison
Trace log for node that caused the rest of the nodes to fail
Attachment: ndb_8_trace.log.zip (application/zip, text), 78.49 KiB.
[7 Mar 2006 17:27]
Andrew Harrison
Changed category to cluster
[9 Mar 2006 12:06]
Hartmut Holzgraefe
can you please add the cluster log (ndb_?_cluster.log) from the management node, too?
[9 Mar 2006 13:34]
Andrew Harrison
Cluster log from the Server that was not rebooted.
Attachment: ndb_2_cluster.zip (application/x-zip-compressed, text), 59.15 KiB.
[9 Mar 2006 13:35]
Andrew Harrison
Cluster log
Attachment: ndb_1_cluster.zip (application/x-zip-compressed, text), 33.01 KiB.
[9 Mar 2006 13:46]
Andrew Harrison
The set-up that we have is two WebSphere application servers running the node management daemon, and two servers running the MySQL daemon. The application servers and MySQL servers are a paired failover (i.e. AppServer1 primarily uses MySQLServer1 but fails over to MySQLServer2, and vice versa).

The root cause: MySQLServer1 suffered a message storm (originating in oictl32), which filled up the root filesystem. The only way to get around this is to delete the syslog file and reboot the server. We are going to upgrade the kernel version soon.

It would appear that rebooting the server causes the nodes on the other MySQL server to fail. In this instance, node 1 is the management daemon on AppServer1 and node 2 is the management daemon on AppServer2. Nodes 3, 5, 7 & 9 are the ndbd instances on MySQLServer1, and nodes 4, 6, 8 & 10 are the ndbd instances on MySQLServer2.

MySQLServer1 was rebooted (I was watching the node management console at the time, so we could see when this happened), resulting in nodes 3, 5, 7 & 9 disappearing as expected. Very soon afterwards, nodes 4, 6, 8 & 10 also died unexpectedly.

Hope this helps.
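
For readers following the node numbering, the layout described in this comment could be written as a config.ini roughly like the sketch below. This is only an illustration of the topology: the reporter's actual config.ini was never attached, so the host names, the NoOfReplicas value, and the placement of the [MYSQLD] sections are all assumptions.

```ini
# Hypothetical sketch of the described topology, NOT the reporter's file.
# Host names are placeholders; NoOfReplicas is assumed, not stated in the report.

[NDBD DEFAULT]
NoOfReplicas=2

# Management daemons on the two application servers (nodes 1 and 2)
[NDB_MGMD]
Id=1
HostName=appserver1

[NDB_MGMD]
Id=2
HostName=appserver2

# ndbd instances: odd node IDs on MySQLServer1, even node IDs on MySQLServer2
[NDBD]
Id=3
HostName=mysqlserver1

[NDBD]
Id=4
HostName=mysqlserver2

[NDBD]
Id=5
HostName=mysqlserver1

[NDBD]
Id=6
HostName=mysqlserver2

[NDBD]
Id=7
HostName=mysqlserver1

[NDBD]
Id=8
HostName=mysqlserver2

[NDBD]
Id=9
HostName=mysqlserver1

[NDBD]
Id=10
HostName=mysqlserver2

# SQL nodes; their node IDs and hosts are not given in the report (assumed here)
[MYSQLD]
HostName=mysqlserver1

[MYSQLD]
HostName=mysqlserver2
```

With a layout like this, a reboot of MySQLServer1 takes down every odd-numbered data node at once, which matches the disappearance of nodes 3, 5, 7 & 9 reported above.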
[19 May 2006 9:52]
Jonas Oreland
Hi, can you upload your config.ini and the error/trace files of the other ndbd nodes that crashed? Also, a number of bugs in this area have been fixed since 5.0.18; can you try a newer version? /Jonas
[19 Jun 2006 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".