Bug #22316 | Four-node cluster loses Node Group 1 | ||
---|---|---|---|
Submitted: | 13 Sep 2006 15:49 | Modified: | 26 Oct 2006 17:22 |
Reporter: | Steve Wolf | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S1 (Critical) |
Version: | 5.0.24a | OS: | Linux (CentOS 4.4 x86_64) |
Assigned to: | | CPU Architecture: | Any |
Tags: | cluster, node groups |
[13 Sep 2006 15:49]
Steve Wolf
[13 Sep 2006 16:06]
Steve Wolf
Node Group 0 cluster log and trace files
Attachment: bug22316_logs_group0.tar.gz (application/x-gzip, text), 177.33 KiB.
[13 Sep 2006 16:07]
Steve Wolf
Node Group 1 cluster log and trace files
Attachment: bug22316_logs_group1.tar.gz (application/x-gzip, text), 53.26 KiB.
[13 Sep 2006 16:08]
Steve Wolf
Management Node log and configuration files
Attachment: bug22316_logs_mgmt.tar.gz (application/x-gzip, text), 16.68 KiB.
[13 Sep 2006 17:03]
Steve Wolf
In rebuilding the cluster, I see that all nodes show as Nodegroup 0 before the cluster is brought up. So this bug reduces to the errors that occur when bringing up the cluster. They are reported as temporary, but they happen every time.
[26 Sep 2006 17:22]
Jonas Oreland
Hi,

Reading your cluster log, I reached the following conclusion:

* The system restart fails because you don't start all 4 nodes fast enough. With the default settings, the nodes have 30s to get in contact with each other.

Otherwise, as stated in ndb_1_error.log:

--
Time: Wednesday 13 September 2006 - 00:38:07
Status: Temporary error, restart node
Message: Conflict when selecting restart type (Internal error, programming error or missing error message, please report a bug)
Error: 2311
Error data: Unable to start missing node group! starting: 0000000000000006 (missing fs for: 0000000000000000)
Error object: QMGR (Line: 1356) 0x0000000a
Program: /usr/local/mysql/bin/ndbd
Pid: 24521
Trace: /usr/local/mysql/data/ndb_1_trace.log.2
Version: Version 5.0.24
***EO
--

This means that nodes 1 and 2 have connected, but an entire node group is missing, as indicated by "Error data: Unable to start missing node group! starting: 0000000000000006".

Please respond as to whether this can be a correct conclusion.

(Note: if performing an initial start, this timeout is unlimited.)

/Jonas
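For readers following the analysis, here is a minimal sketch of how the "starting" value in the error data can be read. It assumes the value is a hexadecimal bitmask in which bit N stands for node ID N. That assumption matches the statement above that nodes 1 and 2 connected, but this thread does not confirm it as the documented format. The helper name `nodes_from_mask` is hypothetical and is not part of any NDB tool.

```python
def nodes_from_mask(mask_hex: str) -> list[int]:
    """Return the bit positions set in a hexadecimal bitmask.

    Assumption: bit N of the mask corresponds to node ID N.
    """
    mask = int(mask_hex, 16)
    # Test each bit position up to the highest set bit.
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Values copied from the "Error data" line in ndb_1_error.log.
starting = nodes_from_mask("0000000000000006")
missing_fs = nodes_from_mask("0000000000000000")

print("starting nodes:", starting)      # [1, 2]
print("missing fs for:", missing_fs)    # []
```

Under this reading, only nodes 1 and 2 were taking part in the start when the error was raised, so the node group containing the other two data nodes was absent.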
[26 Oct 2006 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".