Bug #29021 Cluster failure on node recovery.
Submitted: 11 Jun 2007 16:19
Modified: 29 Feb 2008 11:27
Reporter: Matthew Montgomery
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 5.1.4-a_drop5p19
OS: Linux (2.6.18-8.el5)
Assigned to:
CPU Architecture: Any
Tags: cluster, president

[11 Jun 2007 16:19] Matthew Montgomery
Description:
Config: 10 data node cluster (5 node groups of 2 nodes each).
C=3, D=4, E=5, F=6, M=7, N=8, O=9, P=10, Q=11, R=12

Stations D, F, N, P, and R (node IDs 4, 6, 8, 10, and 12) are stopped.

Node 4 attempts recovery and fails in phase 1.
Node 3 is 'president'.
Node 3 dies at the same time as node 4.

The cluster fails because all nodes in group 1 (nodes 3 and 4) are dead.
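
The failure condition above can be illustrated with a small model. This is only a sketch and not NDB code: it assumes that node groups are formed by pairing consecutive node IDs (3+4, 5+6, and so on), which matches nodes 3 and 4 sharing a group in this report but is not confirmed by the configuration. The function names are hypothetical.

```python
# Hypothetical model of the scenario in this report; not NDB code.
NODE_IDS = list(range(3, 13))  # data node IDs 3..12, as listed in the report
REPLICAS = 2                   # 5 groups of 2


def node_groups(node_ids, replicas):
    """Pair consecutive node IDs into groups (assumed assignment)."""
    return [node_ids[i:i + replicas] for i in range(0, len(node_ids), replicas)]


def cluster_survives(groups, dead):
    """The cluster stays up only if every group has at least one live node."""
    return all(any(node not in dead for node in group) for group in groups)


groups = node_groups(NODE_IDS, REPLICAS)
stopped = {4, 6, 8, 10, 12}  # stations D, F, N, P, R

print(cluster_survives(groups, stopped))        # one live node per group
print(cluster_survives(groups, stopped | {3}))  # node 3 also dead: group of 3+4 is empty
```

With one node per group stopped, the cluster has no redundancy left, so the loss of node 3 alone is enough to take the whole cluster down.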

Node 4 repeats Qmgr::execNODE_VERSION_REP a few dozen times, then executes Qmgr::execCM_NODEINFOCONF, which is the point at which it crashes.

Node 3 dies in Qmgr::execCM_ACKADD.

Trace files for this crash are included with the associated CSC issue.

How to repeat:
.

Suggested fix:
.
[1 Mar 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".