MySQL Bugs: #29021: Cluster failure on node recovery.

Bug #29021	Cluster failure on node recovery.
Submitted:	11 Jun 2007 16:19	Modified:	29 Feb 2008 11:27
Reporter:	Matthew Montgomery
Status:	No Feedback
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	5.1.4-a_drop5p19	OS:	Linux (2.6.18-8.el5)
Assigned to:		CPU Architecture:	Any
Tags:	cluster, president

Description:
Config: 10 data node cluster (5 groups of 2).
Station = node ID: C=3, D=4, E=5, F=6, M=7, N=8, O=9, P=10, Q=11, R=12

Stations D, F, N, P, and R (nodes 4, 6, 8, 10, and 12) are stopped.

Node 4 attempts recovery and fails in phase 1.
Node 3 is the 'president'.
Node 3 dies when node 4 dies.

The cluster fails because all nodes in group 1 (nodes 3 and 4) are dead.
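
The failure condition above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not NDB code. The report confirms only that nodes 3 and 4 share a group; pairing the remaining nodes by consecutive node ID is an assumption, and the helper names `node_groups` and `dead_groups` are hypothetical.

```python
# Station letter -> NDB node ID, as listed in the report.
STATIONS = {"C": 3, "D": 4, "E": 5, "F": 6, "M": 7,
            "N": 8, "O": 9, "P": 10, "Q": 11, "R": 12}


def node_groups(node_ids, replicas=2):
    """Split node IDs into groups of `replicas` nodes.

    ASSUMPTION: groups are consecutive pairs of node IDs. The report
    only confirms that nodes 3 and 4 are in the same group.
    """
    ids = sorted(node_ids)
    return [ids[i:i + replicas] for i in range(0, len(ids), replicas)]


def dead_groups(groups, dead):
    """Return every group in which all nodes are dead."""
    return [g for g in groups if all(n in dead for n in g)]


groups = node_groups(STATIONS.values())

# Stations D, F, N, P, R are stopped: one node per group under the
# pairing assumption, so every group still has a live node.
stopped = {STATIONS[s] for s in "DFNPR"}
print(dead_groups(groups, stopped))        # []

# Node 3 dies while node 4 is failing its restart: the group holding
# nodes 3 and 4 has no live node left, so the cluster cannot continue.
print(dead_groups(groups, stopped | {3}))  # [[3, 4]]
```

Under this assumption the cluster was already running with a single live node in every group, so the loss of any one more node would have had the same effect.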

Node 4 executed Qmgr::execNODE_VERSION_REP a few dozen times, then executed Qmgr::execCM_NODEINFOCONF, which is the point at which it crashed.

Node 3 dies in Qmgr::execCM_ACKADD.

Trace files for this crash are included with the associated CSC issue.

How to repeat:
.

Suggested fix:
.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".