Bug #34702 Node failure(sp2) during initial node restart, can lead to subsequent failures
Submitted: 20 Feb 2008 19:32 Modified: 31 May 2008 10:35
Reporter: Jonas Oreland
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S3 (Non-critical)
Version: * OS: Any
Assigned to: Jonas Oreland CPU Architecture: Any

[20 Feb 2008 19:32] Jonas Oreland
Description:
A node failure during an initial node restart (start phase 2),
followed by a subsequent start of the same node,
can lead to a crash of the master node,
  as the master incorrectly gave the starting node permission (the second time)
  using START_PERMCONF even though the invalidate node LCP was still running.

How to repeat:
.

Suggested fix:
.
[20 Feb 2008 19:33] Jonas Oreland
The likelihood increases if the number of tables is large, as the invalidate node LCP
  will then take longer.
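
The race described above can be illustrated with a toy model. This is only a sketch, not NDB kernel code: the class, method names, the per-table "tick" cost, and the "DEFERRED" reply are all hypothetical. Only START_PERMCONF and the idea of an invalidate node LCP step come from this report, and how the real fix responds while invalidation is running is not stated here.

```python
# Toy model only: names and timing are hypothetical, not NDB kernel code.

class Master:
    def __init__(self, n_tables, check_invalidation):
        self.n_tables = n_tables
        # False models the buggy behaviour, True models a guarded master.
        self.check_invalidation = check_invalidation
        # node id -> number of tables still to be processed by
        # "invalidate node LCP" for that node
        self.pending = {}

    def node_failed(self, node_id):
        # A failure during initial node restart starts invalidation,
        # modelled as one unit of work per table (an assumption).
        self.pending[node_id] = self.n_tables

    def tick(self):
        # Advance every running invalidation by one table.
        for node_id in list(self.pending):
            self.pending[node_id] -= 1
            if self.pending[node_id] == 0:
                del self.pending[node_id]

    def request_start_permission(self, node_id):
        # "DEFERRED" is a placeholder reply for this sketch; the report
        # does not say how the fixed master answers.
        if self.check_invalidation and node_id in self.pending:
            return "DEFERRED"
        return "START_PERMCONF"


def simulate(n_tables, restart_after_ticks, check_invalidation):
    """Return (reply, invalidation_still_running) for a second start attempt."""
    master = Master(n_tables, check_invalidation)
    master.node_failed(node_id=2)
    for _ in range(restart_after_ticks):
        master.tick()
    still_running = 2 in master.pending
    reply = master.request_start_permission(2)
    return reply, still_running
```

In this model, `simulate(100, 10, False)` grants START_PERMCONF while invalidation is still running (the situation this bug describes), whereas `simulate(5, 10, False)` does not hit the race because invalidation has already finished. That matches the observation that more tables widen the window.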
[21 Feb 2008 9:22] Jonas Oreland
Pushed to 6.2.13 (and tagged a release).
Pending merge to 6.3.

Won't fix in 4.1, 5.0, 5.1.
[22 Feb 2008 10:47] Jon Stephens
Documented in the 5.1.23-ndb-6.2.13 changelog as follows:

        A node failure during an initial node restart followed by
        another node start could cause the master data node to fail,
        because it incorrectly gave the node permission to start even if
        the invalidated node's LCP was still running.

Left in PQ status pending merge to ndb-6.3.
[31 May 2008 10:35] Jon Stephens
Also documented for 5.1.24-ndb-6.3.13 (actually pushed to ndb-6.1.11 but release was pulled and changelog entries re-tagged as 6.3.13). Closed per yesterday's discussion with Jonas.
[27 Jun 2008 8:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/48607

2632 jonas@mysql.com	2008-06-27
      ndb -
        increase timeout for testNodeRestart -n Bug34702 T1