Bug #36276 Race condition between EMPTY_LCPCONF and MASTER_LCPREQ
Submitted: 23 Apr 2008 8:17
Modified: 31 May 2008 10:48
Reporter: Jonas Oreland
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version:
OS: Any
Assigned to: Jonas Oreland
CPU Architecture: Any

[23 Apr 2008 8:17] Jonas Oreland
Description:
During master failure, there is a small time window in which an LCP can
complete before EMPTY_LCPCONF has arrived. This can cause LCP_COMPLETE_REP
to be sent twice to the new master, which causes the new master to crash.
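
The failure mode described above, a completion report arriving twice at a receiver that expects it once, can be illustrated with a toy model. This is only a sketch and not NDB code: the class `NewMaster`, the function `replay`, and the `tolerate_duplicates` flag are hypothetical names invented for illustration, and the report does not say whether the actual patch ignores the duplicate or closes the time window instead.

```python
class DuplicateReport(Exception):
    """Raised when the same node reports LCP completion twice."""


class NewMaster:
    """Toy model of a master collecting completion reports (hypothetical)."""

    def __init__(self, nodes, tolerate_duplicates):
        self.waiting_for = set(nodes)   # nodes that have not reported yet
        self.reported = set()           # nodes that have already reported
        self.tolerate_duplicates = tolerate_duplicates

    def on_lcp_complete_rep(self, node):
        """Handle one completion report; return True if it was counted."""
        if node in self.reported:
            if self.tolerate_duplicates:
                return False            # duplicate is silently ignored
            raise DuplicateReport(node)  # stands in for the crash
        self.waiting_for.discard(node)
        self.reported.add(node)
        return True

    def lcp_done(self):
        """The LCP is complete once every node has reported."""
        return not self.waiting_for


def replay(master):
    """Deliver reports from node 2 once and from node 3 twice."""
    for node in (2, 3, 3):
        master.on_lcp_complete_rep(node)
    return master.lcp_done()
```

With `tolerate_duplicates=False` the second report from node 3 raises `DuplicateReport`, standing in for the crash; with `tolerate_duplicates=True` the duplicate is dropped and the checkpoint is still considered complete.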

How to repeat:
New test program.

Suggested fix:
.
[24 Apr 2008 9:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/45936

ChangeSet@1.2202, 2008-04-24 11:29:32+02:00, jonas@perch.ndb.mysql.com +7 -0
  ndb - (drop6)
    fix for bug#36199, bug#36246, bug#36247, bug#36276
    all related to cascading master failure
[25 Apr 2008 7:56] Jonas Oreland
Pushed to 51-ndb, telco* and drop6
(50-ndb was locked for an unknown reason)
[20 May 2008 9:34] Jon Stephens
Documented in the 5.1.24-ndb-6.3.14 changelog as follows:

        Under certain rare circumstances, the failure of the new master node
        while attempting a node takeover would cause takeover errors to repeat
        without being resolved.

Left Patch Queued status pending further merges.
[31 May 2008 10:48] Jon Stephens
Closed per yesterday's discussion with Jonas.
[12 Dec 2008 23:30] Bugs System
Pushed into 6.0.6-alpha  (revid:sp1r-jonas@perch.ndb.mysql.com-20080423140838-48946) (version source revid:jonas@mysql.com-20080808094047-4e1yiarqa2t3opg3) (pib:5)