Bug #31257 Node failure during LCP can lead to subsequent SR failure
Submitted: 27 Sep 2007 19:49
Modified: 6 Nov 2007 8:47
Reporter: Jonas Oreland
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 4.1, 5.0, 5.1
OS: Any
Assigned to: Jonas Oreland
CPU Architecture: Any

[27 Sep 2007 19:49] Jonas Oreland
Description:
Version 5.0
2-node cluster (A, B)
Start DML load
During LCP X:
  Let B stop writing UNDO for ACC (or TUP)         - using error insert
  Kill B before END_LCPREQ is sent to ACC (or TUP) - using error insert

Let A process one more LCP
Kill A

A will now restart at X+1
B will restart at X

BUT! X was never completed on B,
i.e. the undo never got written, but all fragments were "complete",
and B will crash when trying to apply undo to X
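
The sequence above can be illustrated with a toy model of node B's view of LCP X. This is only a sketch, not NDB code: the event names FRAG_CHECKPOINTED and NODE_KILLED, the fragment names, and the `replay` helper are hypothetical, and it assumes that an LCP only counts as really complete once END_LCPREQ has been processed.

```python
# Toy model of what node B knows about LCP X. Not NDB code:
# every event name except END_LCPREQ is hypothetical.

def replay(events, fragments):
    """Replay B's event stream up to the kill and summarize the LCP state."""
    frags_done = set()
    end_lcp_seen = False
    for name, arg in events:
        if name == "NODE_KILLED":
            break  # nothing after this point happens on B
        if name == "FRAG_CHECKPOINTED":
            frags_done.add(arg)
        elif name == "END_LCPREQ":
            end_lcp_seen = True
    all_done = frags_done == set(fragments)
    return {
        "all_frags_complete": all_done,
        # Assumption: the LCP is only usable once END_LCPREQ was handled.
        "lcp_really_complete": all_done and end_lcp_seen,
    }

fragments = ["frag0", "frag1"]
events_on_b = [
    ("FRAG_CHECKPOINTED", "frag0"),
    ("FRAG_CHECKPOINTED", "frag1"),
    ("NODE_KILLED", None),   # error insert: B dies before END_LCPREQ
    ("END_LCPREQ", None),    # never processed
]
state = replay(events_on_b, fragments)
```

In this model every fragment looks complete while the LCP as a whole is not, which is the mismatch that makes B pick X at restart.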

---

The same problem occurs in 5.1 (not for MM, because it has no UNDO,
but for DD)

How to repeat:
.

Suggested fix:
1) Don't use an LCP until LCP_COMPLETE_REP has arrived
2) Change so that each fragment handles END_LCP_REQ (or similar)

The short-term solution is 1;
a better long-term solution is 2
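
A minimal sketch of what option 1 amounts to when choosing the LCP to restart from. It is not the committed patch: `LcpRecord`, `pick_restorable_lcp`, the `complete_rep_received` flag, and the LCP ids are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class LcpRecord:
    lcp_id: int
    fragments_done: set = field(default_factory=set)
    # Hypothetical flag: set once LCP_COMPLETE_REP has arrived for this LCP.
    complete_rep_received: bool = False

def pick_restorable_lcp(records, all_fragments, require_complete_rep):
    """Return the highest LCP id usable for restart, or None if there is none."""
    wanted = set(all_fragments)
    usable = []
    for rec in records:
        if rec.fragments_done != wanted:
            continue  # some fragment was never checkpointed
        if require_complete_rep and not rec.complete_rep_received:
            continue  # option 1: ignore an LCP that never fully completed
        usable.append(rec.lcp_id)
    return max(usable, default=None)

frags = {"frag0", "frag1"}
X = 7  # arbitrary id standing in for "LCP X"
records_on_b = [
    LcpRecord(X - 1, set(frags), complete_rep_received=True),
    LcpRecord(X, set(frags), complete_rep_received=False),  # B died mid-LCP
]

old_choice = pick_restorable_lcp(records_on_b, frags, require_complete_rep=False)
new_choice = pick_restorable_lcp(records_on_b, frags, require_complete_rep=True)
```

Without the check, B selects X because all of its fragments are marked done; with the check it falls back to X-1, the last LCP known to have completed.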
[29 Sep 2007 9:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/34676

ChangeSet@1.2584, 2007-09-29 11:29:17+02:00, jonas@perch.ndb.mysql.com +2 -0
  ndb - bug#31257
    handle partially complete LCP better in SR
[6 Oct 2007 17:23] Jon Stephens
Documented bugfix in mysql-5.1.15-ndb-6.1.21 changelog.

Left status as Patch Pending.
[8 Oct 2007 13:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35103

ChangeSet@1.2515, 2007-10-08 15:57:01+02:00, jonas@perch.ndb.mysql.com +2 -0
  ndb - bug#31257
      handle partially complete LCP better in SR
[9 Oct 2007 6:14] Jonas Oreland
pushed to 51-ndb, telco-6.2, telco-6.3 and 51-telco
[10 Oct 2007 8:18] Jon Stephens
Also documented in mysql-5.1.22-ndb-6.2.7 changelog; left status as PQ.
[15 Oct 2007 17:40] Jon Stephens
Also documented in mysql-5.1.22-ndb-6.3.4 changelog; left status as PQ.
[5 Nov 2007 13:53] Bugs System
Pushed into 6.0.4-alpha
[5 Nov 2007 13:56] Bugs System
Pushed into 5.1.23-rc
[6 Nov 2007 8:47] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Fix also documented in 5.1.23 and 6.0.4 changelogs; closed.