Bug #38930 Crash in DBTC during node-failure
Submitted: 21 Aug 2008 8:44
Modified: 5 Oct 2008 18:32
Reporter: Jonas Oreland
Status: Closed
Category: Server: Cluster
Severity: S3 (Non-critical)
Version:
OS: Any
Assigned to: Jonas Oreland
Target Version:

[21 Aug 2008 8:44] Jonas Oreland
Description:
During TC takeover (i.e., directly after a node failure),
an LQH that finds an operation in state LOG_COMMIT* sends
LQH_TRANS_CONF twice, causing the take-over TC to die.
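
The report does not spell out how the duplicate signal kills the TC. As a rough illustration only, the toy model below shows why a receiver that expects exactly one final confirmation per node cannot tolerate a second one. It is not the DBTC code; the class, its methods, and the node numbers are all hypothetical.

```python
class TakeoverCoordinator:
    """Toy stand-in for a take-over coordinator (hypothetical, not DBTC)."""

    def __init__(self, nodes):
        # Nodes from which a final confirmation is still outstanding.
        self.pending = set(nodes)

    def on_final_conf(self, node):
        # A strict receiver treats a confirmation it is not waiting for
        # as a protocol violation instead of silently ignoring it.
        if node not in self.pending:
            raise RuntimeError(f"unexpected confirmation from node {node}")
        self.pending.remove(node)

    def done(self):
        # Takeover can proceed once every node has confirmed exactly once.
        return not self.pending
```

In this model, calling on_final_conf(2) twice raises on the second call, which loosely mirrors a coordinator failing on a repeated confirmation.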

How to repeat:
Run testNodeRestart -n Bug34216 in an endless loop.
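
One possible way to script the loop is sketched below. It assumes the test program exits with a non-zero status when a run fails, which this report does not confirm; the helper name run_until_failure and its max_runs parameter are made up for illustration.

```python
import subprocess


def run_until_failure(cmd, max_runs=None):
    """Run cmd repeatedly.

    Returns the 1-based number of the first run that exits non-zero,
    or None if max_runs runs all succeeded. With max_runs=None the
    loop only ends on a failure.
    """
    run = 0
    while max_runs is None or run < max_runs:
        run += 1
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"run {run} failed with exit status {result.returncode}")
            return run
    return None
```

With a test cluster already running and testNodeRestart on the PATH (both assumed), run_until_failure(["testNodeRestart", "-n", "Bug34216"]) would keep repeating the test until the first failing run.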

Suggested fix:
.
[21 Aug 2008 8:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/52175

2654 Jonas Oreland	2008-08-21
      ndb - fix bug#38930
[22 Aug 2008 0:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/52258

2656 Jonas Oreland	2008-08-22
      ndb - fix testIndex -n SR1
        1) use Uint8 instead of char for correct sign handling in restore.cpp (6.2 only)
        2) always include PK in "random" unique index generation
[12 Sep 2008 9:43] Jon Stephens
Documented in the NDB 6.2.16 and 6.3.17 changelogs as follows:

        During transaction coordinator takeover (directly after node
        failure), the LQH finding an operation in the LOG_COMMIT state
        sent an LQH_TRANS_CONF signal twice, causing the TC to fail.
[5 Oct 2008 18:32] Jon Stephens
Already documented for relevant trees; closed.
[13 Dec 2008 0:27] Bugs System
Pushed into 6.0.7-alpha (revid:jonas@mysql.com-20080821064702-7sp53eymzrsu30aj)
(version source revid:tomas.ulin@sun.com-20080902154454-pvi3xa61d2wtxtbg) (pib:5)