Bug #58453 Incorrect handling of GSN_COMMIT/GSN_COMPLETE during take-over
Submitted: 24 Nov 2010 11:13 Modified: 25 Nov 2010 22:53
Reporter: Jonas Oreland
Status: Closed
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version: OS:Any
Assigned to: Jonas Oreland CPU Architecture:Any

[24 Nov 2010 11:13] Jonas Oreland
Description:
During TC take-over, the node taking over as TC asks every LQH for the
transactions originating from the crashed TC, using
LQH_TRANSREQ/LQH_TRANSCONF.

If a COMMIT/COMPLETE arrived from another node in the node group at a node
that had already completed LQH_TRANSREQ, an extra LQH_TRANSCONF could be
emitted, leading to a crash in either LocalProxy or DBTC.

This bug has been around forever, and the likelihood of it happening
has increased somewhat with the introduction of ndbmtd
(from impossible to impossible :)

How to repeat:
new test program

Suggested fix:
Transactions that have already been sent to DBTC
should not be sent again...
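
The idea can be illustrated with a small standalone model. This is only a
sketch, not the NDB source and not the committed patch: LqhModel, TxnRecord,
reportedToNewTc and the method names are all hypothetical, and the assumption
that a late COMMIT/COMPLETE retriggers reporting is a simplification of the
scenario described above. It shows one way to guarantee that a transaction is
confirmed to the new TC at most once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical per-transaction state kept on the LQH side (not an NDB type).
struct TxnRecord {
    bool reportedToNewTc = false;  // true once a confirmation has gone out
};

// Hypothetical model of one LQH instance during TC take-over.
class LqhModel {
public:
    void addTransaction(std::uint32_t id) { txns_[id] = TxnRecord{}; }

    // Models serving LQH_TRANSREQ: report each known transaction once.
    void onTakeOverScan() {
        for (auto& entry : txns_) report(entry.first, entry.second);
    }

    // Models a COMMIT/COMPLETE from another node in the node group that
    // arrives after the scan. Assumption: it would otherwise report again.
    void onLateCommitOrComplete(std::uint32_t id) {
        auto it = txns_.find(id);
        if (it != txns_.end()) report(id, it->second);
    }

    // Ids for which a confirmation (LQH_TRANSCONF in the report) was emitted.
    const std::vector<std::uint32_t>& confirmationsSent() const { return sent_; }

private:
    void report(std::uint32_t id, TxnRecord& rec) {
        if (rec.reportedToNewTc) return;  // the guard: never send twice
        rec.reportedToNewTc = true;
        sent_.push_back(id);
    }

    std::unordered_map<std::uint32_t, TxnRecord> txns_;
    std::vector<std::uint32_t> sent_;
};

// Usage: a late COMMIT after the scan must not produce a second confirmation.
inline void demoLateCommit() {
    LqhModel lqh;
    lqh.addTransaction(7);
    lqh.onTakeOverScan();
    lqh.onLateCommitOrComplete(7);
    assert(lqh.confirmationsSent().size() == 1);
}
```

Without the early return in report(), the late signal would append a second
entry for the same transaction, which corresponds to the extra LQH_TRANSCONF
described in the report.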
[24 Nov 2010 12:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/124851

3343 Jonas Oreland	2010-11-24
      ndb - bug#58453 fix execCOMMIT/execCOMPLETE wrt execLQH_TRANSREQ
[24 Nov 2010 12:08] Bugs System
Pushed into mysql-5.1-telco-6.3 5.1.51-ndb-6.3.40 (revid:jonas@mysql.com-20101124120652-7fqswrdhoebj6v0o) (version source revid:jonas@mysql.com-20101124120652-7fqswrdhoebj6v0o) (merge vers: 5.1.51-ndb-6.3.40) (pib:23)
[24 Nov 2010 12:20] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.21 (revid:jonas@mysql.com-20101124121655-qc422lt7id0guhiy) (version source revid:jonas@mysql.com-20101124121655-qc422lt7id0guhiy) (merge vers: 5.1.51-ndb-7.0.21) (pib:23)
[24 Nov 2010 12:20] Jonas Oreland
pushed to 6.3.40, 7.0.21 and 7.1.10
[24 Nov 2010 17:17] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/124852

3344 Jonas Oreland	2010-11-24
      ndb - bug#58453 fix execCOMMIT/execCOMPLETE wrt execLQH_TRANSREQ
[25 Nov 2010 22:53] Jon Stephens
Documented bugfix in the NDB-6.3.40, 7.0.21, and 7.1.10 changelogs, as follows:

        During a node takeover, it was possible in some circumstances
        for one of the remaining nodes to send an extra transaction
        confirmation (LQH_TRANSCONF) signal to the DBTC kernel block,
        conceivably leading to a crash of the data node taking over as
        the new transaction coordinator.

Closed.