Bug #40697 Transactions can be pending longer than needed at node-failure
Submitted: 13 Nov 2008 13:01 Modified: 14 Nov 2008 8:36
Reporter: Jonas Oreland
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S3 (Non-critical)
Version: * OS: Any
Assigned to: Jonas Oreland CPU Architecture: Any

[13 Nov 2008 13:01] Jonas Oreland
If an ndbapi application (or mysqld, of course) has an outstanding transaction with a transaction coordinator that fails (node failure), the surviving nodes will process the failure and then inform the waiting ndbapi application.

Currently, the surviving nodes do not inform the ndbapi application until the failure has been processed in *all* protocols, although the ndbapi is interested only in the failure handling of the transaction protocol (TC take-over).

How to repeat:

Suggested fix:
Inform the ndbapi as soon as the TC-take-over has completed,
so that it can carry on more quickly.
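
The difference between the current behaviour and the suggested fix can be illustrated with a small toy model. This is only a sketch and not NDB code: the function `notify_time`, the protocol names other than the TC take-over, and all timings are made up for illustration.

```python
# Toy model only: protocol names and timings are hypothetical.
def notify_time(completion_times, wait_for_all):
    """Return the time at which the waiting ndbapi would be informed.

    completion_times maps a failure-handling protocol name to the time
    (arbitrary units) at which the surviving nodes finish that protocol.
    """
    if wait_for_all:
        # Current behaviour: wait until *all* protocols have processed
        # the failure.
        return max(completion_times.values())
    # Suggested fix: only the TC take-over matters to the waiting ndbapi.
    return completion_times["tc_takeover"]


# Made-up completion times; only "tc_takeover" corresponds to the report.
times = {"tc_takeover": 3, "protocol_b": 7, "protocol_c": 12}

before = notify_time(times, wait_for_all=True)
after = notify_time(times, wait_for_all=False)
print(f"informed at t={before} currently, t={after} with the fix")
```

In this model the gain is simply the gap between the TC take-over finishing and the slowest protocol finishing; if the TC take-over happens to be the slowest, nothing changes.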
[13 Nov 2008 13:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:


2733 Jonas Oreland	2008-11-13
      ndb - bug#40697 - transaction can wait longer needed during node-failure-handling
[13 Nov 2008 13:11] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:


2734 Jonas Oreland	2008-11-13
      ndb - bug#40697 - transaction can wait longer needed during node-failure-handling
[13 Nov 2008 13:33] Bugs System
Pushed into 5.1.29-ndb-6.2.17  (revid:jonas@mysql.com-20081113131556-ka7b01usk25cqbug) (version source revid:jonas@mysql.com-20081113131556-ka7b01usk25cqbug) (pib:5)
[13 Nov 2008 13:34] Bugs System
Pushed into 5.1.29-ndb-6.3.19  (revid:jonas@mysql.com-20081113131556-ka7b01usk25cqbug) (version source revid:jonas@mysql.com-20081113132059-k9ud1vk9xxbzsp1v) (pib:5)
[13 Nov 2008 13:35] Bugs System
Pushed into 5.1.29-ndb-6.4.0  (revid:jonas@mysql.com-20081113131556-ka7b01usk25cqbug) (version source revid:jonas@mysql.com-20081113133629-doa4bx0p1a061bvx) (pib:5)
[14 Nov 2008 8:36] Jon Stephens
Documented bugfix in the NDB-6.2.17 and NDB-6.3.19 changelogs as follows:

        Transaction failures took longer to handle than was necessary.

        When a data node acting as transaction coordinator (TC) failed,
        the surviving data nodes did not inform the API node initiating
        the transaction of this until the failure had been processed by
        all protocols, when the API node needed only to know about
        failure handling by the transaction protocol -- that is, it
        needed to be informed only about the TC takeover process. Now,
        API nodes (including MySQL servers acting as cluster SQL nodes)
        are informed as soon as the TC takeover is complete, so that they
        can carry on operating more quickly.
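
As an illustration of why earlier notification matters on the API side, the sketch below shows one way an application might react once it learns that its coordinator has failed: it retries the transaction. This is an assumption for illustration, not something the changelog states; `CoordinatorFailed`, `run_with_retry` and `flaky_transaction` are hypothetical names and are not part of the NDB API.

```python
class CoordinatorFailed(Exception):
    """Hypothetical error raised once the API node is told its TC failed."""


def run_with_retry(transaction, max_attempts=3):
    """Run transaction(), retrying when its coordinator is reported failed.

    Returns (result, attempts_used). Re-raises after max_attempts failures.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction(), attempt
        except CoordinatorFailed:
            if attempt == max_attempts:
                raise


# Simulated transaction: the first attempt loses its TC, the second succeeds.
calls = []


def flaky_transaction():
    calls.append(1)
    if len(calls) == 1:
        raise CoordinatorFailed("TC node failed")
    return "committed"


result, attempts = run_with_retry(flaky_transaction)
print(result, attempts)
```

The sooner the failure is reported, the sooner a loop like this can move on to its next attempt instead of sitting on a pending transaction.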
[12 Dec 2008 23:29] Bugs System
Pushed into 6.0.9-alpha  (revid:jonas@mysql.com-20081113131556-ka7b01usk25cqbug) (version source revid:tomas.ulin@sun.com-20081209185954-9svcixh2p5hsfi6w) (pib:5)