MySQL Bugs: #40697: Transactions can be pending longer than needed at node-failure

Bug #40697	Transactions can be pending longer than needed at node-failure
Submitted:	13 Nov 2008 13:01	Modified:	14 Nov 2008 8:36
Reporter:	Jonas Oreland
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	*	OS:	Any
Assigned to:	Jonas Oreland	CPU Architecture:	Any

Description:
If an ndbapi application (or mysqld, of course) has an outstanding transaction with a transaction coordinator that fails (node failure), then the surviving nodes will process the failure and then inform the waiting ndbapi.

Currently, the surviving nodes will not inform that ndbapi until the failure has been processed in *all* protocols, although the ndbapi is only interested in the failure handling of the transaction protocol (TC take-over).

How to repeat:
.

Suggested fix:
Inform ndbapi when the TC-take-over has completed,
so that it can carry on quicker.
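
The difference between the current behaviour and the suggested fix can be illustrated with a toy model. This is only a sketch under stated assumptions: the protocol names other than the TC take-over, the completion times, and the function `notify_time` are all hypothetical and are not taken from the NDB source.

```python
# Toy model only: protocol names and durations are hypothetical,
# not taken from the NDB source.
def notify_time(completion_times, wait_for_all):
    """Return the time at which a waiting API node is told about the failure.

    completion_times maps a protocol name to the time at which the
    surviving nodes finish handling the node failure in that protocol.
    """
    if wait_for_all:
        # Behaviour described in the report: wait for every protocol.
        return max(completion_times.values())
    # Suggested fix: only the transaction protocol (TC take-over) matters.
    return completion_times["tc_takeover"]


completion_times = {
    "tc_takeover": 2.0,       # hypothetical duration
    "other_protocol_a": 5.0,  # hypothetical duration
    "other_protocol_b": 9.0,  # hypothetical duration
}

before = notify_time(completion_times, wait_for_all=True)
after = notify_time(completion_times, wait_for_all=False)
print(f"old: notified at t={before}, new: notified at t={after}")
```

In this model the API node is never informed later than before, and it is informed earlier whenever some other protocol takes longer than the TC take-over.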

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/58633

2733 Jonas Oreland	2008-11-13
      ndb - bug#40697 - transaction can wait longer needed during node-failure-handling

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/58635

2734 Jonas Oreland	2008-11-13
      ndb - bug#40697 - transaction can wait longer needed during node-failure-handling

Pushed into 5.1.29-ndb-6.2.17  (revid:jonas@mysql.com-20081113131556-ka7b01usk25cqbug) (version source revid:jonas@mysql.com-20081113131556-ka7b01usk25cqbug) (pib:5)

Pushed into 5.1.29-ndb-6.3.19  (revid:jonas@mysql.com-20081113131556-ka7b01usk25cqbug) (version source revid:jonas@mysql.com-20081113132059-k9ud1vk9xxbzsp1v) (pib:5)

Pushed into 5.1.29-ndb-6.4.0  (revid:jonas@mysql.com-20081113131556-ka7b01usk25cqbug) (version source revid:jonas@mysql.com-20081113133629-doa4bx0p1a061bvx) (pib:5)

Documented bugfix in the NDB-6.2.17 and NDB-6.3.19 changelogs as follows:

        Transaction failures took longer to handle than was necessary.

        When a data node acting as transaction coordinator (TC) failed,
        the surviving data nodes did not inform the API node initiating
        the transaction of this until the failure had been processed by
        all protocols, when the API node needed only to know about
        failure handling by the transaction protocol -- that is, it
        needed to be informed only about the TC takeover process. Now,
        API nodes (including MySQL servers acting as cluster SQL nodes)
        are informed as soon as the TC takeover is complete, so that
        they can carry on operating more quickly.

Pushed into 6.0.9-alpha  (revid:jonas@mysql.com-20081113131556-ka7b01usk25cqbug) (version source revid:tomas.ulin@sun.com-20081209185954-9svcixh2p5hsfi6w) (pib:5)