Bug #47715 Race-condition between NODE_FAILREP and LQH_TRANSREQ
Submitted: 29 Sep 2009 13:54
Modified: 29 Sep 2009 15:04
Reporter: Jonas Oreland
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: mysql-5.1-telco-6.3
OS: Any
Assigned to: Jonas Oreland
CPU Architecture: Any

[29 Sep 2009 13:54] Jonas Oreland
Description:
There is a very small race condition between NODE_FAILREP and LQH_TRANSREQ
(in the node-failure handling) which could lead to the problem reported in
Bug #41297 (http://bugs.mysql.com/bug.php?id=41297).

The likelihood increases when running ndbmtd (in 6.4).

How to repeat:
Run testNodeRestart -n Bug41295 on a pure ndbmtd cluster.

Suggested fix:
Use the new sequence functionality.
[29 Sep 2009 13:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85018

3073 Jonas Oreland	2009-09-29
      ndb - bug#47715 - ensure LQH_TRANSREQ is serialized correctly wrt NODE_FAILREP
[29 Sep 2009 14:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85019

3053 Jonas Oreland	2009-09-29 [merge]
      ndb - merge bug#47715
[29 Sep 2009 14:05] Jonas Oreland
pushed to 6.3.28, 7.0.9 and 7.1
won't fix in 6.2
[29 Sep 2009 15:04] Jon Stephens
Documented bugfix in the NDB-6.3.28 and 7.0.9 changelogs as follows:

        A very small race-condition between NODE_FAILREP and
        LQH_TRANSREQ signals when handling node failure could lead to
        operations (locks) not being taken over when they should have
        been, and subsequently becoming stale. This could lead to node
        restart failures, and applications getting into endless
        lock-conflicts with operations that were not released until the
        node was restarted.

        See also Bug #41297.
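
To make the ordering problem described above concrete, here is a minimal toy model in Python. It is not NDB code, and it is only one plausible reading of the race: it assumes the take-over request (LQH_TRANSREQ) can reach a block before that block has processed the failure report (NODE_FAILREP) for the same node. The class `LqhModel`, its methods, the `serialize` flag and the operation names are all hypothetical, and the deferral queue is a stand-in for illustration, not the "sequence functionality" mentioned in the suggested fix, which this report does not describe.

```python
from collections import deque


class LqhModel:
    """Toy model of a block receiving both signals (hypothetical, not NDB code)."""

    def __init__(self, ops_by_node, serialize):
        # Operations (locks) keyed by the node that was coordinating them.
        self.ops = {node: list(ops) for node, ops in ops_by_node.items()}
        self.failed = set()      # nodes whose NODE_FAILREP has been processed
        self.deferred = deque()  # take-over requests that arrived too early
        self.serialize = serialize

    def node_failrep(self, node):
        """Process the failure report, then replay any deferred requests."""
        self.failed.add(node)
        taken = []
        pending, self.deferred = self.deferred, deque()
        for waiting_node in pending:
            taken += self.lqh_transreq(waiting_node)
        return taken

    def lqh_transreq(self, node):
        """Return the operations to be taken over for a failed node."""
        if node not in self.failed:
            if self.serialize:
                # Assumed fix: hold the request until the failure is known.
                self.deferred.append(node)
            # Without serialization nothing is reported, so the
            # operations are never taken over.
            return []
        return self.ops.pop(node, [])

    def stale_ops(self):
        """Operations still held on behalf of nodes known to have failed."""
        return [op for node in sorted(self.failed) for op in self.ops.get(node, [])]


def run(serialize):
    lqh = LqhModel({2: ["op-a", "op-b"]}, serialize)
    taken = lqh.lqh_transreq(2)   # request overtakes the failure report
    taken += lqh.node_failrep(2)  # failure report is processed afterwards
    return taken, lqh.stale_ops()
```

In this model, `run(False)` takes over nothing and leaves both operations stale, which mirrors the stale locks in the changelog entry, while `run(True)` hands both operations over once the failure report has been processed.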

Closed.
[30 Sep 2009 8:13] Bugs System
Pushed into 5.1.37-ndb-6.3.28 (revid:jonas@mysql.com-20090930070741-13u316s7s2l7e1ej) (version source revid:jonas@mysql.com-20090929135923-kxknlucmksajjedj) (merge vers: 5.1.37-ndb-6.3.28) (pib:11)
[30 Sep 2009 8:14] Bugs System
Pushed into 5.1.37-ndb-7.0.9 (revid:jonas@mysql.com-20090930075942-1q6asjcp0gaeynmj) (version source revid:jonas@mysql.com-20090929140231-jyzkajkvvc1wkss2) (merge vers: 5.1.37-ndb-7.0.9) (pib:11)
[30 Sep 2009 8:15] Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:jonas@mysql.com-20090930080049-1c8a8cio9qgvhq35) (version source revid:jonas@mysql.com-20090929140336-qbe0o9hkxa6k4m18) (merge vers: 5.1.35-ndb-7.1.0) (pib:11)