Bug #26293 cluster mgmt node sometimes doesn't receive events from all nodes on restart
Submitted: 12 Feb 2007 17:28
Modified: 26 Feb 2007 2:17
Reporter: Hartmut Holzgraefe
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 5.1.14-ndb-6.1.0
OS: Linux (linux)
Assigned to: Tomas Ulin
CPU Architecture: Any

[12 Feb 2007 17:28] Hartmut Holzgraefe
Description:
Sometimes, when a management node is restarted, all data nodes connect to it (as logged in the cluster log and visible using netstat), but some of them do not actually log events to that management node. A second management node logs events from all nodes just fine at the same time. Restarting a data node seems to resolve the situation as soon as that node is down.

How to repeat:
will be added soon ...
[12 Feb 2007 18:18] Hartmut Holzgraefe
Additional information: stopping a data node does seem to resolve this,
starting with the

  Node x: Node shutdown completed.

INFO message. After this all nodes log to the management node just fine.
[13 Feb 2007 4:49] Tomas Ulin
patch for bug

Attachment: bug26293.patch (text/x-patch), 5.35 KiB.

[13 Feb 2007 6:36] Tomas Ulin
new patch against 5.0, with fixes also for other send signal

Attachment: bug26293_5.0_2.patch (text/x-patch), 6.91 KiB.

[13 Feb 2007 21:00] Jonas Oreland
review.
1) okToSend (unCond = true, should check m_api_regconf)
            (unCond = false, should check alive)

   this needs to be fixed...

2) the only mgm function that I *know* needs unCond=false (i.e. alive) is backup

3) please install diff-p helper, so that diff gets easier to read.

4) the state stuff is very very bad for a user (such as ndb_mgmd)
   and the interface you proposed instead sounded very good...
   we should document it somewhere so we don't forget it..
   (maybe comment in code)

5) please add an assert in TransporterFacade that
   checks that only GSN_APIREGREQ is allowed to be sent if m_api_regconf = false

   this assert would find this bug directly...

/Jonas
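
Points 1 and 5 of the review above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the committed patch: only `okToSend`, `unCond`, `m_api_regconf`, `GSN_APIREGREQ` and `TransporterFacade` appear in this thread, while `NodeState`, `m_alive`, `allowedBeforeRegconf`, `sendSignal`, `GSN_SOME_OTHER_SIGNAL` and the numeric signal values are hypothetical names and placeholders invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder signal numbers; NOT the real NDB values (assumption).
enum Gsn : std::uint32_t {
  GSN_APIREGREQ = 1,
  GSN_SOME_OTHER_SIGNAL = 2  // hypothetical stand-in for any other signal
};

// Hypothetical per-node state; the real structure in the NDB API differs.
struct NodeState {
  bool m_api_regconf = false;  // registration confirmed by this node
  bool m_alive = false;        // node considered alive (fully started)
};

// Review point 1: an unconditional send checks m_api_regconf,
// a conditional send checks alive.
inline bool okToSend(const NodeState& node, bool unCond) {
  return unCond ? node.m_api_regconf : node.m_alive;
}

// Review point 5: while m_api_regconf is false, only GSN_APIREGREQ
// is allowed to be sent.
inline bool allowedBeforeRegconf(const NodeState& node, Gsn gsn) {
  return node.m_api_regconf || gsn == GSN_APIREGREQ;
}

// Hypothetical send entry point. The assert flags any caller that tries
// to send a non-registration signal before registration is confirmed.
// Returns true if the signal would be handed to the transport layer
// (the transport itself is omitted here).
inline bool sendSignal(const NodeState& node, Gsn gsn, bool unCond) {
  assert(allowedBeforeRegconf(node, gsn));
  if (gsn == GSN_APIREGREQ) {
    return true;  // the registration request itself is always allowed
  }
  return okToSend(node, unCond);
}
```

In this sketch a node that has confirmed registration but is not yet alive accepts unconditional sends and rejects conditional ones (such as the backup case from point 2), and a debug build aborts at the assert if anything other than `GSN_APIREGREQ` is sent before registration is confirmed.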
[14 Feb 2007 3:01] Tomas Ulin
yet another patch

Attachment: bug26293_5.0_3.patch (text/x-patch), 7.23 KiB.

[14 Feb 2007 4:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/19822
[14 Feb 2007 7:00] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/19842
[21 Feb 2007 15:09] Tomas Ulin
5.0.37, 5.1.16, ndb-6.1.3
[26 Feb 2007 2:17] Jon Stephens
Thank you for your bug report. A fix for this issue has been committed to our source repository for that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented bugfix in 5.0.38, 5.1.16, and ndb-6.1.3 changelogs.