Bug #44291 LocalProxy does not forward signals NODE_STATE_REP and CHANGE_NODE_STATE_REQ
Submitted: 15 Apr 2009 12:40 Modified: 16 Apr 2009 19:18
Reporter: Ole John Aske
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S2 (Serious)
Version: 7.0 OS: Any
Assigned to: Ole John Aske CPU Architecture: Any

[15 Apr 2009 12:40] Ole John Aske
Description:
The signals GSN_NODE_STATE_REP and GSN_CHANGE_NODE_STATE_REQ are used during graceful shutdown to change NodeState.startLevel. In particular, the startLevel of the NodeState is set to SL_STOPPING_4 to prevent LQH from writing a global checkpoint during a controlled shutdown.

Because these signals are not handled by the LQH proxy, NodeState::SL_STOPPING_4 never becomes visible to the mt-LQH instances. This breaks the graceful stopping of the GCP
mechanism and may cause inconsistent REDO logs to be produced (according to Jonas).

This may be related to bug#42564

How to repeat:
Verified by reading code

Suggested fix:
Implement handling of the above signals in LocalProxy.*pp
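
To illustrate the failure mode and the suggested fix, here is a minimal toy model. It is only a sketch: ToyWorker, ToyProxy, forwardToWorkers and anyWorkerMayWriteGcp are hypothetical names made up for this illustration and are not the NDB kernel API or the committed patch. It assumes the proxy sits in front of several worker instances, and that a worker stops taking part in global checkpoints only once it has seen SL_STOPPING_4.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: hypothetical types, not the real NDB block classes.
enum class StartLevel { SL_STARTED, SL_STOPPING_4 };

struct ToyWorker {
    StartLevel startLevel = StartLevel::SL_STARTED;

    // Loosely models a worker instance receiving NODE_STATE_REP.
    void onNodeStateRep(StartLevel level) { startLevel = level; }

    // Assumption: a worker may write a global checkpoint unless it is stopping.
    bool mayWriteGcp() const { return startLevel != StartLevel::SL_STOPPING_4; }
};

struct ToyProxy {
    StartLevel startLevel = StartLevel::SL_STARTED;
    std::vector<ToyWorker> workers;
    bool forwardToWorkers;  // false models the bug, true models the fix

    ToyProxy(std::size_t workerCount, bool forward)
        : workers(workerCount), forwardToWorkers(forward) {}

    void onNodeStateRep(StartLevel level) {
        startLevel = level;             // the proxy records the new state
        if (!forwardToWorkers) return;  // bug: the signal stops at the proxy
        for (ToyWorker& w : workers) {
            w.onNodeStateRep(level);    // fix: every worker sees the new state
        }
    }

    // True if at least one worker would still write a global checkpoint.
    bool anyWorkerMayWriteGcp() const {
        for (const ToyWorker& w : workers) {
            if (w.mayWriteGcp()) return true;
        }
        return false;
    }
};
```

In this model, a ToyProxy built with forward set to false still reports anyWorkerMayWriteGcp() as true after receiving SL_STOPPING_4, which corresponds to the situation described above. With forward set to true, no worker would write a checkpoint after the state change.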
[15 Apr 2009 13:10] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/72162

2962 Ole John Aske	2009-04-15
      bug#44291 LocalProxy should forward signals NODE_STATE_REP and CHANGE_NODE_STATE_REQ
[15 Apr 2009 13:14] Ole John Aske
As I will be on vacation for the next week (until 26/4), the patch will not be committed until 27/4 at the earliest.
[16 Apr 2009 8:48] Jonas Oreland
Review comments (or really a patch):
change it so that the LocalProxy block also gets the signals.
[16 Apr 2009 8:49] Jonas Oreland
changes to committed patch

Attachment: bug44291.review.patch (text/x-patch), 2.20 KiB.

[16 Apr 2009 8:50] Jonas Oreland
discovered that this bug/fix
can be reproduced using "testSystemRestart -n to I3"
[16 Apr 2009 8:56] Bugs System
Pushed into 5.1.32-ndb-7.0.6 (revid:jonas@mysql.com-20090416085227-bua6o7t3id2j5aiw) (version source revid:jonas@mysql.com-20090416085227-bua6o7t3id2j5aiw) (merge vers: 5.1.32-ndb-7.0.6) (pib:6)
[16 Apr 2009 10:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/72237

2907 Jonas Oreland	2009-04-16
      ndb - bug#44291
        Node-state is not propagated to all blocks in ndbmtd
        This can cause
        - inconsistent REDO when doing graceful shutdown
        - crash in Dbtux, when restarting cluster, and some nodes
          need to perform node-recovery during system-restart
      
        Possibly other problems...but they have not been identified
[16 Apr 2009 19:18] Jon Stephens
Documented bugfix in the NDB-7.0.6 changelog as follows:

        Node state information was not propagated to all blocks of
        ndbmtd. This could cause:

            ·Inconsistent redo logs when performing a graceful shutdown

            ·Data node crashes when later restarting the cluster, data
            nodes needing to perform node recovery during the system
            restart, or both.