MySQL Bugs: #49843: ndb_mgm_get_status() returning invalid node

Bug #49843	ndb_mgm_get_status() returning invalid node_group information for starting node
Submitted:	21 Dec 2009 9:59	Modified:	9 Jan 2015 14:42
Reporter:	Hartmut Holzgraefe
Status:	Won't fix
Category:	MySQL Cluster: NDB API	Severity:	S3 (Non-critical)
Version:	mysql-cluster-7.0.6	OS:	Linux
Assigned to:	Magnus Blåudd	CPU Architecture:	Any

Description:
ndb_mgm_get_status() returns invalid node_group information for a starting node.

Tested here with node 4, a member of node group 1, in a cluster with four data nodes:

First, the node is up and running:

Node Id 4 Group 1 (0x1 ) Start phase 0 State: The node is running

Now I stop it with "4 STOP":

Node Id 4 Group 1 (0x1 ) Start phase 1 State: The node is shutting down
Node Id 4 Group 1 (0x1 ) Start phase 2 State: The node is shutting down
Node Id 4 Group 1 (0x1 ) Start phase 3 State: The node is shutting down
Node Id 4 Group 1 (0x1 ) Start phase 4 State: The node is shutting down
Node Id 4 Group 0 (0x0 ) Start phase 0 State: The node cannot be contacted

So the group id is still shown correctly throughout the shutdown procedure;
it changes to 0 only once the node is fully gone.

Now I start the node with "ndbd --nostart":

Node Id 4 Group 1 (0x1 ) Start phase 0 State: The node's status is not known
Node Id 4 Group -1 (0xFFFFFFFF ) Start phase 0 State: The node has not yet executed the startup protocol

Oddly, the node briefly shows the correct node group
before it changes to -1.

Now I issue "4 START":

Node Id 4 Group -202116109 (0xF3F3F3F3 ) Start phase 0 State: The node is executing the startup protocol
Node Id 4 Group -202116109 (0xF3F3F3F3 ) Start phase 2 State: The node is executing the startup protocol
Node Id 4 Group -202116109 (0xF3F3F3F3 ) Start phase 4 State: The node is executing the startup protocol
Node Id 4 Group 1 (0x1 ) Start phase 100 State: The node is executing the startup protocol
Node Id 4 Group 1 (0x1 ) Start phase 0 State: The node is running

So the group first changes from -1 to a large negative value
(-202116109, 0xF3F3F3F3) and then to the actual group id somewhere
between start phases 4 and 100. Because I was running this on an empty
test installation, my mgmapi program was not fast enough to capture
every start phase. I'll try again on a larger installation to find
the exact phase at which the transition happens.
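
The decimal and hex values in the output above are the same 32-bit patterns printed two ways. A minimal sketch in C of that reinterpretation follows; the helper names as_signed32 and is_observed_placeholder are hypothetical and not part of the MGM API, and the second helper only matches the two values seen in this report, which may not be the only invalid values the API can return.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a raw 32-bit pattern as the signed value that a
 * decimal print of node_group would show (hypothetical helper). */
static int32_t as_signed32(uint32_t raw)
{
    int32_t value;
    memcpy(&value, &raw, sizeof value);  /* bit-for-bit copy, no conversion */
    return value;
}

/* Hypothetical helper: returns 1 if the pattern is one of the two
 * non-group values observed in this report, 0 otherwise. */
static int is_observed_placeholder(uint32_t raw)
{
    return raw == 0xFFFFFFFFu || raw == 0xF3F3F3F3u;
}
```

With these helpers, as_signed32(0xFFFFFFFFu) gives -1 and as_signed32(0xF3F3F3F3u) gives -202116109, matching the lines captured after "ndbd --nostart" and "4 START".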

How to repeat:
Check the node_group values returned by ndb_mgm_get_status() for a starting node,
e.g. using the attached simple mgmapi monitoring app.

Suggested fix:
Either consistently return -1 for node_group while a node is not an active member of any node group, or document node_group as being defined only for fully started nodes (node_status being either NDB_MGM_NODE_STATUS_STARTED or NDB_MGM_NODE_STATUS_SINGLEUSER).
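
Until either fix exists, a client could apply the same rule on its own side. The sketch below assumes that is acceptable for the caller. To stay self-contained it uses stand-in types: sketch_node_status and sketch_node_state are hypothetical and only mirror the node_status and node_group fields discussed here, and the enum values are not claimed to match those in mgmapi.h.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the MGM API node status enum. */
enum sketch_node_status {
    SKETCH_STATUS_UNKNOWN,
    SKETCH_STATUS_NO_CONTACT,
    SKETCH_STATUS_NOT_STARTED,
    SKETCH_STATUS_STARTING,
    SKETCH_STATUS_STARTED,
    SKETCH_STATUS_SHUTTING_DOWN,
    SKETCH_STATUS_SINGLEUSER
};

/* Hypothetical stand-in for the per-node state returned by the API. */
struct sketch_node_state {
    int node_id;
    enum sketch_node_status node_status;
    int node_group;
};

/* node_group is trusted only for fully started nodes. */
static bool node_group_is_defined(const struct sketch_node_state *s)
{
    return s->node_status == SKETCH_STATUS_STARTED ||
           s->node_status == SKETCH_STATUS_SINGLEUSER;
}

/* Return the group for fully started nodes and -1 in every other state. */
static int reported_node_group(const struct sketch_node_state *s)
{
    return node_group_is_defined(s) ? s->node_group : -1;
}
```

Note that this rule also masks the group while a node is shutting down, a state in which the output above still shows the correct value.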