MySQL Bugs: #47818: ndb_mgm reports "not connected" for ndbmtd nodes in phase 1 or phase2

Bug #47818	ndb_mgm reports "not connected" for ndbmtd nodes in phase 1 or phase2
Submitted:	5 Oct 2009 0:08	Modified:	27 Jan 2010 7:39
Reporter:	John David Duncan
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	mysql-5.1-telco-6.3	OS:	Any
Assigned to:	Jonas Oreland	CPU Architecture:	Any
Tags:	7.0.7

Description:
For some time during node restart (some part of restart phases 1 and 2), an "all status" command from ndb_mgm falsely reports "not connected".

How to repeat:
Start a cluster and run "all status" repeatedly while the data nodes are starting.
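
The repeated "all status" check could be scripted along these lines. This is only a sketch: it assumes ndb_mgm is on the PATH, that the management server listens on localhost:1186, and that each node is reported on a line of the form "Node N: <status>". The function names and the sample line shapes in the comments are illustrative, not taken from the attached capture.

```python
import re
import subprocess
import time

# Assumed line shape, e.g. "Node 2: not connected" or
# "Node 3: starting (Last completed phase 1) (mysql-5.1.37 ndb-7.0.8)".
NODE_LINE = re.compile(r"^Node\s+(\d+):\s+(.*)$")


def parse_all_status(text):
    """Map node id -> status text for each 'Node N: ...' line."""
    statuses = {}
    for line in text.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            statuses[int(match.group(1))] = match.group(2).strip()
    return statuses


def poll_all_status(connect_string="localhost:1186", interval=0.5, rounds=20):
    """Run `ndb_mgm -e "all status"` repeatedly and print timestamped results."""
    for _ in range(rounds):
        result = subprocess.run(
            ["ndb_mgm", "-c", connect_string, "-e", "all status"],
            capture_output=True, text=True, check=False,
        )
        stamp = time.strftime("%H:%M:%S")
        for node_id, status in sorted(parse_all_status(result.stdout).items()):
            print(f"{stamp} node {node_id}: {status}")
        time.sleep(interval)
```

Calling poll_all_status() while the data nodes restart gives a timestamped log, which makes it easier to see whether a node reports "not connected" after it has already begun starting.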

Sample screen capture during cluster restart:

Attachment: mgm-status-bug.txt (text/plain), 6.46 KiB.

See also bug#48380 (which was marked as a duplicate of this bug report).

Quote from Cluster Development team, Magnus Blaudd:
"After fixing bug#48301, it should be possible to return CONNECTED (which is actually a new status for API and MGM) when asking status for the ndbd,
i.e. in the case when the node is not yet connected to ClusterMgr with a transporter but has
allocated a node id."

So I just tried to reproduce this on 7.0-bzr, without success.
Can you please retry and be more specific about how you trigger this?
If you manage to reproduce it, please also attach config.ini and the cluster log.

Setting: Need feedback

Just managed to reproduce this in 6.3 by using a sufficiently large IndexMemory.
The proposed patch series solves the problem by moving the allocation to global memory
and having ACC consume its memory from there during sp0 (instead of doing the actual
malloc itself).
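
The idea of reserving memory once and letting a block draw from it later can be illustrated with a toy pool. This is a conceptual sketch only, not the NDB allocator: GlobalPagePool, take, and the page size are hypothetical names and values chosen for illustration.

```python
class GlobalPagePool:
    """Hypothetical fixed pool reserved once at startup (not NDB's real allocator)."""

    def __init__(self, total_pages, page_size=4096):
        # page_size is an arbitrary illustrative value.
        self.page_size = page_size
        self.free_pages = total_pages
        # The one large, potentially slow allocation happens here, up front.
        self._buffer = bytearray(total_pages * page_size)
        self._next_page = 0

    def take(self, pages):
        """Hand out a view into the reserved buffer without allocating new backing memory."""
        if pages > self.free_pages:
            raise MemoryError("global pool exhausted")
        start = self._next_page * self.page_size
        end = start + pages * self.page_size
        self._next_page += pages
        self.free_pages -= pages
        return memoryview(self._buffer)[start:end]


# Reserve everything early, then let a consumer (standing in for ACC) take its share.
pool = GlobalPagePool(total_pages=8)
index_memory = pool.take(3)
```

The point of the design is that the expensive allocation is paid once, before time-sensitive work such as heartbeat handling begins; later requests only adjust counters and return a view.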

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/98066

3089 Jonas Oreland	2010-01-25
      ndb - bug#47818 - Move allocation of IndexMemory from ACC to allocation with global pool to avoid mallocs in "signal" code which can lead to heartbeat problems

Pushed to 6.3.31 and 7.0.11.

Documented bugfix in the NDB-6.3.31 and 7.0.11 changelogs as follows:

        During Start Phases 1 and 2, the STATUS command sometimes 
        (falsely) returned 'Not Connected' for data nodes running 
        ndbmtd.

Closed.