Bug #47818 ndb_mgm reports "not connected" for ndbmtd nodes in phase 1 or phase 2
Submitted: 5 Oct 2009 0:08 Modified: 27 Jan 2010 7:39
Reporter: John David Duncan
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S3 (Non-critical)
Version: mysql-5.1-telco-6.3 OS: Any
Assigned to: Jonas Oreland CPU Architecture: Any
Tags: 7.0.7

[5 Oct 2009 0:08] John David Duncan
Description:
For some time during a node restart (some part of start phases 1 and 2), an "all status" command from ndb_mgm falsely reports "not connected" for the restarting ndbmtd node.

How to repeat:
Start a cluster and run "all status" repeatedly from ndb_mgm while the data nodes are starting.
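
The repeat step can be scripted. The following is a minimal sketch, not part of the original report: it assumes ndb_mgm is on the PATH, that a management server is reachable at the hypothetical connect string localhost:1186, and that each status line has the shape "Node <id>: <state>". The helper names disconnected_nodes and poll are made up for illustration.

```python
import subprocess
import time


def disconnected_nodes(status_text):
    """Return the ids of nodes whose status line reads 'not connected'.

    Assumes each relevant line looks like 'Node <id>: <state ...>'.
    """
    nodes = []
    for line in status_text.splitlines():
        line = line.strip()
        if not line.startswith("Node ") or ":" not in line:
            continue
        head, state = line.split(":", 1)
        parts = head.split()
        if (len(parts) == 2 and parts[1].isdigit()
                and state.strip().lower().startswith("not connected")):
            nodes.append(int(parts[1]))
    return nodes


def poll(interval=1.0, rounds=60, connectstring="localhost:1186"):
    """Run 'all status' repeatedly and print the nodes reported as not connected."""
    for _ in range(rounds):
        # -c sets the connect string, -e executes one command and exits.
        result = subprocess.run(
            ["ndb_mgm", "-c", connectstring, "-e", "all status"],
            capture_output=True, text=True,
        )
        print(time.strftime("%H:%M:%S"), disconnected_nodes(result.stdout))
        time.sleep(interval)
```

Calling poll() during a node restart prints one timestamped list per round. A node id that appears in the list while the node is known to be in start phase 1 or 2 matches the behaviour described here.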
[5 Oct 2009 0:08] John David Duncan
sample screen capture during cluster restart

Attachment: mgm-status-bug.txt (text/plain), 6.46 KiB.

[30 Oct 2009 11:32] Geert Vanderkelen
See also bug#48380 (which was marked as a duplicate of this bug report).

Quote from Cluster Development team, Magnus Blaudd:
"After fixing bug#48301, it should be possible to return CONNECTED (which is actually a new status for API and MGM) when asking for the status of the ndbd,
i.e. in the case when the node is not yet connected to ClusterMgr with a transporter but has
allocated a node id."
[22 Jan 2010 11:55] Jonas Oreland
So I just tried to reproduce this on 7.0-bzr, without success.
Can you please retry and be more specific about how you trigger this?
If you manage to reproduce it, please also attach config.ini and the cluster log.

Setting: Need feedback
[25 Jan 2010 14:32] Jonas Oreland
Just managed to reproduce this in 6.3 by using a sufficiently large IndexMemory.
The proposed patch series solves the problem by moving the allocation to global memory
and having ACC consume its memory from there during sp0 (instead of doing the actual
malloc itself).
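
The idea behind the fix can be illustrated with a toy model. This sketch is not the NDB code: GlobalPool, IndexMemoryUser, PAGE_SIZE and start_phase_0 are hypothetical names, and the page size is illustrative. It only shows the pattern of one large up-front reservation followed by cheap hand-outs, so that the consumer no longer performs a big allocation of its own.

```python
PAGE_SIZE = 32 * 1024  # illustrative page size, not taken from the report


class GlobalPool:
    """Toy global memory pool: one large reservation made at startup."""

    def __init__(self, total_pages):
        # The single expensive allocation happens here, up front.
        self._buf = memoryview(bytearray(total_pages * PAGE_SIZE))
        self._total = total_pages
        self._next = 0

    @property
    def free_pages(self):
        return self._total - self._next

    def take(self, pages):
        """Hand out pages as a zero-copy slice; only bookkeeping, no new allocation."""
        if pages > self.free_pages:
            raise MemoryError("global pool exhausted")
        start = self._next * PAGE_SIZE
        self._next += pages
        return self._buf[start:start + pages * PAGE_SIZE]


class IndexMemoryUser:
    """Hypothetical stand-in for a block that needs index memory."""

    def __init__(self, pool, pages_needed):
        self.pool = pool
        self.pages_needed = pages_needed
        self.memory = None

    def start_phase_0(self):
        # Draw from the pool instead of allocating inside this handler.
        self.memory = self.pool.take(self.pages_needed)
```

In this model the cost of the large allocation is paid once when the pool is created, while start_phase_0 only adjusts an offset, which is the property the patch description relies on to keep signal handling short.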
[25 Jan 2010 15:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/98066

3089 Jonas Oreland	2010-01-25
      ndb - bug#47818 - Move allocation of IndexMemory from ACC to allocation with global pool to avoid mallocs in "signal" code which can lead to heartbeat problems
[26 Jan 2010 8:44] Jonas Oreland
pushed to 6.3.31 and 7.0.11
[27 Jan 2010 7:39] Jon Stephens
Documented bugfix in the NDB-6.3.31 and 7.0.11 changelogs as follows:

        During Start Phases 1 and 2, the STATUS command sometimes 
        (falsely) returned 'Not Connected' for data nodes running 
        ndbmtd.

Closed.