Bug #49476 ndb_mgmd started with --nowait-nodes hangs on cluster shutdown
Submitted: 5 Dec 2009 18:40
Modified: 19 Dec 2010 16:02
Reporter: Jon Stephens
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: mysql-5.1-telco-7.0
OS: Linux (opensuse11.1/x64)
Assigned to:
CPU Architecture: Any
Tags: --nowait-nodes, ndb, ndb_mgmd, ndb-7.0.10-bzr-20091205, shutdown

[5 Dec 2009 18:40] Jon Stephens
Description:
Management server started with --nowait-nodes option appears to hang on shutdown.

Hit this while documenting fix for BUG#48669. Only an issue in ndb-7.0.10.

See 'How To Repeat'.

How to repeat:
Configure a cluster with 2 MGM nodes, e.g.

[ndbd default]
DataMemory = 100M
IndexMemory = 100M
NoOfReplicas = 2
DataDir = /home/jon/cluster-data

[ndbd]
NodeId = 1
HostName = tonfisk

[ndbd]
NodeId = 2
HostName = grindval

[ndbd]
NodeId = 3
HostName = tonfisk
NodeGroup = 1

[ndbd]
NodeId = 4
HostName = grindval
NodeGroup = 1

[mgm]
HostName = tonfisk
NodeId = 10

[mgm]
HostName = grindval
NodeId = 11

[api]
NodeId = 20
HostName = tonfisk

[api]
NodeId = 21
HostName = grindval

Start MGM on 'tonfisk' using 

ndb_mgmd --nowait-nodes=10 --ndb-nodeid=11 --initial -f ./config.ini

Start data/api nodes with 

ndbd -c tonfisk,grindval --initial
mysqld_safe --ndbcluster --ndb-connectstring=tonfisk,grindval &

Do not start second MGM.

Result: Data nodes start and connect normally, as do the mysqld processes.

Run a few queries in mysql clients, everything appears to work normally.

On flundra, run 

ndb_mgm -e shutdown 

from the shell, or start the ndb_mgm client and run the SHUTDOWN command in it.

Result: Data nodes shut down without problems, but ndb_mgm never returns or exits (waited 4-5 minutes) and must be stopped with kill -9 or ^C. The MGM process is also still running and must be stopped with kill -9.

Can't tell if hang is really in MGM or ndb_mgm client.
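
A minimal sketch for timing the hang and recording what is still running afterwards. It assumes GNU coreutils timeout is available on the client host; run_with_deadline is a hypothetical helper name, not part of MySQL Cluster. It does not by itself settle whether the hang is in the MGM or in the client, but it shows whether the client returned within the deadline and whether ndb_mgmd is still alive afterwards.

```shell
#!/bin/sh
# Hypothetical helper (not part of MySQL Cluster): run a command under a
# deadline and report whether it finished on its own or had to be stopped.
run_with_deadline() {
    secs=$1
    shift
    # GNU timeout(1) exits with status 124 when the deadline expires.
    timeout "$secs" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "hung"
    else
        echo "exited:$status"
    fi
}

# Usage against the cluster described above (left commented out so the
# helper can be sourced without touching a running cluster):
#   run_with_deadline 300 ndb_mgm -e shutdown
#   pgrep -x ndb_mgmd >/dev/null && echo "ndb_mgmd still running"
```

If the first command prints "hung" and the pgrep check still finds ndb_mgmd, neither the client nor the management server stopped within the deadline, which matches the behaviour reported above.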

Suggested fix:
Both ndb_mgm -e shutdown and ndb_mgm> shutdown should stop the MGM in a timely fashion, even if two MGM nodes are configured but only one is ever started (using --nowait-nodes to bypass the config check).

Or tell me what I'm doing wrong. ;)

[19 Nov 2010 16:02] Sveta Smirnova
Jon,

Can you repeat it with the latest version?

[20 Dec 2010 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".