MySQL Bugs: #25825: RESTART for Management node stops the node, does not restart it.

Bug #25825	RESTART for Management node stops the node, does not restart it.
Submitted:	24 Jan 2007 13:12	Modified:	7 Apr 2014 12:44
Reporter:	Roland Bouman
Status:	Unsupported
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	mysql-5.1	OS:	Linux (linux)
Assigned to:	Magnus Blåudd	CPU Architecture:	Any
Tags:	5.1.15bk, ndb_mgmd, restart

Description:
RESTART for a management node stops the node; it does not restart it.

Please see 5756: a related symptom, but probably a different cause, since so much time has passed.

How to repeat:
[13:43] <roland> hartmut, around?
[13:45] <hartmut> yes
[13:58] <roland> hi
[13:58] <roland> is your cluster still up?
[13:58] <hartmut> yes
[13:59] <roland> can you please test if a [mgmd id] RESTART
[13:59] <roland> restarts the management node
[13:59] <roland> you will be temporarily disconnected, 
[13:59] <roland> but I recall that if you wait a little, a SHOW command will reconnect and show the results again...
[14:01] <hartmut> doesn't work here
[14:02] <roland> damn
[14:02] <hartmut> kills the mgmd process
[14:02] <roland> not restart? 
[14:02] <roland> damn.
[14:02] <hartmut> no angel process for the ndb_mgmd, maybe that's why?
[14:02] <roland> then it is a bug
[14:02] <roland> If we say RESTART, it should either throw an error, or restart
[14:02] <roland> not stop it
[14:03] <hartmut> right
[14:03] <roland> ok
[14:03] <roland> thanks for checking!
[14:03] <roland> can you mail me the output? 
[14:03] <roland> I'll file it right away, send you the #id
[14:04] <roland> so you can verify
[14:04] <hartmut> ndb_mgm> 1 restart
[14:04] <hartmut> Node 1 is being restarted
[14:04] <hartmut> ndb_mgm> show
[14:04] <hartmut> Warning, event thread startup failed, degraded printouts as result, errno=115
[14:04] <hartmut> Connected to Management Server at: localhost:1186
[14:04] <hartmut> Warning, event thread startup failed, degraded printouts as result, errno=115
[14:04] <hartmut> Connected to Management Server at: localhost:1186
[14:04] <hartmut> Could not get status
[14:04] <hartmut> *  1006: Illegal reply from server
[14:04] <hartmut> *        Probably disconnected
[14:04] <hartmut> show always returns the same messages, even after ndb_mgm client restart
[14:04] <hartmut> no ndb_mgmd in ps output
[14:04] <roland> and you don't see the process either? 
[14:05] <hartmut> right, only ndbd and mysqld in ps output left
[14:05] <roland> thanks
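
The manual check in the log above (issue RESTART, then look for ndb_mgmd in the ps output) could be scripted roughly as follows. This is only a sketch under assumptions: it runs on the management node host, `pgrep` is available, `ndb_mgm -e` is used to send the command to the default local management server, and the function names and the 10-second wait are made up for illustration. Comparing PIDs is a heuristic as well: a server that re-executes itself in place would keep its PID, so "same process" does not prove that nothing happened.

```python
import subprocess
import time
from typing import Optional


def mgmd_pid() -> Optional[int]:
    """Return the PID of a local ndb_mgmd process, or None if there is none."""
    result = subprocess.run(["pgrep", "-x", "ndb_mgmd"],
                            capture_output=True, text=True)
    pids = result.stdout.split()
    return int(pids[0]) if pids else None


def classify_restart(before: Optional[int], after: Optional[int]) -> str:
    """Compare the ndb_mgmd PID seen before and after '<id> RESTART'."""
    if before is None:
        return "not running beforehand"
    if after is None:
        return "stopped"        # the behaviour reported in this bug
    if after == before:
        return "same process"   # never stopped, or re-executed in place
    return "new process"        # a fresh ndb_mgmd came up


def reproduce(node_id: int, wait_seconds: float = 10.0) -> str:
    """Send '<node_id> restart' through ndb_mgm and classify the result."""
    before = mgmd_pid()
    subprocess.run(["ndb_mgm", "-e", f"{node_id} restart"], check=False)
    time.sleep(wait_seconds)    # assumed grace period, not a documented value
    return classify_restart(before, mgmd_pid())
```

On the cluster from the log, `reproduce(1)` would be expected to return "stopped", since only ndbd and mysqld were left in the ps output.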

Suggested fix:
[14:02] <roland> If we say RESTART, it should either throw an error, or restart
[14:02] <roland> not stop it
[14:03] <hartmut> right
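
For readers unfamiliar with the "angel process" idea raised in the log, here is a minimal, generic sketch of one: a parent that starts the real worker again whenever the worker exits asking to be restarted. It is not taken from the ndb_mgmd or ndbd sources and makes no claim about how either is implemented; `RESTART_EXIT_CODE`, `angel`, and the restart limit are hypothetical names and values chosen for illustration.

```python
from typing import Callable, List

# Hypothetical convention: the worker exits with this code to ask for a respawn.
RESTART_EXIT_CODE = 75


def angel(spawn: Callable[[], int], max_restarts: int = 3) -> List[int]:
    """Run spawn() repeatedly while it exits with RESTART_EXIT_CODE.

    spawn starts the worker, waits for it, and returns its exit code.
    Returns the exit codes seen, in order. Stops after max_restarts respawns.
    """
    codes: List[int] = []
    for _ in range(max_restarts + 1):
        code = spawn()
        codes.append(code)
        if code != RESTART_EXIT_CODE:
            break   # normal stop or crash: the angel does not respawn
    return codes
```

With a real worker, `spawn` could be something like `lambda: subprocess.call(worker_command)`. Without such a parent, a worker that handles RESTART by exiting simply stays down, which would match the symptom described here if hartmut's guess is right.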

Works on my 5.1 version on Mac OS X:

ndb_mgm> show
Connected to Management Server at: 127.0.0.1:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @127.0.0.1  (Version: 5.1.11, Nodegroup: 0, Master)
id=3    @127.0.0.1  (Version: 5.1.11, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @127.0.0.1  (Version: 5.1.11)

[mysqld(API)]   1 node(s)
id=4    @127.0.0.1  (Version: 5.1.11)

ndb_mgm> 1 restart
Restart failed.
*  1010: Management server not connected
*        

ndb_mgm> 
ndb_mgm> show
Connected to Management Server at: 127.0.0.1:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @127.0.0.1  (Version: 5.1.11, Nodegroup: 0, Master)
id=3    @127.0.0.1  (Version: 5.1.11, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @127.0.0.1  (Version: 5.1.11)

[mysqld(API)]   1 node(s)
id=4    @127.0.0.1  (Version: 5.1.11)

Possibly related:

I tried this on a 4-node cluster (nodes 1-4) with 2 MGM nodes (nodes 5 & 6): 

1. I started up an ndb_mgm client on each of the two MGM node hosts. 

2. From the client on the machine hosting node 6, I issued the command: 5 RESTART. 

3. The client from which I issued the command reported "Node 5 is being restarted". I then issued a SHOW on both management clients, and the management client on the node 5 host immediately segfaulted.

Node 5 never stopped running.

Oops, forgot the configuration:

-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=1    @192.168.0.103  (Version: 5.1.15, Nodegroup: 0, Master)
id=2    @192.168.0.176  (Version: 5.1.15, Nodegroup: 0)
id=3    @192.168.0.103  (Version: 5.1.15, Nodegroup: 1)
id=4    @192.168.0.176  (Version: 5.1.15, Nodegroup: 1)

[ndb_mgmd(MGM)] 2 node(s)
id=5    @192.168.0.103  (Version: 5.1.15)
id=6    @192.168.0.176  (Version: 5.1.15)

[mysqld(API)]   4 node(s)
id=7    @192.168.0.179  (Version: 5.1.15)
id=8    @192.168.0.176  (Version: 5.1.15)
id=9    @192.168.0.103  (Version: 5.1.15)
id=10   @192.168.0.112  (Version: 5.1.15)
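
When comparing SHOW output before and after a RESTART, it can help to reduce the listing to node IDs per section. The sketch below parses only the two line shapes visible in the output above (`[name(TYPE)] N node(s)` headers and `id=N @host` lines); any other line is ignored. `parse_show` and the regular expressions are illustrative, not part of any MySQL tool.

```python
import re
from typing import Dict, List

# Matches e.g. "[ndb_mgmd(MGM)] 2 node(s)" and captures the type in parentheses.
SECTION = re.compile(r"^\[(\w+)\((\w+)\)\]\s+(\d+) node\(s\)")
# Matches e.g. "id=5    @192.168.0.103  (Version: 5.1.15)".
NODE = re.compile(r"^id=(\d+)\s+@(\S+)")


def parse_show(text: str) -> Dict[str, List[dict]]:
    """Group the 'id=N @host' lines of SHOW output by section type."""
    nodes: Dict[str, List[dict]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        section = SECTION.match(line)
        if section:
            current = section.group(2)      # "NDB", "MGM" or "API"
            nodes[current] = []
            continue
        node = NODE.match(line)
        if node and current is not None:
            nodes[current].append(
                {"id": int(node.group(1)), "host": node.group(2)})
    return nodes
```

Applied to the listing above, the "MGM" entry would hold nodes 5 and 6, so a management node that disappears after RESTART shows up as a missing ID in that list.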

I think this is/was the bug in non-block that Stewart fixed last night...

People, it seems to work in 5.1.16 again. Can anybody confirm this?

Still doesn't work on current 5.1 here.

Thank you for taking the time to report a problem.  Unfortunately you are not using a current version of the product you reported a problem with -- the problem might already be fixed. Please download a new version from http://www.mysql.com/downloads/

If you are able to reproduce the bug with one of the latest versions, please change the version on this bug report to the version you tested and change the status back to "Open".  Again, thank you for your continued support of MySQL.