Bug #45495 multiple management servers fail to start
Submitted: 15 Jun 2009 10:28
Modified: 28 Sep 2009 15:12
Reporter: Johan Andersson
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: mysql-5.1-telco-7.0
OS: Any
Assigned to: Magnus Blåudd
CPU Architecture: Any
Tags: 7.0.6

[15 Jun 2009 10:28] Johan Andersson
Description:
Management server on A and B.

[NDB_MGMD]
Id=1
Hostname=A
ArbitrationRank=1

[NDB_MGMD]
Id=2
Hostname=B
ArbitrationRank=1

* The management servers must start in the following order:
A,B  

* Starting them in the following order does not work:
B,A 

* Having B started and (re)starting A does not work.

Thus, for rolling restarts, both management servers need to be taken down and then restarted.

It does not matter whether --reload and --initial are used or not.

How to repeat:
A:
ndb_mgmd -f config.ini --reload --initial

B:
ndb_mgmd -f config.ini --reload --initial

Then
stop A, start A
A fails to start.

Suggested fix:
-
[16 Jun 2009 8:30] Sveta Smirnova
Thank you for the report.

Verified as described.
[1 Sep 2009 11:04] Geert Vanderkelen
(using 7.0.7)
For rolling restarts, this works for me:
  http://blog.some-abstract-type.com/2009/09/configuration-change-and-rolling.html

[ndbd(NDB)]     2 node(s)
id=3    @10.100.9.8  (mysql-5.1.35 ndb-7.0.7, Nodegroup: 0, Master)
id=4    @10.100.9.9  (mysql-5.1.35 ndb-7.0.7, Nodegroup: 0)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @10.100.9.6  (mysql-5.1.35 ndb-7.0.7)
id=2    @10.100.9.7  (mysql-5.1.35 ndb-7.0.7)

I kill the ndb_mgmd A and I start it again (BTW, no need to specify config.ini):
shell> ndb_mgmd --configdir=/data2/users/geert/cluster/master/
2009-09-01 13:01:43 [MgmSrvr] INFO     -- NDB Cluster Management Server. mysql-5.1.35 ndb-7.0.7
2009-09-01 13:01:43 [MgmSrvr] INFO     -- Loaded config from '/data2/users/geert/cluster/master//ndb_1_config.bin.2'

I kill both of them, start ndb_mgmd B and then A, and have no problems doing this on both:
 shell> ndb_mgmd --configdir=/data2/users/geert/cluster/master/
[1 Sep 2009 11:08] Geert Vanderkelen
I think the problem is more the use of --reload and --initial, which can screw up the configs, as in:
 http://bugs.mysql.com/bug.php?id=46488

In a rolling restart after a configuration change, you reload one ndb_mgmd with the new config and should not restart the other ndb_mgmd at all.
[23 Sep 2009 10:49] Magnus Blåudd
I have written a patch that allows an ndb_mgmd started with "--initial --reload" to copy the config from an already started ndb_mgmd (with a confirmed config) if the configuration files on both sides are exactly the same.

Testing...
[25 Sep 2009 7:58] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/84585
[28 Sep 2009 12:31] Magnus Blåudd
Pushed to 7.0 and 7.1
[28 Sep 2009 15:12] Jon Stephens
Documented bugfix in the NDB 7.0.8 changelog as follows:

        Now, when started with --initial --reload, ndb_mgmd tries to
        copy the configuration of an existing ndb_mgmd with a confirmed
        configuration. This works only if the configuration files used
        by both management nodes are exactly the same.
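
Since the fix depends on both management nodes using exactly the same configuration file, it may help to compare the two files before restarting with --initial --reload. The following is only a minimal sketch: same_config is a hypothetical helper, not part of MySQL Cluster, and it assumes that "exactly the same" can be approximated by a byte-for-byte comparison of two local copies of config.ini.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with MySQL Cluster): reports whether
# two copies of config.ini are byte-for-byte identical.
# Assumption: "exactly the same" is approximated by a byte comparison.
same_config() {
    # cmp -s prints nothing and exits 0 only when both files are identical
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}
```

For example, after copying B's config.ini to A under a name such as config.ini.B (the name is illustrative), running `same_config config.ini config.ini.B` on A shows whether the two files match before A is restarted.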

Closed.
[30 Sep 2009 8:14] Bugs System
Pushed into 5.1.37-ndb-7.0.9 (revid:jonas@mysql.com-20090930075942-1q6asjcp0gaeynmj) (version source revid:magnus.blaudd@sun.com-20090925104247-ozlmf4vu1f3936am) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[30 Sep 2009 8:15] Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:jonas@mysql.com-20090930080049-1c8a8cio9qgvhq35) (version source revid:jonas@mysql.com-20090925143824-3i5kcvsf8v3yf79j) (merge vers: 5.1.35-ndb-7.1.0) (pib:11)