Bug #59213 Add node ndb_mgmd fails to parse NodeGroup=65535
Submitted: 29 Dec 2010 14:47
Modified: 21 Apr 2011 10:08
Reporter: Johan Andersson
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 7.1 bzr
OS: Linux
Assigned to: Jonas Oreland
CPU Architecture: Any
Tags: add node, ndb_mgmd, nodegroup

[29 Dec 2010 14:47] Johan Andersson
Description:
The management server fails to start with NodeGroup=65535, which prevents pre-allocating data nodes so that they can be added later without a rolling restart:

2010-12-29 14:51:48 [MgmtSrvr] ERROR    -- at line 248: Invalid nodegroup 65535 for node 5
2010-12-29 14:51:48 [MgmtSrvr] ERROR    -- Could not load configuration from '/etc/mysql/config.ini'

[NDBD]
NodeId=5
NodeGroup=65535
Hostname=xxxx

[NDBD]
NodeId=6
NodeGroup=65535
Hostname=yyyy

How to repeat:
Start a management server with the following:

[NDB_MGMD DEFAULT]
PortNumber=1186
Datadir=/data1/mysqlcluster/

[NDB_MGMD]
NodeId=1
Hostname=A

[NDBD DEFAULT]
NoOfReplicas=2
Datadir=/data1/mysqlcluster/

[NDBD]
NodeId=2
Hostname=B

[NDBD]
NodeId=3
Hostname=C

[NDBD]
NodeId=5
NodeGroup=65535
Hostname=D

[NDBD]
NodeId=6
NodeGroup=65535
Hostname=E

[MYSQLD]

[MYSQLD]

[MYSQLD]

[MYSQLD]

Suggested fix:
-

[30 Dec 2010 10:17] Sveta Smirnova
Thank you for the report.

With 65535 it fails for me, but it starts successfully with 65536. It looks like 65536 behaves the way 65535 is documented to.

[4 Feb 2011 15:39] Johan Andersson
Yes, 65536 works fine, but the real problem now is what happens if I have:

[NDBD]
Id=2
Hostname=localhost

[NDBD]
Id=3
Hostname=localhost

[NDBD]
Id=4
Hostname=localhost
NodeGroup=65536

[NDBD]
Id=5
Hostname=localhost
NodeGroup=65536

With this configuration the data nodes won't start:
2011-02-04 16:33:42 [MgmtSrvr] INFO     -- Node 2: Initial start, waiting for 4 and 5 to connect,  nodes [ all: 2, 3, 4 and 5 connected: 2 and 3 no-wait:  ]
2011-02-04 16:33:45 [MgmtSrvr] INFO     -- Node 2: Initial start, waiting for 4 and 5 to connect,  nodes [ all: 2, 3, 4 and 5 connected: 2 and 3 no-wait:  ]
2011-02-04 16:33:48 [MgmtSrvr] INFO     -- Node 2: Initial start, waiting for 4 and 5 to connect,  nodes [ all: 2, 3, 4 and 5 connected: 2 and 3 no-wait:  ]

I think it should not wait for data nodes that have NodeGroup=65536, as those nodes are not commissioned yet. I can't use --nowait-nodes, --initial-start, or anything similar, because that just complicates automation.

[15 Apr 2011 13:59] Jonas Oreland
http://lists.mysql.com/commits/135541

[17 Apr 2011 10:08] Jonas Oreland
7.0.24 and 7.1.13

[17 Apr 2011 10:10] Jonas Oreland
DOCS: see commit for details.
Also, 65535 in the documentation should be changed to the real value to use, i.e. 65536.

[21 Apr 2011 10:08] Jon Stephens
Documented fix in the NDB 7.0.24 and 7.1.13 changelogs; updated online add node procedure in docs; fixed max value of NodeGroup; added StartNoNodeGroupTimeout config param; see http://lists.mysql.com/commits/135898 for changelog entry, etc.
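
For readers landing here, a minimal sketch of how the pieces discussed in this thread might fit together in config.ini on NDB 7.0.24 / 7.1.13 or later. Node IDs and hostnames are copied from the report above. The StartNoNodeGroupTimeout value shown (15000, understood to be in milliseconds) is an illustrative assumption and is not taken from the commit; check the changelog entry for the actual default and semantics.

```ini
[NDBD DEFAULT]
NoOfReplicas=2
Datadir=/data1/mysqlcluster/
# Assumed illustrative value: how long the running data nodes wait for
# nodes configured with NodeGroup=65536 before starting without them.
StartNoNodeGroupTimeout=15000

# Active data nodes (node group assigned automatically)
[NDBD]
NodeId=2
Hostname=B

[NDBD]
NodeId=3
Hostname=C

# Pre-allocated data nodes, not yet commissioned.
# Use 65536 here; 65535 is rejected by ndb_mgmd, as shown in this report.
[NDBD]
NodeId=5
NodeGroup=65536
Hostname=D

[NDBD]
NodeId=6
NodeGroup=65536
Hostname=E
```

With a layout like this, the expectation is that nodes 2 and 3 no longer wait indefinitely for nodes 5 and 6 on initial start, so --nowait-nodes should not be needed for automation.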