Bug #12580 Automatically assigned ServerPorts were wrong
Submitted: 15 Aug 2005 12:07 Modified: 26 Mar 2007 12:44
Reporter: Kai Voigt
Status: Won't fix
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:4.1.13-max OS:Linux (Linux 64bit)
Assigned to: Stewart Smith CPU Architecture:Any

[15 Aug 2005 12:07] Kai Voigt
Description:
Cluster setup with 3 dual-CPU machines and 2 data nodes on each machine, i.e. 6 data nodes in total.
Each machine was connected to the other machines by crossover cables. All machines were also connected via a third interface to the LAN to which the management node was attached.

The config.ini was straightforward to create. ndb_mgmd started happily, but only 5 of the 6 data nodes connected to the management node. The 6th data node complained about a duplicate ServerPort 2202.

How to repeat:
Set up a topology like the one above, use the following config.ini, and remove the manual ServerPort lines. With those lines, all 6 data nodes can connect. Without them, only 5 data nodes make it into the cluster.

[ndbd default]
datadir=/usr/local/mysql/cluster
DataMemory=3000M
IndexMemory=512M
RedoBuffer=600M
MaxNoOfAttributes=5000
MaxNoOfTables=500
MaxNoOfOrderedIndexes=2000
MaxNoOfUniqueHashIndexes=1000
NoOfReplicas=2

[ndb_mgmd]
Id=31
datadir=/usr/local/mysql/cluster
Hostname=10.131.110.9

[ndbd]
Id=1
Hostname=10.131.110.1
ServerPort=2202                                                                 

[ndbd]
Id=2
Hostname=10.131.110.2
ServerPort=2202

[ndbd]
Id=3
Hostname=10.131.110.3
ServerPort=2202

[ndbd]
Id=4
Hostname=10.131.110.1
ServerPort=2203

[ndbd]
Id=5
Hostname=10.131.110.2
ServerPort=2203
                                                                                

[ndbd]
Id=6
Hostname=10.131.110.3
ServerPort=2203

[mysqld]
[mysqld]
[mysqld]
[mysqld]

# red connections

[tcp]
NodeId1=1
NodeId2=3
Hostname1=192.168.98.1
Hostname2=192.168.98.3                                                          

[tcp]
NodeId1=1
NodeId2=6
Hostname1=192.168.98.1
Hostname2=192.168.98.3

[tcp]
NodeId1=4
NodeId2=3
Hostname1=192.168.98.1
Hostname2=192.168.98.3

[tcp]
NodeId1=4
NodeId2=6
Hostname1=192.168.98.1
Hostname2=192.168.98.3

# green connections
                                                                                
[tcp]
NodeId1=1
NodeId2=2
Hostname1=192.168.98.1
Hostname2=192.168.98.2

[tcp]
NodeId1=1
NodeId2=5
Hostname1=192.168.98.1
Hostname2=192.168.98.2

[tcp]
NodeId1=4
NodeId2=2
Hostname1=192.168.98.1
Hostname2=192.168.98.2                                                          

[tcp]
NodeId1=5
NodeId2=3
Hostname1=192.168.98.2
Hostname2=192.168.98.3

[tcp]
NodeId1=5
NodeId2=6
Hostname1=192.168.98.2
Hostname2=192.168.98.3

# internal connections

[tcp]
NodeId1=1
NodeId2=4
Hostname1=192.168.98.1
Hostname2=192.168.98.1
                                                                                
[tcp]
NodeId1=2
NodeId2=5
Hostname1=192.168.98.2
Hostname2=192.168.98.2

[tcp]
NodeId1=3
NodeId2=6
Hostname1=192.168.98.3
Hostname2=192.168.98.3                                                          

Suggested fix:
No idea. Maybe fix something in the code that computes the ServerPort values.
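A minimal sketch, not taken from the MySQL source, of how a config.ini like the one above could be checked for two [ndbd] sections that share the same Hostname and ServerPort. The function names `parse_sections` and `find_port_conflicts` are hypothetical, and the check covers only explicitly written ServerPort values; it does not reproduce the automatic assignment done by the cluster.

```python
from collections import defaultdict


def parse_sections(text):
    """Parse config.ini-style text into a list of (section_name, options).

    Section names may repeat (several [ndbd] blocks), so a list is used
    rather than a dict keyed by section name.
    """
    sections = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = (line[1:-1].strip().lower(), {})
            sections.append(current)
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[1][key.strip().lower()] = value.strip()
    return sections


def find_port_conflicts(text):
    """Return {(hostname, serverport): [node ids]} for pairs used more than once."""
    users = defaultdict(list)
    for name, opts in parse_sections(text):
        # "[ndbd default]" parses as "ndbd default" and is skipped here.
        if name == "ndbd" and "hostname" in opts and "serverport" in opts:
            users[(opts["hostname"], opts["serverport"])].append(opts.get("id"))
    return {pair: ids for pair, ids in users.items() if len(ids) > 1}
```

Run against the config.ini in this report, the check should find no conflicts, since each host uses 2202 for one data node and 2203 for the other.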
[24 Nov 2005 15:26] Hartmut Holzgraefe
Looks as if the automatic assignment doesn't work with multiple ndbd processes
on each host when the connections between the ndbd processes are specified
in [tcp] sections?
[24 Nov 2005 15:28] Hartmut Holzgraefe
It doesn't seem to be limited to 3-machine setups; the same effect was observed
on a 2-machine setup with 2 ndbd processes per machine and [tcp] specifications
with 5.0.15.
[1 Dec 2005 7:54] Stewart Smith
The workaround is to remove the specification of ServerPort. Then everything should work.

(Realised that I hadn't mentioned this in the bug report, just on IRC :)
[26 Jan 2006 9:55] Tomas Ulin
workaround exists
[21 Jun 2006 11:40] Stewart Smith
For 5.0+, yes.

For 4.1 you have to specify the port manually for each connection.

I don't think this is worth fixing.
[9 Nov 2006 11:11] Stewart Smith
In 5.0 and above, without the port specified, we get the operating system to allocate free port numbers. This always yields correct results. Specifying port numbers for transporters is deprecated.

(Discussed setting this to won't fix with Martin; he agrees.)
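As an illustration of the general technique of letting the operating system pick a free port, here is a minimal Python sketch. It is not the NDB transporter code; it only shows the standard socket behaviour where binding to port 0 asks the OS for an unused port. The helper name `allocate_free_port` is hypothetical.

```python
import socket


def allocate_free_port(host="127.0.0.1"):
    """Bind a listening TCP socket to port 0 and return (socket, port).

    Port 0 tells the OS to choose any currently unused port, so two
    listeners on the same host cannot be handed the same number.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    sock.listen(1)
    return sock, sock.getsockname()[1]
```

The trade-off raised later in this thread follows directly: the port is only known after the bind, so it cannot be written into a firewall rule in advance.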
[24 Nov 2006 13:58] Geert Vanderkelen
Deprecation of the PortNumber is not an option.

This is a very useful option that should be available whatever other solution is in place. It should be possible to tell what port a node is listening on. This is important for firewalling.
[26 Mar 2007 12:44] Stewart Smith
So this is what I think we should do:
- deprecate setting the base port number
- point out that firewall between cluster nodes is silly (private network and all) and all it does is introduce
- if port numbers need to be known, they can be set statically per connection

I vote we just document the 2nd option above.

I don't propose fixing this rather obscure case for these reasons:
a) backwards compatibility (and online upgrade) testing is huge
b) not possible to test with current autotest framework (and forget about mysql-test-run.pl). In fact... not even possible with ndb_mgm_set_configuration() that's part of the add node patches.