Bug #64158 No free node id found for mysqld(API).
Submitted: 29 Jan 2012 19:28 Modified: 30 Aug 2016 23:08
Reporter: Michael Prokopiv Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:MySQL-Cluster-gpl-management-7.1.18-1.el OS:Linux (Oracle Linux Release 6 Update 2 Boot iso image for x86_64 (64 bit))
Assigned to: Bogdan Kecman CPU Architecture:Any
Tags: already allocated, free node id

[29 Jan 2012 19:28] Michael Prokopiv
Description:
Can't connect SQL node to the MGM node, 
ndb_mgm> show 
Cluster Configuration 
--------------------- 
[ndbd(NDB)] 2 node(s) 
id=2 (not connected, accepting connect from 192.168.92.31) 
id=3 (not connected, accepting connect from 192.168.92.32) 

[ndb_mgmd(MGM)] 1 node(s) 
id=1 @192.168.92.11 (mysql-5.1.56 ndb-7.1.18) 

[mysqld(API)] 2 node(s) 
id=4 (not connected, accepting connect from 192.168.92.21) 
id=5 (not connected, accepting connect from any host) 

at MGM node getting error every ~5 seconds: 
[MgmtSrvr] WARNING -- Failed to allocate nodeid for API at 192.168.92.21. Returned eror: 'No free node id found for mysqld(API).' 

"--verbose" key for ndb_mgmd gave some more details into log file: 
2012-01-24 01:22:50 [MgmtSrvr] DEBUG -- Trying to allocate nodeid for 192.168.92.21(nodeid: 0, type: API) 
2012-01-24 01:22:50 [MgmtSrvr] DEBUG -- [0]: 4, '192.168.92.21', 1 
2012-01-24 01:22:50 [MgmtSrvr] DEBUG -- [1]: 5, '', 0 
2012-01-24 01:22:50 [MgmtSrvr] DEBUG -- Retrying allocation of nodeid 4... 
2012-01-24 01:22:53 [MgmtSrvr] DEBUG -- Retrying allocation of nodeid 4... - Repeated 23 times 
2012-01-24 01:22:53 [MgmtSrvr] DEBUG -- Retrying allocation of nodeid 4... - Repeated 5 times 
2012-01-24 01:22:53 [MgmtSrvr] DEBUG -- Failed to allocate nodeid 4 for API node in cluster (retried during 3074 milliseconds) 
2012-01-24 01:22:53 [MgmtSrvr] DEBUG -- Releasing local reservation for nodeid 4 
2012-01-24 01:22:53 [MgmtSrvr] WARNING -- Failed to allocate nodeid for API at 192.168.92.21. Returned eror: 'No free node id found for mysqld(API).'

OS: Oracle Linux Release 6 Update 2 Boot iso image for x86_64 (64 bit) 
MGM installed: 
MySQL-Cluster-gpl-extra-7.1.18-1.el6.x86_64.rpm 
MySQL-Cluster-gpl-management-7.1.18-1.el6.x86_64.rpm 
MySQL-Cluster-gpl-tools-7.1.18-1.el6.x86_64.rpm 

file {config.ini} 
[ndbd default] 
NoOfReplicas=2 
DataMemory=80M 
IndexMemory=18M 

[tcp default] 
portnumber=2202 

[ndb_mgmd] 
HostName=192.168.92.11 
datadir=/var/lib/mysql 

[ndbd] 
HostName=192.168.92.31 
datadir=/var/lib/mysql/data 

[ndbd] 
HostName=192.168.92.32 
datadir=/var/lib/mysql/data 

[mysqld] 
HostName=192.168.92.21 

[mysqld] 

API node installed: 
MySQL-Cluster-gpl-client-7.1.18-1.el6.x86_64.rpm 
MySQL-Cluster-gpl-server-7.1.18-1.el6.x86_64.rpm 

{my.cnf} 
[mysqld] 
ndbcluster 
ndb-connectstring=192.168.92.11 
datadir=/var/lib/mysql 
socket=/var/lib/mysql/mysql.sock 
user=mysql 
# Disabling symbolic-links is recommended to prevent assorted security risks 
#symbolic-links=0 

[mysqld_safe] 
log-error=/var/log/mysqld.log 
pid-file=/var/run/mysqld/mysqld.pid 

[mysql cluster] 
ndb-connectstring=192.168.92.11 

How to repeat:
Just connect two servers and apply configuration files from description.
[30 Jan 2012 18:03] Jonas Oreland
you need to start your data-nodes.
the node id allocation is performed there...

/Jonas
[31 Jan 2012 8:40] Michael Prokopiv
The real problem is in the difference between starting server from command line ("mysqld_safe&") and as system service ("/etc/init.d/mysql start"): 
http://forums.mysql.com/read.php?25,295364,295404#msg-295404
[31 Jan 2012 8:40] Michael Prokopiv
SQL node init file

Attachment: mysql (text/plain), 11.95 KiB.

[1 Feb 2012 20:48] Sveta Smirnova
Thank you for the feedback.

Where is located that cnf file which does not read mysqld server started using mysql script?
[2 Feb 2012 18:00] Michael Prokopiv
/etc/my.cnf
[2 Feb 2012 18:04] Michael Prokopiv
Compare between /etc/init.d/mysqld files:
--- mysql.orig    2012-02-02 01:22:15.405952008 +0400
+++ mysql    2012-02-02 01:24:34.856156547 +0400
@@ -106,7 +106,7 @@
 
 mode=$1    # start or stop
 shift
-other_args="$*"   # uncommon, but needed when called from an RPM upgrade action
+other_args="--ndb-connectstring=XXX $*"   # uncommon, but needed when called from an RPM upgrade action
            # Expected: "--skip-networking --skip-grant-tables"
            # They are not checked here, intentionally, as it is the resposibility
            # of the "spec" file author to give correct arguments only.

Taken from
http://sqlinfo.ru/forum/viewtopic.php?pid=30030#p30030
[3 Feb 2012 14:51] Michael Prokopiv
Applying changes in the /etc/init.d/mysqld doesn't cover the situation. :(
[14 Feb 2012 16:50] Sveta Smirnova
Thank you for the feedback.

I can not repeat described behavior: start script successfully reads ndb-related options in my environment. Have you tried to start mysqld without this script? Did it solve the problem? Please also send us full mysqld error log file.
[20 Feb 2012 8:09] Michael Prokopiv
mysqld start

Attachment: AN01.mysqldomain.err (text/plain), 1.03 KiB.

[20 Feb 2012 8:09] Michael Prokopiv
mysqld start log

Attachment: AN01.mysqldomain.err (text/plain), 1.03 KiB.

[20 Feb 2012 8:39] Michael Prokopiv
MgmtSrvr log

Attachment: ndb_1_cluster.log (application/octet-stream, text), 30.09 KiB.

[20 Feb 2012 17:56] Sveta Smirnova
Thank you for the feedback.

Error log contains "Tables not available after 30 seconds.  Consider increasing --ndb-wait-setup value" This means mysqld is connecting to Cluster, so this is not startup script or option handling issue. Try increasing --ndb-wait-setup value and check network between mysqld and Cluster.
[21 Feb 2012 7:21] Michael Prokopiv
Svetlana, thank you for your advice, but this is not helpful in current situation.
Situation is very simple:
- calling "/var/init.d/mysql start" gives us nothing, except open connection
- calling "mysqld_safe --ndb-connectstring=192.168.92.11&" gives us immediately connection and registration
So, why do you decide, that problem is in timeout of 30 seconds?

There can be some problems with security, configuration and something else, but really not in timeouts. 
In my opinion real mistake happens at SQLnode side, during initialization inside scripts. So, it'll be great to look at - which parameters are sending to the MGMnode in both cases. Do you know any possibility to debug it?
[22 Feb 2012 8:13] Michael Prokopiv
Situation exactly repeated with MySQLcluster 7.2.4. :(
[29 Feb 2012 14:13] Yoshiaki Tajika
I'm also facing with the same problem. I tried two versions to get same result.
mysql-cluster-gpl-7.0.29-linux-x86_64-glibc23.tar.gz
mysql-cluster-gpl-7.2.4-linux2.6-x86_64.tar.gz

I checked ndb_mgmd's log with verbose mode.

2012-02-29 23:05:44 [MgmtSrvr] DEBUG    -- 192.168.39.1:49375: Connected!
2012-02-29 23:05:44 [MgmtSrvr] DEBUG    -- Trying to allocate nodeid for 192.168.39.1(nodeid: 0, type: API)
2012-02-29 23:05:44 [MgmtSrvr] DEBUG    --  [0]: 31, '192.168.39.1', 1
2012-02-29 23:05:44 [MgmtSrvr] DEBUG    --  [1]: 32, '192.168.39.1', 1
2012-02-29 23:05:44 [MgmtSrvr] DEBUG    --  [2]: 33, '192.168.39.1', 1
2012-02-29 23:05:44 [MgmtSrvr] DEBUG    -- Retrying allocation of nodeid 31...
2012-02-29 23:05:45 [MgmtSrvr] DEBUG    -- Retrying allocation of nodeid 31... - Repeated 8 times
2012-02-29 23:05:45 [MgmtSrvr] INFO     -- Node 26: Initial start, waiting for 25 to connect,  nodes [ all: 25 and 26 connected: 26 no-wait:  ]
2012-02-29 23:05:45 [MgmtSrvr] DEBUG    -- Retrying allocation of nodeid 31...
2012-02-29 23:05:45 [MgmtSrvr] DEBUG    -- Retrying allocation of nodeid 31... - Repeated 3 times
2012-02-29 23:05:46 [MgmtSrvr] INFO     -- Node 25: Initial start, waiting for 26 to connect,  nodes [ all: 25 and 26 connected: 25 no-wait:  ]
2012-02-29 23:05:46 [MgmtSrvr] DEBUG    -- Retrying allocation of nodeid 31...
2012-02-29 23:05:47 [MgmtSrvr] DEBUG    -- Retrying allocation of nodeid 31... - Repeated 17 times
2012-02-29 23:05:47 [MgmtSrvr] DEBUG    -- Failed to allocate nodeid 31 for API node in cluster (retried during 3100 milliseconds)
2012-02-29 23:05:47 [MgmtSrvr] DEBUG    -- Releasing local reservation for nodeid 31
2012-02-29 23:05:47 [MgmtSrvr] WARNING  -- Failed to allocate nodeid for API at 192.168.39.1. Returned eror: 'No free node id found for mysqld(API).'

There are numbers [0] to [2], which are slots, 
as I put three [api] entries in config.ini as a try.

Node 25 and 26 are Data nodes, so we may ignore these lines.

I searched the source code for these debug messages.
This log shows that MgmtSrvr::try_alloc() called alloc_node_id_req() 
about 30 times with 100ms of interval, and finally timeout occurred. 
So next step is why alloc_node_id_req() continued to fail.
But alloc_node_id_req() is too complicated for me to understand. 
Developers' help is needed.
[1 Mar 2012 0:35] Yoshiaki Tajika
Additional info about Yoshiaki Tajika's environment.

192.168.39.1  MGM and SQL1
192.168.39.5  DATA1
192.168.39.6  DATA2

# cat config.ini
[ndb_mgmd]
NodeId: 1
hostname: 192.168.39.1
datadir: /usr/local/mysql/data

[ndbd default]
NoOfReplicas: 2
DataDir: /usr/local/mysql/data
DataMemory: 300M
IndexMemory: 100M

[ndbd]
NodeId: 25
hostname: 192.168.39.5

[ndbd]
NodeId: 26
hostname: 192.168.39.6

[api]
NodeId: 31
hostname: 192.168.39.1

[api]
NodeId: 32
hostname: 192.168.39.1

[api]
NodeId: 33
hostname: 192.168.39.1

# cat /etc/my.cnf
[mysqld]
ndbcluster
ndb_connectstring = 192.168.39.1
datadir=/usr/local/mysql/data
[1 Mar 2012 5:02] Yoshiaki Tajika
I figured out my problem. Data nodes have not yet started completely.

ndb_mgm> all status
Node 25: starting (Last completed phase 0) (mysql-5.5.19 ndb-7.2.4)
Node 26: starting (Last completed phase 0) (mysql-5.5.19 ndb-7.2.4)
Node 27: starting (Last completed phase 0) (mysql-5.5.19 ndb-7.2.4)
Node 28: starting (Last completed phase 0) (mysql-5.5.19 ndb-7.2.4)

I checked DATA node's log, and found that phase 0 completed,
but phase 1 had not started. 

# cat ndb_25_out.log
 : (snipped)
2012-03-01 12:59:52 [ndbd] INFO     -- Start phase 0 completed

Data nodes could certainly communicate with MGM, and MGM recognized them. 
But Data nodes could not communicate with each other. The cause was firewall.
When I temporarily disabled every DATA node's firewall,
everything started to work.

ndb_mgm> all status
Node 25: started (mysql-5.5.19 ndb-7.2.4)
Node 26: started (mysql-5.5.19 ndb-7.2.4)
Node 27: started (mysql-5.5.19 ndb-7.2.4)
Node 28: started (mysql-5.5.19 ndb-7.2.4)

ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=25   @192.168.39.5  (mysql-5.5.19 ndb-7.2.4, Nodegroup: 0, Master)
id=26   @192.168.39.6  (mysql-5.5.19 ndb-7.2.4, Nodegroup: 0)
id=27   @192.168.39.7  (mysql-5.5.19 ndb-7.2.4, Nodegroup: 1)
id=28   @192.168.39.8  (mysql-5.5.19 ndb-7.2.4, Nodegroup: 1)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.39.1  (mysql-5.5.19 ndb-7.2.4)

[mysqld(API)]   3 node(s)
id=31   @192.168.39.1  (mysql-5.5.19 ndb-7.2.4)
id=32 (not connected, accepting connect from 192.168.39.1)
id=33 (not connected, accepting connect from 192.168.39.1)

Michael Prokopiv, how is your case?
[1 Mar 2012 10:54] Michael Prokopiv
Hello!
At my enviroment firewall accepts all connections inside cluster net.
And in my situation, mysqld starting as service can't register at MGM server (connection established), but the same mysqld starting manually (mysqld_safe) connecting succesfully.
Michael
[22 Aug 2012 9:11] Joffrey MICHAIE
On 7.2.7 :  storage/ndb/src/mgmsrv/MgmtSrvr.cpp : 3752 
                          "Returned eror: '%s'",                                                                                                                                       

Seems incorrect  (one R missing at error ...)
Not sure if I need to open a new bug for this

BR,
Joffrey
[23 Aug 2012 17:49] Sveta Smirnova
Joffrey,

yes, please open separate bug report. This typo is not related to current one.
[30 Aug 2016 23:08] Bogdan Kecman
7.2.25, 7.4.12 - can't reproduce problem, mysqld starts properly