MySQL Bugs: #28292: ndbcluster fails to initialize when my.cnf defines ndb-nodeid

Bug #28292	ndbcluster fails to initialize when my.cnf defines ndb-nodeid
Submitted:	8 May 2007 0:30	Modified:	22 Mar 2008 10:55
Reporter:	Gerry Reno
Status:	Not a Bug
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	5.1.17-beta, 5.1.22, 5.1.24-BK	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
I built and installed 5.1.17-beta from source using the following configure line:
env LD_RUN_PATH=/opt/lampp/lib LD_LIBRARY_PATH="/opt/lampp/lib" CFLAGS="-O6 \
-I/opt/lampp/include -L/opt/lampp/lib -Wl,--rpath -Wl,/opt/lampp/lib" \
./configure --prefix=/opt/lampp --enable-assembler --enable-local-infile \
--with-mysqld-user=nobody \
--with-unix-socket-path=/opt/lampp/var/mysql/mysql.sock \
--with-extra-charsets=complex --libexecdir=/opt/lampp/sbin \
--sysconfdir=/opt/lampp/etc --datadir=/opt/lampp/share \
--localstatedir=/opt/lampp/var/mysql --infodir=/opt/lampp/info \
--includedir=/opt/lampp/include --mandir=/opt/lampp/man --with-innodb --with-vio \
--with-ssl=/opt/lampp  --with-ssl-includes=/opt/lampp/include \
--with-ssl-libs=/opt/lampp/lib --with-archive-storage-engine \
--with-federated-storage-engine --with-csv-storage-engine --with-ndbcluster \
--with-bdb --enable-thread-safe-client || exit 1

When I try to start the server as an SQL Node with a particular nodeid in the my.cnf file, ndbcluster initialization fails.  Here is my.cnf:
[MYSQLD]                        
ndbcluster                           # run NDB storage engine
ndb-connectstring=192.168.1.40:1186  # location of management server
ndb-mgmd-host=192.168.1.40
ndb-nodeid=5
engine_condition_pushdown=1          # send WHERE statements to Data Nodes for evaluation
[MYSQL_CLUSTER]                 
ndb-connectstring=192.168.1.40:1186

Here is the error log output:
070507 14:55:32  mysqld started
070507 14:55:32  InnoDB: Started; log sequence number 0 43655
070507 14:55:32 [ERROR] Plugin 'ndbcluster' init function returned error.
070507 14:55:32 [ERROR] Plugin 'ndbcluster' registration as a STORAGE ENGINE failed.
070507 14:55:32 [ERROR] Failed to init plugins.
070507 14:55:32 [ERROR] Aborting

070507 14:55:32  InnoDB: Starting shutdown...
070507 14:55:34  InnoDB: Shutdown completed; log sequence number 0 43655
070507 14:55:34 [Note] /opt/lampp/sbin/mysqld: Shutdown complete

Could not initialize handle to management server: Illegal connect string : 192.168.1.40,nodeid=5,192.168.1.40,nodeid=5,192.168.1.40:1186
070507 14:55:34  mysqld ended

As soon as I remove the line:
ndb-nodeid=5
then ndbcluster initializes properly.
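
The connect string in the error above contains `nodeid=5` twice. A minimal Python sketch for spotting that in a logged connect string follows. It assumes the string is a list of tokens separated by `,` or `;` (both separators appear in this report), and the helper names are hypothetical. Whether the repeated `nodeid` token is what the server actually rejects is not confirmed here; the sketch only makes the repetition visible.

```python
import re


def split_connect_string(connect_string):
    """Split on ',' and ';' (assumed separators) and drop empty tokens."""
    return [tok.strip() for tok in re.split(r"[,;]", connect_string) if tok.strip()]


def nodeid_tokens(connect_string):
    """Return the node IDs named by nodeid=N tokens, in order of appearance."""
    ids = []
    for tok in split_connect_string(connect_string):
        match = re.fullmatch(r"nodeid=(\d+)", tok)
        if match:
            ids.append(int(match.group(1)))
    return ids


# Connect string copied from the error log above.
logged = "192.168.1.40,nodeid=5,192.168.1.40,nodeid=5,192.168.1.40:1186"
ids = nodeid_tokens(logged)
print(ids)           # [5, 5]
print(len(ids) > 1)  # True: the node ID is specified more than once
```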

How to repeat:
See description.

Suggested fix:
unknown

Can't reproduce; it works fine for me.
Are you sure that node ID 5 is configured as a [mysqld]
slot in the cluster configuration, and that it is not
already allocated by some other mysqld process or
configured for a different host?

Could you provide your config.ini file and the
hostname/IP of the failing mysqld node?

Here is mgm output:

ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @192.168.1.40  (Version: 5.1.17, Nodegroup: 0)
id=3    @192.168.1.41  (Version: 5.1.17, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.1.40  (Version: 5.1.17)

[mysqld(API)]   2 node(s)
id=4 (not connected, accepting connect from any host)
id=5    @192.168.1.25  (Version: 5.1.17)

As you can see, node ID 5 is a mysqld. (This is the output when I start the server without specifying ndb-nodeid.)
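
To check which slot a given node ID belongs to and whether it is already in use, the `show` output can be read mechanically. The sketch below is an assumption-laden Python helper (`parse_show` is a hypothetical name) that only handles the line shapes pasted above; it is not an official parser for `ndb_mgm` output.

```python
import re

# Output copied from the ndb_mgm session above.
SHOW_OUTPUT = """\
[ndbd(NDB)]     2 node(s)
id=2    @192.168.1.40  (Version: 5.1.17, Nodegroup: 0)
id=3    @192.168.1.41  (Version: 5.1.17, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.1.40  (Version: 5.1.17)

[mysqld(API)]   2 node(s)
id=4 (not connected, accepting connect from any host)
id=5    @192.168.1.25  (Version: 5.1.17)
"""


def parse_show(text):
    """Map node ID -> (node type, connected host or None)."""
    nodes = {}
    section = None
    for line in text.splitlines():
        line = line.strip()
        header = re.match(r"\[(\w+)\((\w+)\)\]", line)
        if header:
            section = header.group(2)  # NDB, MGM or API
            continue
        entry = re.match(r"id=(\d+)\s*(?:@(\S+))?", line)
        if entry and section:
            nodes[int(entry.group(1))] = (section, entry.group(2))
    return nodes


nodes = parse_show(SHOW_OUTPUT)
print(nodes[5])  # ('API', '192.168.1.25')
print(nodes[4])  # ('API', None)
```

Here node ID 5 is an API slot held by 192.168.1.25, and node ID 4 is a free API slot.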

Here is config.ini:

[TCP DEFAULT]     
portnumber=2202

#-- NDB_MGMD MANAGEMENT DAEMON:
[NDB_MGMD]                      
#hostname=192.168.1.25                      # Hostname or IP address of MGM node
#datadir=/opt/lampp/var/lib/mysql-cluster   # Directory for MGM node log files
hostname=192.168.1.40                      # Hostname or IP address of MGM node
datadir=/var/lib/mysql-cluster   # Directory for MGM node log files

#-- NDBD DATA NODE DEFAULTS:
[NDBD DEFAULT]    
NoOfReplicas=2    # Number of replicas
# GR: initial allocation: 600M out of 2GB total for MySQL data/indexes
DataMemory=500M            # How much memory to allocate for data storage  (default=80M)
IndexMemory=100M           # How much memory to allocate for index storage (default=18M)

#-- NDBD DATA NODES:
# one [NDBD] section per data node
[NDBD]                          
hostname=192.168.1.40           # Hostname or IP address
datadir=/usr/local/mysql/data   # Directory for this data node's data files

[NDBD]                          
hostname=192.168.1.41           # Hostname or IP address
datadir=/usr/local/mysql/data   # Directory for this data node's data files

#-- SQL (API) NODES:
[MYSQLD]
[MYSQLD]                        
hostname=192.168.1.25 

The ip of the failing node is 192.168.1.25.

I've now changed my test configuration to not use explicit
node IDs, but I still can't reproduce your problem; the
mysqld servers start up just fine for me.

Could you add explicit ID= lines for all nodes, too,
and see if it makes a difference in your setup?

I changed my.cnf to the following:

[MYSQLD]                        
ndbcluster                           # run NDB storage engine
ndb-connectstring=192.168.1.40:1186  # location of management server
ndb-mgmd-host=192.168.1.40
ndb-nodeid=5  #this caused the ndbcluster engine to not be loaded by the plugin - see log
engine_condition_pushdown=1          # send WHERE statements to Data Nodes for evaluation

[NDBD]
ndbd-host=192.168.1.40
ndb-nodeid=2
[NDBD]
ndbd-host=192.168.1.41
ndb-nodeid=3

[NDB_MGMD]
ndb-mgmd-host=192.168.1.40
ndb-nodeid=4

# Options for all cluster ndbd processes:
[MYSQL_CLUSTER]                 
ndb-connectstring=192.168.1.40:1186 

The server still fails to start, producing the exact same error as previously listed.
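
To compare the options set under [MYSQLD] with the tokens in the logged connect string, the NDB-related options can be listed with a short Python sketch. Python's `configparser` is only a rough stand-in for MySQL's own option-file parser, and `ndb_options` is a hypothetical helper; the sketch says nothing about how mysqld combines these options.

```python
import configparser

# The [MYSQLD] and [MYSQL_CLUSTER] parts of the my.cnf shown above.
MY_CNF = """\
[MYSQLD]
ndbcluster                           # run NDB storage engine
ndb-connectstring=192.168.1.40:1186  # location of management server
ndb-mgmd-host=192.168.1.40
ndb-nodeid=5  #this caused the ndbcluster engine to not be loaded by the plugin - see log
engine_condition_pushdown=1          # send WHERE statements to Data Nodes for evaluation

[MYSQL_CLUSTER]
ndb-connectstring=192.168.1.40:1186
"""


def ndb_options(text, section="MYSQLD"):
    """Return the options in `section` whose names start with 'ndb-'."""
    parser = configparser.ConfigParser(
        allow_no_value=True,             # bare flags such as `ndbcluster`
        inline_comment_prefixes=("#",),  # trailing comments after a value
        strict=False,                    # tolerate repeated sections like [NDBD]
        interpolation=None,
    )
    parser.read_string(text)
    return {k: v for k, v in parser[section].items() if k.startswith("ndb-")}


for name, value in ndb_options(MY_CNF).items():
    print(f"{name} = {value}")
```

For this file the sketch lists ndb-connectstring, ndb-mgmd-host and ndb-nodeid, which are the three values that show up as tokens in the logged connect string.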

History of my installation:

Original installation:  5.0.37 with ndbcluster.
Upgrade installation:   5.1.17-beta with ndbcluster.

Ran mysql_upgrade script immediately after upgrade install.

Brought up mgmd and ndbd nodes.

Attempted to start the SQL node; that is where ndbcluster initialization fails, whenever an SQL node ID is specified in my.cnf.

Hope this helps.

Please try to repeat this with a newer version, 5.1.22, and let us know the results.

Same thing happens to me:
[mysqld]
ndbcluster
ndb-connectstring=1.2.3.4;1.2.3.5
#ndb-nodeid=4

If I uncomment the ndb-nodeid line, I get this message:
Could not initialize handle to management server: Illegal connect string : nodeid=4,nodeid=4,1.2.3.4;1.2.3.5

Management node has only sections like:

[MYSQLD]
Hostname = 1.2.3.4

No node IDs, only a hostname for every entry. Configured with 2*data, 2*mgm, 2*api.

I'm running 5.1.22 rhel5 x64 packages from mysql site.

I also have this problem using version 5.1.24 from BK.  I compiled from source to run 64-bit on Mac OS X 10.5.  Same setup and config, with IDs defined in config.ini.

Sometimes I can connect, but mostly I get an error, so the problem seems intermittent.  When I get a successful connection, the data nodes acknowledge the mysql nodes right away.  When I get an error, it seems as though the data nodes cannot communicate with the mysql node.  This can be seen in the mgmt cluster logs.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

Thank you for the report.

I was able to repeat the described behavior with the following layout:

$ cat etc/ndb_mgmd.cfg
...
[MYSQLD]
NodeId = 9

[MYSQLD]

$ cat etc/ndb_node_api1.cfg
[MYSQL_CLUSTER]
ndb-connectstring=127.0.0.1:35118

# The MySQL server
[mysqld]
ndbcluster
ndb-connectstring=127.0.0.1:35118
port=35117
socket=/tmp/mysql_ndb_api1.sock
datadir=/Users/apple/Desktop/cluster_demo/data3

log-error

$ cat etc/ndb_node_api2.cfg
[MYSQL_CLUSTER]
ndb-connectstring=127.0.0.1:35118

# The MySQL server
[mysqld]
ndbcluster
ndb-connectstring=127.0.0.1:35118
port=35119
socket=/tmp/mysql_ndb_api2.sock
datadir=/Users/apple/Desktop/cluster_demo/data4
ndb-nodeid=10

log-error

In this case, the API node without a specified ndb-nodeid sometimes gets nodeid = 10, so nodeid=10 cannot be allocated to the other node, which leads to the 'ndbcluster' plugin initialisation failure. In other cases, the API node without a specified ndb-nodeid gets nodeid = 9 and the second node starts successfully. So I am closing the report as "Not a Bug".
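
The two outcomes described above can be illustrated with a toy Python model. It assumes the two [MYSQLD] slots carry IDs 9 and 10 (the second slot's ID is implied by the comment, not shown in the config), and `start_node` and `scenario` are hypothetical names. The `gets` argument simply stands in for whichever free ID the unpinned node happens to receive; it does not model how the management server really chooses.

```python
def start_node(free_ids, requested=None, gets=None):
    """Claim a slot from free_ids (a set, mutated). Return the ID, or None on failure.

    requested: an explicit ndb-nodeid, which must still be free.
    gets: the free ID an unpinned node happens to receive (toy stand-in).
    """
    if requested is not None:
        if requested in free_ids:
            free_ids.remove(requested)
            return requested
        return None  # the requested ID is already taken
    chosen = gets if gets in free_ids else min(free_ids, default=None)
    if chosen is not None:
        free_ids.remove(chosen)
    return chosen


def scenario(unpinned_gets):
    free = {9, 10}
    first = start_node(free, gets=unpinned_gets)  # api1: no ndb-nodeid set
    second = start_node(free, requested=10)       # api2: ndb-nodeid=10
    return first, second


print(scenario(9))   # (9, 10): both nodes start
print(scenario(10))  # (10, None): api2 cannot get node ID 10
```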

I'm not sure that was the problem. Please reopen if not.
This also happens when *all* nodes have an explicit ID set (also with hostnames). My full config:

--- config.ini ---
DataDir = /var/lib/mysql-cluster
NoOfReplicas = 2
DataMemory = 1024M
IndexMemory = 100M

[NDB_MGMD]
Id = 1
Hostname = A.B.C.D
[NDB_MGMD]
Id = 2
Hostname = A.B.C.E

[MYSQLD]
Id = 3
Hostname = A.B.C.F
[MYSQLD]
Id = 4
Hostname = A.B.C.G

[NDBD]
Id = 5
Hostname = A.B.C.H
[NDBD]
Id = 6
Hostname = A.B.C.I
--- ---

--- my.cnf ---
[mysqld]                        
ndbcluster
ndb-connectstring=A.B.C.D;A.B.C.E
#ndb-nodeid=3
skip-innodb
default-table-type=ndbcluster
--- ---
The second one is the same, but with 'ndb-nodeid=4'.

It works with 'ndb-nodeid' commented out, but when I uncomment it, the API node stops with the error reported before (bad connect string). It makes no difference whether the 'Hostname' lines in config.ini are commented out or not.

I believe this is a different situation from the one the previous commenter tested.

Stanislaw,

Thank you for the feedback. Do you see the problem when ndb-nodeid is set for both mysqld servers, or only for one?

Actually, I've just checked that it doesn't depend on the server or the connection at all. I tried to start an API node (mysqld --user=mysql) with a configuration like the one before, using exactly these addresses (1.2.3.4;1.2.3.5):

--->8---
[mysqld]                        
ndbcluster
ndb-connectstring=1.2.3.4;1.2.3.5
ndb-nodeid=3
skip-innodb
default-table-type=ndbcluster
--->8---

Result:
--->8---
080325 10:19:59 [Note] Plugin 'InnoDB' disabled by command line option
Could not initialize handle to management server: Illegal connect string : nodeid=3,nodeid=3,1.2.3.4;1.2.3.5
080325 10:19:59 [ERROR] Plugin 'ndbcluster' init function returned error.
080325 10:19:59 [ERROR] Plugin 'ndbcluster' registration as a STORAGE ENGINE failed.
080325 10:19:59 [ERROR] Unknown/unsupported table type: ndbcluster
080325 10:19:59 [ERROR] Aborting

080325 10:19:59 [Note] /usr/sbin/mysqld: Shutdown complete
--->8---

It failed before it tried to send any MySQL Cluster packet at all: there was no NDB communication (monitored with tcpdump). This is exactly the same result as with the previous tries in the 'real' cluster.