Bug #37607 MySQL Cluster (mysql-5.1.23 ndb-6.2.15) Not Starting on CentOS 5
Submitted: 24 Jun 2008 13:21
Modified: 23 Nov 2008 3:57
Reporter: Bhupinder Singh
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S1 (Critical)
Version: mysql-5.1.23 ndb-6.2.15
OS: Other (CentOS release 5 (Final))
Assigned to: Assigned Account
CPU Architecture: Any
Tags: CentOS 5, cluster, hang, ndb

[24 Jun 2008 13:21] Bhupinder Singh
Description:
Hi,

I have configured a cluster with the following: two nodes with SQL + NDB, and one node with ndb_mgmd.

Here are the config files:
************************************
Management Node:
----------------
mysql@HVSDB:~/5.1.23/mgmt_data> cat mgmt_config.ini
# Example Ndbcluster storage engine config file.
#
[ndbd default]
NoOfReplicas= 2
DataMemory= 80M
IndexMemory= 24M
TimeBetweenWatchDogCheck= 30000
DataDir= /opt/mysql/5.1.23/data
MaxNoOfOrderedIndexes= 512

[ndb_mgmd default]
DataDir= /opt/mysql/5.1.23/mgmt_data

[ndb_mgmd]
Id=1
HostName= 172.16.15.212

[ndbd]
Id= 2
HostName= 172.16.15.110

[ndbd]
Id= 3
HostName= 172.16.15.111

[mysqld]
Id= 4

[mysqld]
Id= 5

# choose an unused port number
# in this configuration 63132, 63133, and 63134
# will be used
[tcp default]
PortNumber= 63132

******************************************

Data and SQL Node
-----------------

[mysqld]
ndbcluster
ndb-connectstring=172.16.15.212

[mysql_cluster]
ndb-connectstring=172.16.15.212

[client]
port=3306
socket=/tmp/mysql.sock

[mysqld]
port=3306
socket=/tmp/mysql.sock
datadir=/opt/mysql/5.1.23/data

*********************************

I start the cluster as follows:
1. Management Node
2. NDBD on Node 1, NDBD on Node 2
3. SQL on Node 1, SQL on Node 2
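
The startup order above can be written out as a dry-run sketch. The binary names are the standard MySQL Cluster tools, but the paths and flags are assumptions based on the config files in this report; the script only prints the commands rather than executing them.

```shell
# Dry-run sketch of the startup order above. Paths and flags are assumptions
# taken from the config files in this report; commands are printed, not run.
run() { echo "would run: $*"; }

# 1. Management node (172.16.15.212):
run ndb_mgmd -f /opt/mysql/5.1.23/mgmt_data/mgmt_config.ini
# 2. Each data node (.110 and .111); the very first start of a node
#    would typically use --initial:
run ndbd
# 3. Each SQL node:
run mysqld_safe
```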

The following is the output from ndb_mgm:

ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @172.16.15.110  (mysql-5.1.23 ndb-6.2.15, starting, Nodegroup: 0)
id=3    @172.16.15.111  (mysql-5.1.23 ndb-6.2.15, starting, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1   (mysql-5.1.23 ndb-6.2.15)

[mysqld(API)]   2 node(s)
id=4 (not connected, accepting connect from any host)
id=5 (not connected, accepting connect from any host)

The cluster stays in this state indefinitely.

The NDB log file on each data node reads like this:
-------------------------------------------
[root@HVSDB1 data]# cat ndb_2_out.log
2008-06-24 18:28:16 [ndbd] INFO     -- Angel pid: 6306 ndb pid: 6307
2008-06-24 18:28:16 [ndbd] INFO     -- NDB Cluster -- DB node 2
2008-06-24 18:28:16 [ndbd] INFO     -- mysql-5.1.23 ndb-6.2.15 --
2008-06-24 18:28:16 [ndbd] INFO     -- Configuration fetched at 172.16.15.212 port 1186
2008-06-24 18:28:16 [ndbd] INFO     -- Start initiated (mysql-5.1.23 ndb-6.2.15)
2008-06-24 18:28:16 [ndbd] INFO     -- Ndbd_mem_manager::init(1) min: 80Mb initial: 100Mb
Adding 100Mb to ZONE_LO (1,3199)
WOPool::init(61, 9)
RWPool::init(22, 13)
RWPool::init(42, 18)
RWPool::init(62, 13)
RWPool::init(c2, 18)
RWPool::init(e2, 15)
WOPool::init(41, 8)
RWPool::init(82, 12)
RWPool::init(a2, 51)
WOPool::init(21, 6)

------------------------------------

The Management Node log file reads like this

-----------------------

mysql@HVSDB:~/5.1.23/mgmt_data> tail -50 ndb_1_cluster.log
2008-06-24 18:43:57 [MgmSrvr] INFO     -- Node 3: Initial start, waiting for 0000000000000004 to connect,  nodes [ all: 000000000000000c connected: 0000000000000008 no-wait: 0000000000000000 ]
2008-06-24 18:43:59 [MgmSrvr] INFO     -- Node 2: Initial start, waiting for 0000000000000008 to connect,  nodes [ all: 000000000000000c connected: 0000000000000004 no-wait: 0000000000000000 ]
2008-06-24 18:44:00 [MgmSrvr] INFO     -- Node 3: Initial start, waiting for 0000000000000004 to connect,  nodes [ all: 000000000000000c connected: 0000000000000008 no-wait: 0000000000000000 ]
2008-06-24 18:44:02 [MgmSrvr] INFO     -- Node 2: Initial start, waiting for 0000000000000008 to connect,  nodes [ all: 000000000000000c connected: 0000000000000004 no-wait: 0000000000000000 ]
2008-06-24 18:44:03 [MgmSrvr] INFO     -- Node 3: Initial start, waiting for 0000000000000004 to connect,  nodes [ all: 000000000000000c connected: 0000000000000008 no-wait: 0000000000000000 ]
2008-06-24 18:44:05 [MgmSrvr] INFO     -- Node 2: Initial start, waiting for 0000000000000008 to connect,  nodes [ all: 000000000000000c connected: 0000000000000004 no-wait: 0000000000000000 ]
2008-06-24 18:44:06 [MgmSrvr] INFO     -- Node 3: Initial start, waiting for 0000000000000004 to connect,  nodes [ all: 000000000000000c connected: 0000000000000008 no-wait: 0000000000000000 ]
2008-06-24 18:44:08 [MgmSrvr] INFO     -- Node 2: Initial start, waiting for 0000000000000008 to connect,  nodes [ all: 000000000000000c connected: 0000000000000004 no-wait: 0000000000000000 ]

------------------------------------
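The hex fields in the log lines above appear to be node-ID bitmasks: bit n set means node n is in the set. Reading them that way matches the log — `all: 0c` is nodes 2 and 3 (the two ndbd nodes), node 2 is waiting for `08` (node 3), and node 3 is waiting for `04` (node 2), i.e. each data node can reach the management server but not the other data node. A small sketch decoding the masks (the helper name is mine):

```shell
# Decode an NDB node bitmask: bit n set => node id n is in the set.
# e.g. 0x0c = nodes 2 and 3; 0x08 = node 3; 0x04 = node 2.
decode_mask() {
  local mask id out=""
  mask=$(( 16#$1 ))
  for id in $(seq 0 63); do
    if (( (mask >> id) & 1 )); then out="$out $id"; fi
  done
  echo "nodes:$out"
}

decode_mask 000000000000000c   # -> nodes: 2 3
decode_mask 0000000000000008   # -> nodes: 3
```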

Could someone please look into this and let me know? I would like to use CentOS 5 as the OS since the rest of my platform runs on it. The same configuration, tested on CentOS 4.6, runs fine.

Thanks and regards
Bhupinder

How to repeat:
Use the config files and startup order given in the Description above.
[15 Jul 2008 18:36] Kevin Hebert
I am having the same problem; the documentation is poor and needs to cover problem cases better. I can start ndb_mgmd without error, but when I try to run mysql.server start on the clients I get an error:

Starting MySQL/etc/rc.d/init.d/mysql.server: line 159: kill: (process ID) - No such process

Has anyone had any luck getting MySQL Cluster working on CentOS 5.2 with MySQL 5.0.51a-community?

This is incredibly frustrating; I can't believe the dearth of documentation and step-by-step instructions on this. The instructions that DO exist ask you to wget pages that return 404s! Kind of a deal-breaker there; HINT: you can't wget a page that doesn't exist. I think CentOS/Red Hat Enterprise is a large enough distribution that an updated set of step-by-step instructions, tested on CentOS 5, should be developed and put up on mysql.com. Again, the current page:

http://dev.mysql.com/tech-resources/articles/mysql-cluster-for-two-servers.html

is for MySQL 4; it would be great to see a page for MySQL 5 with actual working instructions. I will post here if I can solve this problem, but honestly this stuff should just "work": set up the config files and go. Does ANYONE have suggestions on this? Thanks!!!

-- Kev
[16 Jul 2008 19:20] Kevin Hebert
Well, now I have it going under Ubuntu 8.04. I had this:

ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)]	2 node(s)
id=2	@192.168.1.152  (mysql-5.1.23 ndb-6.2.15, Nodegroup: 0, Master)
id=3	@192.168.1.153  (mysql-5.1.23 ndb-6.2.15, Nodegroup: 0)

[ndb_mgmd(MGM)]	1 node(s)
id=1	@192.168.1.151  (mysql-5.1.23 ndb-6.2.15)

[mysqld(API)]	2 node(s)
id=4	@192.168.1.153  (mysql-5.1.23 ndb-6.2.15)
id=5	@192.168.1.152  (mysql-5.1.23 ndb-6.2.15)

I was so happy. Then I ran a query on 153 and 152 didn't pick it up. This isn't working for CRAP! This is how MySQL Cluster should work:

1. Download source or RPM
2. Install source or RPM
3. Edit configuration file on Management Server to identify cluster servers
4. Edit configuration file on Cluster Servers to identify management server
5. Start management processes on Management Server
6. Start MySQL Cluster on cluster servers

That's it! It should be a six-step process, and the folks at MySQL ought to make it JUST THAT SIMPLE. What do you all think? Any chance of this happening?
[18 Jul 2008 18:43] Kevin Hebert
OK, I got this working under Fedora Core 9. I have a feeling it may also work on other distributions. I would like to see the following added to Alex Davies' tutorial:

#1
STAGE 3: Configure the storage/SQL servers and start mysql

The tutorial does not make clear that, when editing my.cnf, you need to keep whatever is already in the original my.cnf. Here is a my.cnf that worked for my two clustered DB servers:

[mysqld]
ndbcluster
# the IP of the MANAGEMENT (THIRD) SERVER
ndb-connectstring=192.168.1.109
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
old_passwords=1

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

[mysql_cluster]
# the IP of the MANAGEMENT (THIRD) SERVER
ndb-connectstring=192.168.1.109

I cannot say whether this will work for everyone, but, it worked for me.

#2

In order for me to make this work, I had to start mysqld on both servers:

su mysql
/usr/local/mysql/bin/mysqld &

And THEN issue the command
/etc/rc.d/init.d/mysql.server start

Frankly I would like to know if I did something wrong, or if it is the tutorial that needs to be updated. Wow.
[18 Jul 2008 18:50] Kevin Hebert
USELESS!

OK, I did a mysqldump of a MySQL database (the one I want to use on the CLUSTER?!?!) and did a
mysql <dbname> < db.sql
on server #1. Well, guess what! IT DID NOT APPEAR ON SERVER #2! What is this CRAP? I am going to email Alex Davies.
[18 Jul 2008 19:46] Kevin Hebert
Ha, a lot of ups and downs working on this problem. To share: if you get to this point and are stuck like I was, here is what you have to do:

1. Issue a CREATE DATABASE command on the MySQL console on both cluster servers. This should be documented! Or, more to the point, it should just propagate, but doing it on both servers is the way it is done now.

2. Do a search/replace changing MyISAM to NDBCLUSTER. The CREATE TABLE commands in your SQL have to have ENGINE=NDBCLUSTER or it's not going to work. This should also be more clearly documented!

Now the only problem I have is that I ran out of memory. BUT, that is my problem. Anyway, this was a tough one to install, and I do wish there were a more streamlined way.
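
Step 2 above (rewriting the dump so tables use the NDB engine) can be sketched like this. The `db.sql` file name comes from the earlier comment; the stand-in dump line is illustrative, and real dumps may vary in case or spacing, so check the result before loading it.

```shell
# Sketch of step 2: rewrite a mysqldump so tables use the NDB engine.
# A one-line stand-in for the dump file; a real dump would come from mysqldump.
printf 'CREATE TABLE t (a INT) ENGINE=MyISAM;\n' > db.sql

# Replace the engine in place, keeping a .bak backup of the original.
sed -i.bak 's/ENGINE=MyISAM/ENGINE=NDBCLUSTER/g' db.sql

grep ENGINE db.sql   # -> CREATE TABLE t (a INT) ENGINE=NDBCLUSTER;
```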
[23 Oct 2008 3:53] Martin Skold
Databases are not automatically discovered on other mysql servers connected
to the cluster, the databases have to be created on all. This will be fixed
in future versions. Tables do not appear until they are queried or a "show
tables;" will make them appear.
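
Following that explanation, the per-node step can be sketched as a SQL file to run on each SQL node. The database name `clusterdb` is illustrative, and the final mysql command (left commented out) requires a running server:

```shell
# The same database must exist on every SQL node, and NDB tables created on
# one node only appear on another after they are first referenced there.
# "clusterdb" is an illustrative name.
cat > create_on_each_node.sql <<'EOF'
CREATE DATABASE IF NOT EXISTS clusterdb;
USE clusterdb;
SHOW TABLES;  -- first reference makes NDB tables created elsewhere appear
EOF

# mysql < create_on_each_node.sql   # run this on each SQL node
```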
[23 Oct 2008 3:57] Martin Skold
Is this bug still a problem? Is the issue on CentOS resolved?
[24 Nov 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".