Bug #32311 API not connecting to ndb_mgm
Submitted: 13 Nov 2007 11:49
Modified: 14 Nov 2007 17:09
Reporter: Steven Lewis
Status: Can't repeat
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 5.1.22
OS: Linux (FC7)
Assigned to:
CPU Architecture: Any

[13 Nov 2007 11:49] Steven Lewis
Description:
I have 5 machines: 1 management node (10.10.2.1) and 4 data/API nodes (10.10.1.20-23). All data nodes connect fine, as do 3 of the API nodes, but the API node on 10.10.2.20 will not connect.

Config file for all data/API nodes (10.10.1.20-23):

-----------------------------------
[root@dolph mysql]# cat /etc/my.cnf
[MYSQLD]
ndbcluster
ndb-connectstring=10.10.2.1

[mysql_cluster]
ndb-connectstring=10.10.2.1
-----------------------------------

This is the config for the management server (10.10.2.1):

-----------------------------------
[root@moe mysql-cluster]# cat cluster.cnf
# Cluster Example Configuration
# 4 Data Nodes
# 1 Management Node
# 8 MySQLd Node

# Management Node
[ndb_mgmd]
Hostname=10.10.2.1
DataDir=/var/lib/mysql-cluster/

# Data Nodes, Defaults
[ndbd default]
NoOfReplicas=2
DataMemory=15G
IndexMemory=5G
TimeBetweenLocalCheckpoints = 20
NoOfFragmentLogFiles = 16
UndoIndexBuffer=100M
DataDir=/var/lib/mysql-cluster/
MaxNoOfAttributes = 2048
MaxNoOfOrderedIndexes = 1024
MaxNoOfUniqueHashIndexes = 1024
MaxNoOfTriggers = 2048
MaxNoOfConcurrentOperations = 1000000
MaxNoOfLocalOperations = 3000000

# Data Nodes
[ndbd]
Hostname=10.10.2.20

[ndbd]
Hostname=10.10.2.21

[ndbd]
Hostname=10.10.2.22

[ndbd]
Hostname=10.10.2.23

# MySQLd Node
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
-----------------------------------

And this is the output of "show" in ndb_mgm:

------------------------------------
ndb_mgm> show
Connected to Management Server at: 127.0.0.1:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=2    @10.10.2.20  (Version: 5.1.22, Nodegroup: 0)
id=3    @10.10.2.21  (Version: 5.1.22, Nodegroup: 0)
id=4    @10.10.2.22  (Version: 5.1.22, Nodegroup: 1, Master)
id=5    @10.10.2.23  (Version: 5.1.22, Nodegroup: 1)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.10.2.1  (Version: 5.1.22)

[mysqld(API)]   8 node(s)
id=6    @10.10.2.21  (Version: 5.1.22)
id=7    @10.10.2.23  (Version: 5.1.22)
id=8    @10.10.2.22  (Version: 5.1.22)
id=9 (not connected, accepting connect from any host)
id=10 (not connected, accepting connect from any host)
id=11 (not connected, accepting connect from any host)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

ndb_mgm>
-----------------------------------

As you can see, the 10.10.1.20 API node will not connect. I have tried removing it and doing a fresh install using the following RPMs:

-rw-r--r-- 1 root root 6.4M 2007-09-28 17:04 MySQL-client-community-5.1.22-0.rhel5.x86_64.rpm
-rw-r--r-- 1 root root 1.7M 2007-09-28 17:05 MySQL-clusterstorage-community-5.1.22-0.rhel5.x86_64.rpm
-rw-r--r-- 1 root root  20M 2007-09-28 17:23 MySQL-server-community-5.1.22-0.rhel5.x86_64.rpm

This setup works fine on the rest of the data/API nodes.

This is a snippet of the mysqld error log after 3 consecutive restarts. Please note that the same messages appear in the logs of the other, working data/API nodes.

-----------------------------------
[root@dolph mysql]# cat dolph.(servername).com.err
071113 12:39:17 [Note] /usr/sbin/mysqld: Normal shutdown

071113 12:39:17 [Note] Event Scheduler: Purging the queue. 0 events
071113 12:39:17 [Note] Stopping Cluster Binlog
071113 12:39:17 [Note] Stopping Cluster Utility thread
071113 12:39:17  InnoDB: Starting shutdown...
071113 12:39:19  InnoDB: Shutdown completed; log sequence number 0 46409
071113 12:39:19 [Note] /usr/sbin/mysqld: Shutdown complete

071113 12:39:19 mysqld_safe mysqld from pid file /var/lib/mysql/dolph.singlemuslim.com.pid ended
071113 12:39:20 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
071113 12:39:20  InnoDB: Started; log sequence number 0 46409
071113 12:39:20 [Warning] NDB: server id set to zero will cause any other mysqld with bin log to log with wrong server id
071113 12:39:20 [Note] Starting MySQL Cluster Binlog Thread
071113 12:39:20 [Note] Event Scheduler: Loaded 0 events
071113 12:39:20 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.1.22-rc-community'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Edition (GPL)
071113 12:39:28 [Note] /usr/sbin/mysqld: Normal shutdown

071113 12:39:28 [Note] Event Scheduler: Purging the queue. 0 events
071113 12:39:28 [Note] Stopping Cluster Binlog
071113 12:39:28 [Note] Stopping Cluster Utility thread
071113 12:39:30  InnoDB: Starting shutdown...
071113 12:39:32  InnoDB: Shutdown completed; log sequence number 0 46409
071113 12:39:32 [Note] /usr/sbin/mysqld: Shutdown complete

071113 12:39:32 mysqld_safe mysqld from pid file /var/lib/mysql/dolph.singlemuslim.com.pid ended
071113 12:39:33 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
071113 12:39:33  InnoDB: Started; log sequence number 0 46409
071113 12:39:33 [Warning] NDB: server id set to zero will cause any other mysqld with bin log to log with wrong server id
071113 12:39:33 [Note] Starting MySQL Cluster Binlog Thread
071113 12:39:33 [Note] Event Scheduler: Loaded 0 events
071113 12:39:33 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.1.22-rc-community'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Edition (GPL)
071113 12:39:36 [Note] /usr/sbin/mysqld: Normal shutdown

071113 12:39:36 [Note] Event Scheduler: Purging the queue. 0 events
071113 12:39:36 [Note] Stopping Cluster Binlog
071113 12:39:36 [Note] Stopping Cluster Utility thread
071113 12:39:37  InnoDB: Starting shutdown...
071113 12:39:39  InnoDB: Shutdown completed; log sequence number 0 46409
071113 12:39:39 [Note] /usr/sbin/mysqld: Shutdown complete

071113 12:39:39 mysqld_safe mysqld from pid file /var/lib/mysql/dolph.singlemuslim.com.pid ended
071113 12:39:39 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
071113 12:39:39  InnoDB: Started; log sequence number 0 46409
071113 12:39:39 [Warning] NDB: server id set to zero will cause any other mysqld with bin log to log with wrong server id
071113 12:39:39 [Note] Starting MySQL Cluster Binlog Thread
071113 12:39:39 [Note] Event Scheduler: Loaded 0 events
071113 12:39:39 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.1.22-rc-community'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Edition (GPL)
------------------------------------------------

If you need any more info I will be happy to send it.

How to repeat:
This is a consistent problem on one of my machines.
[13 Nov 2007 12:34] Hartmut Holzgraefe
We're sorry, but the bug system is not the appropriate forum for asking help on using MySQL products. Your problem is not the result of a bug.

Support on using our products is available both free in our forums at http://forums.mysql.com/ and for a reasonable fee direct from our skilled support engineers at http://www.mysql.com/support/

Thank you for your interest in MySQL.
[13 Nov 2007 12:38] Magnus Blåudd
You can try to telnet to the ndb_mgmd port from the machine that can't connect. ndb_mgmd speaks a plain-text protocol; two easy commands are "get version" and "get status", each terminated by an extra newline.

So you can do:
$ telnet 10.10.2.1 1186
get version
<newline>
version
id: 327958
major: 5
minor: 1
string: Version 5.1.22 (rc)

get status
<newline>
lots of info here...
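
The same check can be scripted instead of typed into telnet. The following is a minimal Python sketch, assuming only what is described above: that ndb_mgmd accepts a plain-text command terminated by an empty line and answers with a header line followed by "key: value" lines. The names parse_reply and query_mgmd are made up for this illustration and are not part of any MySQL tool, and the assumption that a reply ends with an empty line is taken from how the telnet session looks, not from a protocol specification.

```python
import socket


def parse_reply(text):
    """Split a reply into its header line and a dict of 'key: value' fields."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return None, {}
    header, fields = lines[0], {}
    for ln in lines[1:]:
        key, sep, value = ln.partition(":")
        if sep:  # ignore anything that is not 'key: value'
            fields[key.strip()] = value.strip()
    return header, fields


def query_mgmd(host, port, command, timeout=5.0):
    """Send one command plus an empty line, then read until a blank line or EOF.

    Assumption: the reply ends with an empty line. If it does not, the
    socket timeout ends the read and whatever arrived so far is parsed.
    """
    data = b""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall((command + "\n\n").encode("ascii"))
        try:
            while not data.endswith(b"\n\n"):
                chunk = sock.recv(4096)
                if not chunk:  # peer closed the connection
                    break
                data += chunk
        except socket.timeout:
            pass
    return parse_reply(data.decode("ascii", errors="replace"))
```

Run from the machine that cannot connect, a call such as query_mgmd("10.10.2.1", 1186, "get version") would be expected to return the header "version" and the id, major, minor and string fields. A connection error at this point would suggest a network or firewall problem rather than a mysqld configuration problem.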
[13 Nov 2007 12:46] Steven Lewis
Thanks for your reply. That test worked OK, and here are the results:

[root@dolph ~]# telnet 10.10.2.1 1186
Trying 10.10.2.1...
Connected to 10.10.2.1.
Escape character is '^]'.
get version

version
id: 327958
major: 5
minor: 1
string: Version 5.1.22 (rc)

get status

node status
nodes: 12
node.2.type: NDB
node.2.status: STARTED
node.2.version: 327958
node.2.startphase: 0
node.2.dynamic_id: 6
node.2.node_group: 0
node.2.connect_count: 10
node.2.address: 10.10.2.20
node.3.type: NDB
node.3.status: STARTED
node.3.version: 327958
node.3.startphase: 0
node.3.dynamic_id: 4
node.3.node_group: 0
node.3.connect_count: 0
node.3.address: 10.10.2.21
node.4.type: NDB
node.4.status: STARTED
node.4.version: 327958
node.4.startphase: 0
node.4.dynamic_id: 2
node.4.node_group: 1
node.4.connect_count: 0
node.4.address: 10.10.2.22
node.5.type: NDB
node.5.status: STARTED
node.5.version: 327958
node.5.startphase: 0
node.5.dynamic_id: 3
node.5.node_group: 1
node.5.connect_count: 0
node.5.address: 10.10.2.23
node.1.type: MGM
node.1.status: NO_CONTACT
node.1.version: 327958
node.1.startphase: 0
node.1.dynamic_id: 0
node.1.node_group: 0
node.1.connect_count: 0
node.1.address: 10.10.2.1
node.6.type: API
node.6.status: NO_CONTACT
node.6.version: 327958
node.6.startphase: 0
node.6.dynamic_id: 0
node.6.node_group: 0
node.6.connect_count: 0
node.6.address: 10.10.2.21
node.7.type: API
node.7.status: NO_CONTACT
node.7.version: 327958
node.7.startphase: 0
node.7.dynamic_id: 0
node.7.node_group: 0
node.7.connect_count: 0
node.7.address: 10.10.2.23
node.8.type: API
node.8.status: NO_CONTACT
node.8.version: 327958
node.8.startphase: 0
node.8.dynamic_id: 0
node.8.node_group: 0
node.8.connect_count: 0
node.8.address: 10.10.2.22
node.9.type: API
node.9.status: NO_CONTACT
node.9.version: 0
node.9.startphase: 0
node.9.dynamic_id: 0
node.9.node_group: 0
node.9.connect_count: 0
node.9.address: 0.0.0.0
node.10.type: API
node.10.status: NO_CONTACT
node.10.version: 0
node.10.startphase: 0
node.10.dynamic_id: 0
node.10.node_group: 0
node.10.connect_count: 0
node.10.address: 0.0.0.0
node.11.type: API
node.11.status: NO_CONTACT
node.11.version: 0
node.11.startphase: 0
node.11.dynamic_id: 0
node.11.node_group: 0
node.11.connect_count: 0
node.11.address: 0.0.0.0
node.12.type: API
node.12.status: NO_CONTACT
node.12.version: 0
node.12.startphase: 0
node.12.dynamic_id: 0
node.12.node_group: 0
node.12.connect_count: 0
node.12.address: 0.0.0.0
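
A reply like the one above is easier to read when the node.<id>.<field> lines are grouped per node. The following Python sketch is an illustration only: group_nodes and unused_api_slots are hypothetical helper names, and treating an API slot with address 0.0.0.0 as not yet taken by any mysqld is an assumption based on this output and on the "show" listing earlier in the report.

```python
from collections import defaultdict


def group_nodes(status_text):
    """Group 'node.<id>.<field>: value' lines into {id: {field: value}}."""
    nodes = defaultdict(dict)
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        parts = key.strip().split(".")
        # Only lines shaped like node.<number>.<field> are of interest.
        if sep and len(parts) == 3 and parts[0] == "node" and parts[1].isdigit():
            nodes[int(parts[1])][parts[2]] = value.strip()
    return dict(nodes)


def unused_api_slots(nodes):
    """Return ids of API slots whose address is 0.0.0.0 (assumed unused)."""
    return sorted(
        node_id
        for node_id, fields in nodes.items()
        if fields.get("type") == "API" and fields.get("address") == "0.0.0.0"
    )


# Abbreviated sample taken from the output above.
sample = """node status
nodes: 12
node.2.type: NDB
node.2.address: 10.10.2.20
node.8.type: API
node.8.address: 10.10.2.22
node.9.type: API
node.9.address: 0.0.0.0
"""

nodes = group_nodes(sample)
print(unused_api_slots(nodes))  # [9]
```

Applied to the full output, the same functions list every API slot that still shows 0.0.0.0, and the remaining API entries show which hosts already hold a slot.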
[14 Nov 2007 17:09] Hartmut Holzgraefe
Steven, Magnus, please move this to the cluster mailing list 
<cluster@lists.mysql.com> or the cluster forum at
http://forums.mysql.com/list.php?25

Unless we have proof of a real bug in a problem like this,
discussion of it should be taken off the bug system.