Bug #28948 ndb nodes cannot be restarted.
Submitted: 7 Jun 2007 14:50
Modified: 20 Oct 2008 19:18
Reporter: Tony Shadwick
Status: Can't repeat
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 5.1.19
OS: FreeBSD (Version 6.2)
Assigned to: Magnus Blåudd
CPU Architecture: Any
Tags: Broken, cluster, error 2308, ndb, ndbd, startphase 5

[7 Jun 2007 14:50] Tony Shadwick
Description:
Using a three-computer configuration, with one machine as the management node and the other two acting as both ndb storage nodes and API nodes, the system comes up correctly.  If you attempt to restart a storage node, either via the management interface or by simply restarting the process on the remote system, it fails to rejoin, always with the same error message:

manager# ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> show;
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @172.16.10.30  (Version: 5.1.19, Nodegroup: 0, Master)
id=3    @172.16.10.32  (Version: 5.1.19, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @172.16.10.31  (Version: 5.1.19)

[mysqld(API)]   2 node(s)
id=4    @172.16.10.32  (Version: 5.1.19)
id=5    @172.16.10.30  (Version: 5.1.19)

ndb_mgm> 3 restart
Shutting down nodes with "-n, no start" option, to subsequently start the nodes.
Node 3: Node shutdown initiated
Node 3: Node shutdown completed, restarting, no start.
Node 3 is being restarted

ndb_mgm> Node 3: Forced node shutdown completed. Occured during startphase 5. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'. - Unknown error code: Unknown result: Unknown error code

The only way to get node 3 to rejoin is to back out node 2 and do ndbd --initial all over again on BOTH nodes.  If you don't do --initial on node 2, node 3 will never rejoin.  Attempting to just run ndbd without flags results in the same error again in the management client.

How to repeat:
1. Install databases/mysql51-server from ports.  I used the following in /etc/make.conf:

.if ${.CURDIR:M*/databases/mysql51-server}
BUILD_OPTIMIZED=yes
WITH_NDB=yes
BUILD_STATIC=yes
.endif

2. On the management node, use the following config.ini:

[NDBD DEFAULT]
NoOfReplicas=2

[NDB_MGMD]
hostname=172.16.10.31
datadir=/usr/local/etc/mysql-cluster

[TCP DEFAULT]

# Section for the storage nodes
[NDBD]
# IP address of the first storage node
HostName=172.16.10.30 # mysql1
DataDir= /usr/local/etc/mysql-cluster

[NDBD]
# IP address of the second storage node
HostName=172.16.10.32 # mysql2
DataDir= /usr/local/etc/mysql-cluster

# one [MYSQLD] per storage node
[MYSQLD]
[MYSQLD]

3. On the ndb/api nodes, use the following my.cnf:

[mysqld]
ndbcluster
ndb-connectstring=172.16.10.31

[mysql_cluster]
ndb-connectstring=172.16.10.31

4. Start the management daemon.  The command is slightly different from other *nix platforms:

/usr/local/libexec/ndb_mgmd -f /usr/local/etc/mysql-cluster/config.ini

(Use the actual path to config.ini)

5. On the ndb nodes, start the ndb storage daemon for the first time:

/usr/local/libexec/ndbd --defaults-extra-file=/usr/local/etc/my.cnf --initial

(the defaults file flag may be left off, depending upon the location of my.cnf)

6. Optionally, start the mysql service on the ndb/api nodes.  Add mysql_enable="YES" to /etc/rc.conf, then run /usr/local/etc/rc.d/mysql-server start.

7. On the management node, enter the management client:

/usr/local/bin/ndb_mgm

show;

8. Attempt to restart one of the ndb nodes.

Suggested fix:
I have absolutely no idea. :)  I'm not sure why it's failing to begin with, to be honest.  I first stumbled across this after I wrote rc.d scripts to start and stop all of the cluster daemons, since the FreeBSD port builds the binaries but doesn't include start/stop scripts.  I thought it was a problem with the method I was using to stop the daemons, and that it had to be done via the management client.  It turns out the error occurs no matter what method you use to restart the ndb node.
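For reference, a minimal rc.d wrapper for ndbd following the standard FreeBSD rc.subr conventions might look like this (an untested sketch; the ndbd_enable knob and the connect string are illustrative, not something the port ships):

```
#!/bin/sh
# PROVIDE: ndbd
# REQUIRE: NETWORKING
# KEYWORD: shutdown

. /etc/rc.subr

name="ndbd"
rcvar=ndbd_enable
command="/usr/local/libexec/ndbd"
command_args="--ndb-connectstring=172.16.10.31"

load_rc_config $name
: ${ndbd_enable:="NO"}

run_rc_command "$1"
```

With this in /usr/local/etc/rc.d/ndbd and ndbd_enable="YES" in /etc/rc.conf, the daemon can be driven with the usual `service`-style start/stop/restart verbs.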
[7 Jun 2007 18:25] Tony Shadwick
I forgot to include error logs from node 3:

Time: Thursday 7 June 2007 - 14:30:26
Status: Temporary error, restart node
Message: Another node failed during system restart, please investigate error(s) on other node(s) (Restart error)
Error: 2308
Error data: Node failure during restart
Error object: QMGR (Line: 3138) 0x0000000a
Program: /usr/local/libexec/ndbd
Pid: 3381
Trace: /usr/local/etc/mysql-cluster/ndb_3_trace.log.21
Version: Version 5.1.19 (beta)
***EOM***

From node1 (the management node):

out.log
NDB Cluster Management Server. Version 5.1.19 (beta)
Id: 1, Command port: 1186
asked to stop 3
asked to stop 3

cluster.log
2007-06-07 09:30:00 [MgmSrvr] INFO     -- Node 3: Node shutdown initiated
2007-06-07 09:30:10 [MgmSrvr] ALERT    -- Node 2: Node 3 Disconnected
2007-06-07 09:30:10 [MgmSrvr] INFO     -- Node 2: Communication to Node 3 closed
2007-06-07 09:30:10 [MgmSrvr] ALERT    -- Node 2: Network partitioning - arbitration required
2007-06-07 09:30:10 [MgmSrvr] INFO     -- Node 2: President restarts arbitration thread [state=7]
2007-06-07 09:30:10 [MgmSrvr] ALERT    -- Node 1: Node 3 Disconnected
2007-06-07 09:30:10 [MgmSrvr] ALERT    -- Node 2: Arbitration won - positive reply from node 1
2007-06-07 09:30:10 [MgmSrvr] INFO     -- Node 2: DICT: lock bs: 0 ops: 0 poll: 0 cnt: 0 queue: 
2007-06-07 09:30:10 [MgmSrvr] INFO     -- Node 3: Node shutdown completed, restarting, no start.
2007-06-07 09:30:10 [MgmSrvr] INFO     -- Mgmt server state: nodeid 3 reserved for ip 172.16.10.32, m_reserved_nodes 000000000000002a.
2007-06-07 09:30:10 [MgmSrvr] INFO     -- Node 2: Started arbitrator node 1 [ticket=049a000206982bd3]
2007-06-07 09:30:11 [MgmSrvr] INFO     -- Node 1: Node 3 Connected
2007-06-07 09:30:12 [MgmSrvr] INFO     -- Node 3: Communication to Node 2 opened
2007-06-07 09:30:12 [MgmSrvr] INFO     -- Node 3: Waiting 30 sec for nodes 0000000000000004 to connect, nodes [ all: 000000000000000c connected: 0000000000000008 no-wait: 0000000000000000 ]
2007-06-07 09:30:12 [MgmSrvr] INFO     -- Mgmt server state: nodeid 3 freed, m_reserved_nodes 0000000000000022.
2007-06-07 09:30:14 [MgmSrvr] INFO     -- Node 2: Communication to Node 3 opened
2007-06-07 09:30:14 [MgmSrvr] INFO     -- Node 2: Node 3 Connected
2007-06-07 09:30:14 [MgmSrvr] INFO     -- Node 2: Node 3: API version 5.1.19
2007-06-07 09:30:14 [MgmSrvr] INFO     -- Node 3: Node 2 Connected
2007-06-07 09:30:14 [MgmSrvr] INFO     -- Node 3: CM_REGCONF president = 2, own Node = 3, our dynamic id = 3
2007-06-07 09:30:14 [MgmSrvr] INFO     -- Node 3: Node 2: API version 5.1.19
2007-06-07 09:30:14 [MgmSrvr] INFO     -- Node 3: Start phase 1 completed 
2007-06-07 09:30:14 [MgmSrvr] INFO     -- Node 3: Start phase 2 completed (node restart)
2007-06-07 09:30:14 [MgmSrvr] INFO     -- Node 2: DICT: locked by node 3 for NodeRestart
2007-06-07 09:30:14 [MgmSrvr] INFO     -- Node 2: DICT: lock bs: 4 ops: 0 poll: 0 cnt: 0 queue: 3L
2007-06-07 09:30:14 [MgmSrvr] INFO     -- Node 3: Start phase 3 completed (node restart)
2007-06-07 09:30:14 [MgmSrvr] INFO     -- Node 3: Start phase 4 completed (node restart)
2007-06-07 09:30:14 [MgmSrvr] INFO     -- Node 3: Receive arbitrator node 1 [ticket=049a000206982bd3]
2007-06-07 09:30:19 [MgmSrvr] WARNING  -- Node 3: Node 2 missed heartbeat 2
2007-06-07 09:30:20 [MgmSrvr] WARNING  -- Node 3: Node 2 missed heartbeat 3
2007-06-07 09:30:21 [MgmSrvr] ALERT    -- Node 1: Node 3 Disconnected
2007-06-07 09:30:21 [MgmSrvr] ALERT    -- Node 1: Node 3 Disconnected
2007-06-07 09:30:22 [MgmSrvr] WARNING  -- Node 2: Node 4 missed heartbeat 2
2007-06-07 09:30:22 [MgmSrvr] ALERT    -- Node 2: Node 4 Disconnected
2007-06-07 09:30:22 [MgmSrvr] INFO     -- Node 2: Communication to Node 4 closed
2007-06-07 09:30:23 [MgmSrvr] ALERT    -- Node 2: Node 3 Disconnected
2007-06-07 09:30:23 [MgmSrvr] INFO     -- Node 2: Communication to Node 3 closed
2007-06-07 09:30:23 [MgmSrvr] ALERT    -- Node 2: Network partitioning - arbitration required
2007-06-07 09:30:23 [MgmSrvr] INFO     -- Node 2: President restarts arbitration thread [state=7]
2007-06-07 09:30:23 [MgmSrvr] ALERT    -- Node 2: Arbitration won - positive reply from node 1
2007-06-07 09:30:23 [MgmSrvr] INFO     -- Node 2: DICT: remove lock by failed node 3 for NodeRestart
2007-06-07 09:30:23 [MgmSrvr] INFO     -- Node 2: DICT: lock bs: 0 ops: 0 poll: 0 cnt: 0 queue: 
2007-06-07 09:30:23 [MgmSrvr] ALERT    -- Node 3: Forced node shutdown completed. Occured during startphase 5. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'.
2007-06-07 09:30:23 [MgmSrvr] INFO     -- Node 2: Started arbitrator node 1 [ticket=049a000306985d69]
2007-06-07 09:30:26 [MgmSrvr] INFO     -- Node 2: Communication to Node 3 opened
2007-06-07 09:30:26 [MgmSrvr] INFO     -- Node 2: Communication to Node 4 opened
2007-06-07 09:30:26 [MgmSrvr] INFO     -- Node 2: Node 4 Connected
2007-06-07 09:30:27 [MgmSrvr] INFO     -- Node 2: Node 4: API version 5.1.19
2007-06-07 09:55:13 [MgmSrvr] INFO     -- Node 2: Local checkpoint 20 started. Keep GCI = 28883 oldest restorable GCI = 21
2007-06-07 10:51:36 [MgmSrvr] INFO     -- Node 2: Local checkpoint 21 started. Keep GCI = 28883 oldest restorable GCI = 21
2007-06-07 11:48:02 [MgmSrvr] INFO     -- Node 2: Local checkpoint 22 started. Keep GCI = 32084 oldest restorable GCI = 21
2007-06-07 12:44:24 [MgmSrvr] INFO     -- Node 2: Local checkpoint 23 started. Keep GCI = 33683 oldest restorable GCI = 21
[7 Jun 2007 18:28] Tony Shadwick
Just updating the tags for this bug.
[8 Jun 2007 19:07] Tony Shadwick
I've been continuing to (somewhat blindly) hack away at this problem.  Thinking that perhaps this is something similar to a RAID system, where more than one node needs to remain available after a failure, I created a third system with the same configuration as the first two and updated config.ini to reflect it.  I got the same behavior.

As an act of frustration, I went through and directly specified node IDs in all of the my.cnf files, and when starting ndbd I now include the nodeid in my connect string.  An example of the my.cnf file I'm using:

[mysqld]
ndbcluster
ndb-connectstring="nodeid=6;172.16.10.31"
bind-address = 172.16.10.32

[mysql_cluster]
ndb-connectstring="nodeid=6;172.16.10.31"

When I run my initial ndbd, I run this command:

/usr/local/libexec/ndbd --initial --ndb-connectstring="nodeid=3;172.16.10.31"

I got all three boxes up, just as before, and then tried rebooting one.  For the first time, it came back up and rejoined the cluster as normal.  I was stunned.  I tried rebooting a different one, and then I got the same error as before.

I vaguely recall from my googling that there was a bug where nodeids were being "stolen", causing these types of problems.  Perhaps the bugs are related?
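If nodeid stealing is the culprit, pinning the IDs on the management side as well might help. A sketch of what I mean for config.ini (untested; the IDs are chosen to match the `show` output above):

```
[NDBD]
Id=2
HostName=172.16.10.30
DataDir=/usr/local/etc/mysql-cluster

[NDBD]
Id=3
HostName=172.16.10.32
DataDir=/usr/local/etc/mysql-cluster

[MYSQLD]
Id=4
HostName=172.16.10.32

[MYSQLD]
Id=5
HostName=172.16.10.30
```

With every slot tied to an Id and HostName, no connecting process should be able to grab a nodeid reserved for a different machine.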
[8 Jun 2007 20:29] Tony Shadwick
Output from ndb_error_reporter

Attachment: ndb_error_report_20070608152712.tar.bz2 (application/octet-stream, text), 14 bytes.

[8 Jun 2007 21:18] Tony Shadwick
Scratch that.  bug-data-28948.tar.bz2 uploaded to the ftp site.  I had problems getting correct output from ndb_error_reporter; fixed now.  The script simply did tar c $reportdir|bzip, rather than tar cf - $reportdir|bzip2, which is correct.
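To illustrate the tar fix just described, here is a small sketch using a throwaway directory in place of the real ndb_error_reporter output (all path names here are made up). Without `-f -`, the archive never reaches stdout, so the pipe compresses an empty stream; with it, the archive bytes flow through the pipe as intended:

```shell
set -e

# Stand-in for the directory ndb_error_reporter would produce.
reportdir=$(mktemp -d)
echo "sample log" > "$reportdir/ndb_error_report.txt"

# Correct form: `tar cf -` writes the archive to stdout, so bzip2
# compresses real archive data rather than an empty stream.
tar cf - -C "$(dirname "$reportdir")" "$(basename "$reportdir")" | bzip2 > report.tar.bz2

# Listing the compressed archive confirms the report file made it in.
bzip2 -dc report.tar.bz2 | tar tf - | grep ndb_error_report.txt
```

The resulting report.tar.bz2 is a normal compressed tarball, in contrast to the 14-byte attachment produced by the broken invocation.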
[11 Jun 2007 15:19] Tony Shadwick
Not meaning to be impatient, but is anyone alive out there?  If I need to purchase a support contract or something I will, but I'm at wits' end here.  I have absolutely no idea what's wrong.  Everything I do at this point is merely random hacking without any sense of direction.
[11 Jun 2007 15:26] Tony Shadwick
Output from mysqlbug

Attachment: mysqlbug.txt (application/x-download, text), 1.86 KiB.

[11 Jun 2007 15:33] Tony Shadwick
Output from mysqlbug - first one was truncated.

Attachment: mysqlbug.txt (application/x-download, text), 7.55 KiB.

[11 Jun 2007 19:37] Tony Shadwick
I've managed to further narrow this down.  A snippet from 172.16.10.30's error log:

Time: Monday 11 June 2007 - 13:11:38
Status: Temporary error, restart node
Message: Another node failed during system restart, please investigate error(s) on other node(s) (Restart error)
Error: 2308
Error data: Node 4 disconnected
Error object: QMGR (Line: 2648) 0x00000008
Program: /usr/local/libexec/ndbd
Pid: 43700
Trace: /usr/local/etc/mysql-cluster/ndb_2_trace.log.25
Version: Version 5.1.19 (beta)
***EOM***

It points to a trace file.  I'm attaching that trace to go with it.
[11 Jun 2007 19:39] Tony Shadwick
Trace to go with error log snippet.

Attachment: bug_data-28948-ndb_2_trace.log.25.gz (application/x-gzip, text), 47.74 KiB.

[12 Jun 2007 18:10] Tony Shadwick
For grins, I rolled all of the boxes back to 5.0.41, and lo and behold... they all work.  So this is a bug that was introduced in 5.1.
[18 Jun 2007 9:20] Stewart Smith
Please look at the logs for the other node; the error log here is from a node that shut down because the other one failed (during system restart... which is interesting).
[18 Jun 2007 19:03] Tony Shadwick
I've attached all relevant logs.  I have since rolled back to 5.0.x, and the problem does not occur there.  You should have logs from all of my systems attached to this bug.
[20 Jun 2007 22:09] Hartmut Holzgraefe
The cluster log snippet you posted earlier on shows that after the starting node finished start phase 4 and proceeded to phase 5 it failed to receive heartbeat signals from the started node. So from the node #3 point of view it is indeed the case that another node failed while this node was starting. It is a bit strange though that this is reported as "Another node failed during system restart" even though this is just a single node restart and not a complete system one.

Anyway, is there anything unusual going on on node 2 at that time? Especially swapping on node 2 during the synchronization with the starting node could explain this ...
[21 Jun 2007 15:59] Tony Shadwick
Nothing unusual at all.  These were fresh OS installs; literally the only things running, other than the kernel and typical system processes, were the mysql processes.  These boxes are all P4s with 1.5 GB of RAM.  Basically a testing ground to see how ndb works.

Exactly the same config, right down to the config files, works beautifully under 5.0, but under 5.1 you get the behavior you see here.
[25 Jun 2007 17:32] Tony Shadwick
I guess it's also worth noting that I hadn't yet put any data into the database.  This was as fresh as fresh gets. :\
[30 Jan 2008 11:24] Valeriy Kravchuk
Please try to repeat this with a newer version, 5.1.22, and inform us about the results.
[1 Mar 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[1 Mar 2008 0:43] Tony Shadwick
I cannot. :(  I don't have the environment available to do that anymore.
[20 Oct 2008 19:18] Magnus Blåudd
The most likely reason for the 5.1 version's failure to start is the "missed heartbeats" during its node restart. It could most likely have been fixed by increasing HeartbeatIntervalDbDb to a higher value. See http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-ndbd-definition.html#mysql-cluster-pa... ndbd-definition-heartbeatintervaldbdb for more info about that.
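As a concrete sketch of that workaround (the value is illustrative, not tuned for this particular setup), the interval goes in the [NDBD DEFAULT] section of config.ini:

```
[NDBD DEFAULT]
NoOfReplicas=2
# Default is 1500 ms; a data node is declared dead after four
# consecutive missed heartbeats, so 5000 ms here tolerates roughly
# 20 seconds of unresponsiveness during a node restart.
HeartbeatIntervalDbDb=5000
```

The tradeoff is that a genuinely dead node also takes correspondingly longer to be detected and excluded from the cluster.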

The 5.1 version is a little more demanding on the machines, and that could be the reason for the difference seen from 5.0.

Since the problem can't be reproduced with the latest version of MySQL Cluster, the bug will now be closed. Please don't hesitate to file a new bug or ask in our forums if you encounter this or a similar problem again. Sorry for the slow response.