Description:
The cluster is not stable with a large number of NDB storage nodes (8 nodes) when each node holds the only copy of its data (NoOfReplicas=1). When any one of the storage nodes is killed, the whole cluster goes down and all storage nodes terminate.
The error message "Error handler shutting down system, Error handler shutdown completed - exiting" is printed to the output log on all storage nodes.
The cluster works fine with 4 nodes.
The configuration and log files are listed below (the cluster log is not included).
--------------------------------------------------------------------------------------
config.ini
--------------------------------------------------------------------------------------
[NDBD DEFAULT]
NoOfReplicas=1
DataMemory=250M
IndexMemory=250M
[MYSQLD DEFAULT]
[NDB_MGMD DEFAULT]
[TCP DEFAULT]
# Management Server
[NDB_MGMD]
Id=1
HostName=10.130.50.31 # the IP of THIS SERVER
# Storage Engines
[NDBD]
Id=2
HostName=host2
DataDir=/var/lib/mysql/cluster
ServerPort=2203
[NDBD]
Id=3
HostName=host3
DataDir=/var/lib/mysql/cluster
ServerPort=2203
[NDBD]
Id=4
HostName=host4
DataDir=/var/lib/mysql/cluster
ServerPort=2203
[NDBD]
Id=5
HostName=host5
DataDir=/var/lib/mysql/cluster
ServerPort=2203
[NDBD]
Id=6
HostName=host6
DataDir=/var/lib/mysql/cluster
ServerPort=2203
[NDBD]
Id=7
HostName=host7
DataDir=/var/lib/mysql/cluster
ServerPort=2203
[NDBD]
Id=8
HostName=host8
DataDir=/var/lib/mysql/cluster
ServerPort=2203
[NDBD]
Id=9
HostName=host9
DataDir=/var/lib/mysql/cluster
ServerPort=2203
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
--------------------------------------------------------------------------------------
/var/lib/mysql/cluster/ndb_2_error.log
--------------------------------------------------------------------------------------
Date/Time: Monday 31 October 2005 - 16:03:42
Type of error: error
Message: Arbitrator shutdown
Fault ID: 2305
Problem data: Arbitrator decided to shutdown this node
Object of reference: QMGR (Line: 3795) 0x0000000a
ProgramName: /usr/sbin/ndbd
ProcessID: 4397
TraceFile: /var/lib/mysql/cluster/ndb_2_trace.log.10
Version 4.1.14
***EOM***
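Since the same failure is reported on all storage nodes, it can help to compare the error-log entries from each node side by side. The following is a minimal sketch, assuming every entry has the "Key: value" layout shown above and ends with the `***EOM***` marker. The function name `parse_error_log` and the embedded sample are illustrative only and are not part of any NDB tool.

```python
# Hypothetical helper: split an ndb_<id>_error.log into one dict per entry.
# Assumes the layout shown in this report: "Key: value" lines, a bare
# "Version x.y.z" line, and "***EOM***" terminating each entry.

SAMPLE = """\
Date/Time: Monday 31 October 2005 - 16:03:42
Type of error: error
Message: Arbitrator shutdown
Fault ID: 2305
Problem data: Arbitrator decided to shutdown this node
Object of reference: QMGR (Line: 3795) 0x0000000a
ProgramName: /usr/sbin/ndbd
ProcessID: 4397
TraceFile: /var/lib/mysql/cluster/ndb_2_trace.log.10
Version 4.1.14
***EOM***
"""


def parse_error_log(text):
    """Return a list of dicts, one per error entry found in `text`."""
    entries = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line == "***EOM***":
            # End of one entry: store it and start a fresh one.
            if current:
                entries.append(current)
            current = {}
        elif line.startswith("Version "):
            # The version line has no colon, so handle it separately.
            current["Version"] = line[len("Version "):]
        elif ": " in line:
            # Split on the first ": " only, so values may contain colons.
            key, value = line.split(": ", 1)
            current[key] = value
    return entries


if __name__ == "__main__":
    for entry in parse_error_log(SAMPLE):
        print(entry["Fault ID"], entry["Message"], entry["Object of reference"])
```

Running the same function over the error log of each node would show whether every node reports the same fault ID and message as node 2.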
--------------------------------------------------------------------------------------
/var/lib/mysql/cluster/ndb_2_out.log
--------------------------------------------------------------------------------------
2005-10-31 15:57:30 [NDB] INFO -- Angel pid: 4049 ndb pid: 4050
2005-10-31 15:57:30 [NDB] INFO -- NDB Cluster -- DB node 2
2005-10-31 15:57:30 [NDB] INFO -- Version 4.1.14 --
2005-10-31 15:57:30 [NDB] INFO -- Configuration fetched at 10.133.55.51 port 1186
2005-10-31 16:01:51 [NDB] INFO -- Angel pid: 4396 ndb pid: 4397
2005-10-31 16:01:51 [NDB] INFO -- NDB Cluster -- DB node 2
2005-10-31 16:01:51 [NDB] INFO -- Version 4.1.14 --
2005-10-31 16:01:51 [NDB] INFO -- Configuration fetched at 10.133.55.51 port 1186
Error handler shutting down system
Error handler shutdown completed - exiting
How to repeat:
Very easy to reproduce:
-- configure a cluster with 8 storage nodes and set NoOfReplicas=1
-- start the cluster
-- kill the ndbd process on any one of the storage nodes
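To switch quickly between the failing 8-node setup and the working 4-node setup, the config.ini shown above can be generated rather than edited by hand. This is a minimal sketch: the function `build_config` and its parameters are hypothetical, and the host names (`host2`, `host3`, ...) simply follow the naming pattern used in this report.

```python
# Hypothetical generator for the config.ini layout used in this report.
# Node ids start at 2 because Id=1 is taken by the management server.

def build_config(num_data_nodes=8, replicas=1, mgm_host="10.130.50.31",
                 data_dir="/var/lib/mysql/cluster", server_port=2203,
                 num_sql_slots=8):
    """Return the text of a config.ini with `num_data_nodes` [NDBD] sections."""
    lines = [
        "[NDBD DEFAULT]",
        f"NoOfReplicas={replicas}",
        "DataMemory=250M",
        "IndexMemory=250M",
        "[MYSQLD DEFAULT]",
        "[NDB_MGMD DEFAULT]",
        "[TCP DEFAULT]",
        "# Management Server",
        "[NDB_MGMD]",
        "Id=1",
        f"HostName={mgm_host}",
        "# Storage Engines",
    ]
    for node_id in range(2, 2 + num_data_nodes):
        lines += [
            "[NDBD]",
            f"Id={node_id}",
            f"HostName=host{node_id}",  # assumed naming pattern
            f"DataDir={data_dir}",
            f"ServerPort={server_port}",
        ]
    # Empty [MYSQLD] sections are free slots for SQL/API nodes.
    lines += ["[MYSQLD]"] * num_sql_slots
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the 8-node variant; pass 4 to get the variant that works.
    print(build_config(8), end="")
```

Calling `build_config(4)` produces the 4-node variant with every other setting unchanged, which keeps the two test runs comparable.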