Description:
id=3 @192.168.xx.xxx (mysql-5.1.23 ndb-6.2.15, starting, Nodegroup: 0)
There are two mysql data nodes, 2 and 3. Two is always up. We stop 3 for taking backups, after which it is started again.
Offlate 3 does not start again, but stays in this state .
2010-11-16 11:47:54 [MgmSrvr] INFO -- Node 2: DICT: remove lock by failed node 3 for NodeRestart
2010-11-16 11:47:54 [MgmSrvr] INFO -- Node 2: DICT: lock bs: 0 ops: 0 poll: 0 cnt: 0 queue:
2010-11-16 11:47:54 [MgmSrvr] INFO -- Node 2: Started arbitrator node 1 [ticket=6f63000652245b08]
2010-11-16 11:47:58 [MgmSrvr] INFO -- Node 2: Communication to Node 3 opened
2010-11-16 11:48:27 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 reserved for ip 192.168.20.101, m_reserved_nodes 000000000000000000000000000000000000000000000000000000000000001a.
2010-11-16 11:48:28 [MgmSrvr] INFO -- Node 1: Node 3 Connected
2010-11-16 11:48:28 [MgmSrvr] INFO -- Node 2: Node 3 Connected
2010-11-16 11:48:28 [MgmSrvr] INFO -- Node 2: Node 3: API mysql-5.1.23 ndb-6.2.15
2010-11-16 11:48:28 [MgmSrvr] INFO -- Node 3: Start phase 2 completed (node restart)
2010-11-16 11:48:28 [MgmSrvr] INFO -- Node 2: DICT: locked by node 3 for NodeRestart
2010-11-16 11:48:28 [MgmSrvr] INFO -- Node 2: DICT: lock bs: 4 ops: 0 poll: 0 cnt: 0 queue: 3L
2010-11-16 11:48:28 [MgmSrvr] INFO -- Node 3: Start phase 3 completed (node restart)
2010-11-16 11:48:28 [MgmSrvr] INFO -- Node 3: Start phase 4 completed (node restart)
2010-11-16 11:48:28 [MgmSrvr] INFO -- Node 3: Receive arbitrator node 1 [ticket=6f63000652245b08]
2010-11-16 11:48:28 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 freed, m_reserved_nodes 0000000000000000000000000000000000000000000000000000000000000012.
2010-11-16 11:48:36 [MgmSrvr] INFO -- Mgmt server state: nodeid 5 reserved for ip 192.168.20.101, m_reserved_nodes 0000000000000000000000000000000000000000000000000000000000000032.
2010-11-16 11:48:36 [MgmSrvr] INFO -- Node 5: mysqld --server-id=0
2010-11-16 11:48:36 [MgmSrvr] INFO -- Node 2: Node 5 Connected
2010-11-16 11:48:36 [MgmSrvr] INFO -- Node 2: Node 5: API mysql-5.1.23 ndb-6.2.15
2010-11-16 12:32:52 [MgmSrvr] ALERT -- Node 2: Node 5 Disconnected
2010-11-16 12:32:52 [MgmSrvr] INFO -- Node 2: Communication to Node 5 closed
2010-11-16 12:32:52 [MgmSrvr] INFO -- Node 3: Communication to Node 5 closed
2010-11-16 12:32:53 [MgmSrvr] INFO -- Mgmt server state: nodeid 5 freed, m_reserved_nodes 0000000000000000000000000000000000000000000000000000000000000012.
2010-11-16 12:32:56 [MgmSrvr] INFO -- Node 2: Communication to Node 5 opened
2010-11-16 12:33:11 [MgmSrvr] ALERT -- Node 3: Forced node shutdown completed. Initiated by signal 9.
2010-11-16 12:33:11 [MgmSrvr] ALERT -- Node 2: Node 3 Disconnected
2010-11-16 12:33:11 [MgmSrvr] INFO -- Node 2: Communication to Node 3 closed
2010-11-16 12:33:11 [MgmSrvr] ALERT -- Node 2: Network partitioning - arbitration required
2010-11-16 12:33:11 [MgmSrvr] INFO -- Node 2: President restarts arbitration thread [state=7]
2010-11-16 12:33:11 [MgmSrvr] ALERT -- Node 1: Node 3 Disconnected
2010-11-16 12:33:11 [MgmSrvr] ALERT -- Node 2: Arbitration won - positive reply from node 1
2010-11-16 12:33:11 [MgmSrvr] INFO -- Node 2: DICT: remove lock by failed node 3 for NodeRestart
2010-11-16 12:33:11 [MgmSrvr] INFO -- Node 2: DICT: lock bs: 0 ops: 0 poll: 0 cnt: 0 queue:
2010-11-16 12:33:11 [MgmSrvr] INFO -- Node 2: Started arbitrator node 1 [ticket=6f630007524dd0b6]
2010-11-16 12:33:15 [MgmSrvr] INFO -- Node 2: Communication to Node 3 opened
2010-11-16 12:33:38 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 reserved for ip 192.168.20.101, m_reserved_nodes 000000000000000000000000000000000000000000000000000000000000001a.
2010-11-16 12:33:39 [MgmSrvr] INFO -- Node 1: Node 3 Connected
2010-11-16 12:33:39 [MgmSrvr] INFO -- Node 2: Node 3 Connected
2010-11-16 12:33:39 [MgmSrvr] INFO -- Node 2: Node 3: API mysql-5.1.23 ndb-6.2.15
2010-11-16 12:33:39 [MgmSrvr] INFO -- Node 3: Start phase 2 completed (node restart)
2010-11-16 12:33:39 [MgmSrvr] INFO -- Node 2: DICT: locked by node 3 for NodeRestart
2010-11-16 12:33:39 [MgmSrvr] INFO -- Node 2: DICT: lock bs: 4 ops: 0 poll: 0 cnt: 0 queue: 3L
2010-11-16 12:33:39 [MgmSrvr] INFO -- Node 3: Start phase 3 completed (node restart)
2010-11-16 12:33:39 [MgmSrvr] INFO -- Node 3: Start phase 4 completed (node restart)
2010-11-16 12:33:39 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 freed, m_reserved_nodes 0000000000000000000000000000000000000000000000000000000000000012.
2010-11-16 12:33:40 [MgmSrvr] INFO -- Node 3: Receive arbitrator node 1 [ticket=6f630007524dd0b6]
2010-11-16 12:33:47 [MgmSrvr] INFO -- Mgmt server state: nodeid 5 reserved for ip 192.168.20.101, m_reserved_nodes 0000000000000000000000000000000000000000000000000000000000000032.
2010-11-16 12:33:47 [MgmSrvr] INFO -- Node 5: mysqld --server-id=0
2010-11-16 12:33:48 [MgmSrvr] INFO -- Node 2: Node 5 Connected
2010-11-16 12:33:48 [MgmSrvr] INFO -- Node 2: Node 5: API mysql-5.1.23 ndb-6.2.15
# tail -n 50 ndb_3_out.log
2010-11-16 11:24:33 [ndbd] INFO -- Configuration fetched at 192.168.10.150 port 1186
2010-11-16 11:24:33 [ndbd] INFO -- Start initiated (mysql-5.1.23 ndb-6.2.15)
2010-11-16 11:24:33 [ndbd] INFO -- Ndbd_mem_manager::init(1) min: 80Mb initial: 100Mb
Adding 100Mb to ZONE_LO (1,3199)
WOPool::init(61, 9)
RWPool::init(22, 13)
RWPool::init(42, 18)
RWPool::init(62, 13)
RWPool::init(c2, 18)
RWPool::init(e2, 15)
WOPool::init(41, 8)
RWPool::init(82, 12)
RWPool::init(a2, 51)
WOPool::init(21, 6)
2010-11-16 11:44:48 [ndbd] ALERT -- Node 3: Forced node shutdown completed. Initiated by signal 9.
2010-11-16 11:45:22 [ndbd] INFO -- Angel pid: 13131 ndb pid: 13132
2010-11-16 11:45:22 [ndbd] INFO -- NDB Cluster -- DB node 3
2010-11-16 11:45:22 [ndbd] INFO -- mysql-5.1.23 ndb-6.2.15 --
2010-11-16 11:45:22 [ndbd] INFO -- Configuration fetched at 192.168.10.150 port 1186
2010-11-16 11:45:22 [ndbd] INFO -- Start initiated (mysql-5.1.23 ndb-6.2.15)
2010-11-16 11:45:22 [ndbd] INFO -- Ndbd_mem_manager::init(1) min: 80Mb initial: 100Mb
Adding 100Mb to ZONE_LO (1,3199)
WOPool::init(61, 9)
RWPool::init(22, 13)
RWPool::init(42, 18)
RWPool::init(62, 13)
RWPool::init(c2, 18)
RWPool::init(e2, 15)
WOPool::init(41, 8)
RWPool::init(82, 12)
RWPool::init(a2, 51)
WOPool::init(21, 6)
2010-11-16 12:30:05 [ndbd] ALERT -- Node 3: Forced node shutdown completed. Initiated by signal 9.
2010-11-16 12:30:33 [ndbd] INFO -- Angel pid: 13819 ndb pid: 13820
2010-11-16 12:30:33 [ndbd] INFO -- NDB Cluster -- DB node 3
2010-11-16 12:30:33 [ndbd] INFO -- mysql-5.1.23 ndb-6.2.15 --
2010-11-16 12:30:33 [ndbd] INFO -- Configuration fetched at 192.168.10.150 port 1186
2010-11-16 12:30:33 [ndbd] INFO -- Start initiated (mysql-5.1.23 ndb-6.2.15)
2010-11-16 12:30:33 [ndbd] INFO -- Ndbd_mem_manager::init(1) min: 80Mb initial: 100Mb
Adding 100Mb to ZONE_LO (1,3199)
WOPool::init(61, 9)
RWPool::init(22, 13)
RWPool::init(42, 18)
RWPool::init(62, 13)
RWPool::init(c2, 18)
RWPool::init(e2, 15)
WOPool::init(41, 8)
RWPool::init(82, 12)
RWPool::init(a2, 51)
WOPool::init(21, 6)
The ndb_error log is not getting updated..... It dates back to May...
ndb_trace log also is not getting updated...... Its dates back to May..
How to repeat:
Just restart the server, I have to shutdown the whole cluster and do a rolling restart to get things back to normal...
This means downtime :(