Bug #56747 NDB node shuts down for unknown reason
Submitted: 13 Sep 2010 10:09 Modified: 16 Oct 2010 8:55
Reporter: raymond guo
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S2 (Serious)
Version: mysql-cluster-gpl-noinstall-7.1.3-win32 OS: Microsoft Windows (2003 server)
Assigned to: CPU Architecture: Any
Tags: cluster

[13 Sep 2010 10:09] raymond guo
Description:
My cluster's NDB node has shut down several times. The ndbd log output is printed below:

2010-09-10 13:11:05 [ndbd] INFO     -- Watchdog: User time: 6406250  System time: 43906250
2010-09-10 13:11:05 [ndbd] WARNING  -- Watchdog: Warning overslept 281 ms, expected 100 ms.
WARNING: timerHandlingLab now: 12928569066645 sent: 12928569066567 diff: 78
2010-09-10 13:11:06 [ndbd] INFO     -- Watchdog: User time: 6406250  System time: 43906250
2010-09-10 13:11:06 [ndbd] WARNING  -- Watchdog: Warning overslept 296 ms, expected 100 ms.
WARNING: timerHandlingLab now: 12928569095895 sent: 12928569095817 diff: 78
2010-09-10 13:12:38 [ndbd] INFO     -- findNeighbours from: 1955 old (left: 65535 right: 65535) new (3 3)
2010-09-10 13:12:44 [ndbd] INFO     -- granting dict lock to 3
2010-09-10 13:12:44 [ndbd] INFO     -- clearing dict lock for 3
2010-09-10 13:12:45 [ndbd] INFO     -- granting dict lock to 3
prepare to handover bucket: 1
377173/0 (377172/4294967295) switchover complete bucket 1 state: 2handover
2010-09-10 13:12:50 [ndbd] INFO     -- clearing dict lock for 3
WARNING: timerHandlingLab now: 12928569170177 sent: 12928569170114 diff: 63
WARNING: timerHandlingLab now: 12928569178974 sent: 12928569178911 diff: 63
WARNING: timerHandlingLab now: 12928585357645 sent: 12928585357317 diff: 328
2010-09-10 17:42:37 [ndbd] INFO     -- Watchdog: User time: 6406250  System time: 62031250
2010-09-10 17:42:37 [ndbd] WARNING  -- Watchdog: Warning overslept 406 ms, expected 100 ms.
2010-09-10 17:42:37 [ndbd] INFO     -- findNeighbours from: 4419 old (left: 3 right: 3) new (65535 65535)
2010-09-10 17:42:38 [ndbd] INFO     -- Arbitrator decided to shutdown this node
2010-09-10 17:42:38 [ndbd] INFO     -- QMGR (Line: 5532) 0x00000002
error=2305
2010-09-10 17:42:38 [ndbd] INFO     -- Error handler shutting down system
2010-09-10 17:42:38 [ndbd] INFO     -- Error handler shutdown completed - exiting
2010-09-10 17:42:38 [ndbd] ALERT    -- Node 2: Forced node shutdown completed.

Could anyone tell me the cause of this problem and how to solve it? Thank you!

How to repeat:
The problem recurs without any operation being performed, after the cluster has been running for several hours.
[16 Sep 2010 8:55] Hartmut Holzgraefe
Hard to tell what's actually wrong here without seeing the full cluster logs.

My guess would be heartbeat failures in network communication between the nodes,
but without seeing the management node's cluster log I can't verify that ...
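
While waiting for the cluster log, the ndbd output itself can at least be summarized. The following is only a rough sketch, not an official tool: it assumes the log lines are unwrapped and look exactly like the ones quoted in this report, and the function name scan_ndbd_log is made up for illustration. It collects the watchdog "overslept" warnings, the timerHandlingLab diffs, and the arbitrator shutdown timestamps so they can be compared against the management node's cluster log.

```python
import re
from datetime import datetime

TS = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"

# Patterns derived only from the ndbd output quoted in this report (assumption:
# other versions or platforms may format these messages differently).
OVERSLEPT = re.compile(
    r"^" + TS + r" \[ndbd\] WARNING\s+-- Watchdog: Warning overslept (\d+) ms"
)
TIMER_LAB = re.compile(r"^WARNING: timerHandlingLab now: \d+ sent: \d+ diff: (\d+)")
SHUTDOWN = re.compile(
    r"^" + TS + r" \[ndbd\] INFO\s+-- Arbitrator decided to shutdown this node"
)


def parse_ts(text):
    """Convert a 'YYYY-MM-DD HH:MM:SS' log timestamp to a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


def scan_ndbd_log(lines):
    """Hypothetical helper: summarize timing warnings and shutdowns.

    `lines` is any iterable of log lines (for example an open file).
    """
    result = {"overslept": [], "timer_diffs": [], "shutdowns": []}
    for line in lines:
        line = line.rstrip("\r\n")
        m = OVERSLEPT.match(line)
        if m:
            # (timestamp, milliseconds the watchdog reported as overslept)
            result["overslept"].append((parse_ts(m.group(1)), int(m.group(2))))
            continue
        m = TIMER_LAB.match(line)
        if m:
            # These lines carry no wall-clock timestamp, so keep the diff only.
            result["timer_diffs"].append(int(m.group(1)))
            continue
        m = SHUTDOWN.match(line)
        if m:
            result["shutdowns"].append(parse_ts(m.group(1)))
    return result
```

Passing an open log file to scan_ndbd_log and printing the result gives a quick view of whether oversleep warnings cluster just before each shutdown. It does not replace the cluster log, which is still needed to confirm or rule out missed heartbeats.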
[16 Oct 2010 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".