MySQL Bugs: #23739: Node commits suicide (Error 2303)

Bug #23739	Node commits suicide (Error 2303)
Submitted:	27 Oct 2006 21:36	Modified:	1 Dec 2006 12:01
Reporter:	Steve Wolf	Email Updates:
Status:	No Feedback	Impact on me:	None
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S1 (Critical)
Version:	5.0.24a	OS:	Linux (CentOS 4.4 x86_64)
Assigned to:		CPU Architecture:	Any
Tags:	GCP stop, killed, NDBCNTR

Description:
Four-node cluster died.  ndb_1_error.log contains:

Current byte-offset of file-pointer is: 568

Time: Friday 27 October 2006 - 19:19:20
Status: Temporary error, restart node
Message: System error, node killed during node restart by other node (Internal error, programming error or missing error message, please report a bug)
Error: 2303
Error data: Node 1 killed this node because GCP stop was detected
Error object: NDBCNTR (Line: 197) 0x0000000a
Program: ndbd
Pid: 5300
Trace: /usr/local/mysql/data/ndb_1_trace.log.1
Version: Version 5.0.24
***EOM***

The interesting thing is this node _is_ Node 1.  So it killed itself.  All the other nodes identically reported:

Current byte-offset of file-pointer is: 568

Time: Friday 27 October 2006 - 21:59:44
Status: Temporary error, restart node
Message: Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s) (Arbitration error)
Error: 2305
Error data: Arbitrator decided to shutdown this node
Error object: QMGR (Line: 4556) 0x0000000a
Program: ndbd
Pid: 5148
Trace: /usr/local/mysql/data/ndb_2_trace.log.1
Version: Version 5.0.24
***EOM***

I will attach the trace logs.

How to repeat:
Unknown

Suggested fix:
Unknown

This looks like

http://bugs.mysql.com/bug.php?id=20904

Compressed tar of log and trace files

Attachment: bug23739_logs.tar.gz (application/x-gzip, text), 180.82 KiB.

I did a search for "GCP stop" and didn't find the bug you reference -- thanks for providing it.  This could be the same, but on the other hand this could be related to the cluster getting close to full (in the management node log, it reports 90% full and then nodes start missing heartbeats).  Please have a look at the files I attached.  Thanks.

Now the cluster won't start up.  I got this on Node 2, which was acting as Master during the startup:

Time: Friday 27 October 2006 - 23:22:02
Status: Unknown
Message: No message slogan found (please report a bug if you get this error code) (Unknown)
Error: 0
Error data: We(2) have been declared dead by 1 reason: Hearbeat failure(4)
Error object: QMGR (Line: 2840) 0x0000000a
Program: ndbd
Pid: 5398
Trace: /usr/local/mysql/data/ndb_2_trace.log.3
Version: Version 5.0.24
***EOM***

Is this related, or should I open another bug report?

Please, try to repeat with a newer version, 5.0.27. In case of the same problem, please, open a new reports and mention this one there (if data uploaded are still relevant). This one will be closed as likely duplicate of bug #20904 then.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".