Bug #40312 | Node restart ends with error 2303 as copyfrag failed | ||
---|---|---|---|
Submitted: | 24 Oct 2008 15:00 | Modified: | 16 Sep 2014 8:45 |
Reporter: | Michael Neubert | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S1 (Critical) |
Version: | mysql-5.1-telco-6.3 | OS: | Linux |
Assigned to: | | CPU Architecture: | Any |
Tags: | copyfrag failed, error 2303, mysql-5.1.28 ndb-6.3.18-RC, node restart |
[24 Oct 2008 15:00]
Michael Neubert
[24 Oct 2008 15:03]
Michael Neubert
Trace log
Attachment: ndb_4_trace.log.rar (application/octet-stream, text), 69.15 KiB.
[26 Oct 2008 22:25]
Michael Neubert
Hello,

with mysql-5.1.28 ndb-6.2.16-RC the problem seems to be the same, but we got a new kind of error code: 744 - Character string is invalid for given character set.

Time: Sunday 26 October 2008 - 22:39:37
Status: Temporary error, restart node
Message: System error, node killed during node restart by other node (Internal error, programming error or missing error message, please report a bug)
Error: 2303
Error data: Killed by node 5 as copyfrag failed, error: 744
Error object: NDBCNTR (Line: 247) 0x0000000a
Program: ndbd
Pid: 6574
Trace: /var/log/mysql/ndb_2_trace.log.3
Version: mysql-5.1.28 ndb-6.2.16-RC
***EOM***

See also attached trace log.

Best wishes
Michael
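For readers comparing the error log entries quoted in this report, the following is a minimal Python sketch that splits one entry into its labelled fields. The label list is an assumption taken only from the entries quoted here (`Time`, `Status`, `Message`, `Error`, `Error data`, `Error object`, `Program`, `Pid`, `Trace`, `Version`); newer entries that use other labels would need the list extended. The function name `parse_entry` is hypothetical and not part of any MySQL tool.

```python
import re

# Field labels as they appear in the ndbd error log entries quoted in this
# report (an assumption: other versions may use different labels).
LABELS = ("Time", "Status", "Message", "Error data", "Error object", "Error",
          "Program", "Pid", "Trace", "Version")
LABEL_RE = re.compile(r"\b(" + "|".join(re.escape(l) for l in LABELS) + r"):\s*")

def parse_entry(text):
    """Split one error log entry into a {label: value} dict.

    Works whether the entry is on one line or spread over several lines,
    because values are delimited by the next known label, not by newlines.
    """
    body = text.split("***EOM***", 1)[0]  # drop the end-of-message marker
    matches = list(LABEL_RE.finditer(body))
    fields = {}
    for current, following in zip(matches, matches[1:] + [None]):
        end = following.start() if following else len(body)
        fields[current.group(1)] = body[current.end():end].strip()
    return fields

# The entry quoted in the comment above, flattened onto one line.
SAMPLE = (
    "Time: Sunday 26 October 2008 - 22:39:37 "
    "Status: Temporary error, restart node "
    "Message: System error, node killed during node restart by other node "
    "(Internal error, programming error or missing error message, please report a bug) "
    "Error: 2303 "
    "Error data: Killed by node 5 as copyfrag failed, error: 744 "
    "Error object: NDBCNTR (Line: 247) 0x0000000a "
    "Program: ndbd Pid: 6574 "
    "Trace: /var/log/mysql/ndb_2_trace.log.3 "
    "Version: mysql-5.1.28 ndb-6.2.16-RC ***EOM***"
)

if __name__ == "__main__":
    for label, value in parse_entry(SAMPLE).items():
        print(f"{label}: {value}")
```

Matching is case-sensitive on purpose, so the lowercase `error: 744` inside the `Error data` value is not mistaken for the `Error` field.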
[26 Oct 2008 22:36]
Michael Neubert
trace log
Attachment: ndb_2_trace.log.rar (application/octet-string, text), 93.57 KiB.
[3 Nov 2008 15:31]
Frazer Clement
Hi Michael, Thanks for your bug report. The files you sent are encoded in RAR format, and the free unrar tool I downloaded to decompress them does not work. Could you resend the files encoded with zip, gzip or uncompressed? Thanks, Frazer
[30 Nov 2008 5:03]
nelson mendaros
logs and trace logs
Attachment: ndb4logs.zip (application/zip, text), 168.97 KiB.
[30 Nov 2008 5:06]
nelson mendaros
Hi! We have the same problem restarting our node. I restored the data using ndb_restore, shut down the cluster after the restore completed successfully, and then restarted all the nodes. Below are the commands, messages and error log.

ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=3 @10.0.0.12 (mysql-5.1.27 ndb-6.3.17, starting, Nodegroup: 0, Master)
id=4 @10.0.0.5 (mysql-5.1.27 ndb-6.3.17, not started)

[ndb_mgmd(MGM)] 2 node(s)
id=1 (not connected, accepting connect from 10.0.0.23)
id=2 (mysql-5.1.27 ndb-6.3.17)

[mysqld(API)] 3 node(s)
id=5 (not connected, accepting connect from 10.0.0.25)
id=6 (not connected, accepting connect from 10.0.0.11)
id=7 (not connected, accepting connect from any host)

ndb_mgm> Node 3: Started (version 6.3.17)
ndb_mgm> 4 start
Database node 4 is being started.
ndb_mgm> Node 4: Start initiated (version 6.3.17)
ndb_mgm> Node 4: Forced node shutdown completed. Occured during startphase 5. Caused by error 2303: 'System error, node killed during node restart by other node(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

=============

[root@dtodb2 data]# tail ndb_4_error.log -n100
Current byte-offset of file-pointer is: 568

Time: Sunday 30 November 2008 - 12:37:12
Status: Temporary error, restart node
Message: System error, node killed during node restart by other node (Internal error, programming error or missing error message, please report a bug)
Error: 2303
Error data: Killed by node 4 as copyfrag failed, error: 1501
Error object: NDBCNTR (Line: 249) 0x0000000a
Program: ndbd
Pid: 6520
Trace: /var/lib/mysql/data/ndb_4_trace.log.1
Version: mysql-5.1.27 ndb-6.3.17-RC
***EOM***

I've previously attached the log and trace log file.
[8 Dec 2008 0:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[8 Dec 2008 12:06]
Frazer Clement
Changing to Analyzing as alternative logs have been provided.
[8 Dec 2008 13:47]
Frazer Clement
Nelson, Thanks for your trace file. The error number mentioned in the logs you sent (1501) is related to running out of Undo space for disk-based tables while performing the restart. It has already been noted that this problem is not well reported (Bug#30655). I suspect that you may need to increase the configured Undo space to successfully restart this cluster. I think that this is a separate issue to the original issue which sparked this bug. Could you please open a separate bug if you wish to pursue the issue further? Thanks, Frazer
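Since the copyfrag sub-error is what distinguishes the cases in this report, here is a small Python sketch that pulls the killing node and the sub-error code out of an `Error data` line. The descriptions in the lookup table come only from comments in this thread (744 from the reporter's comment, 1501 from the analysis above); the table is an illustration, not an official or complete NDB error list, and `describe_copyfrag` is a hypothetical helper name.

```python
import re

# Sub-error descriptions as stated in this bug thread only (not a complete
# or authoritative list of NDB error codes).
KNOWN_COPYFRAG_ERRORS = {
    744: "Character string is invalid for given character set",
    1501: "Out of Undo space for disk-based tables (see Bug#30655)",
}

COPYFRAG_RE = re.compile(r"Killed by node (\d+) as copyfrag failed, error: (\d+)")

def describe_copyfrag(error_data):
    """Return (node_id, sub_error, description), or None if the text
    does not contain a copyfrag failure message."""
    match = COPYFRAG_RE.search(error_data)
    if match is None:
        return None
    node_id = int(match.group(1))
    sub_error = int(match.group(2))
    description = KNOWN_COPYFRAG_ERRORS.get(sub_error, "not discussed in this thread")
    return node_id, sub_error, description

if __name__ == "__main__":
    print(describe_copyfrag("Error data: Killed by node 4 as copyfrag failed, error: 1501"))
```

Separating the sub-error this way makes it easier to see that reports sharing the outer error 2303 can have different underlying causes.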
[8 Dec 2008 13:49]
Frazer Clement
Michael, Could you resend/repost the log files in a format other than RAR (zip, gzip, tar etc.)? My RAR reader cannot decompress the files you attached. Thanks, Frazer
[9 Jan 2009 18:18]
Michael Neubert
trace log as zip file
Attachment: ndb_4_trace.log.zip (application/x-zip-compressed, text), 121.07 KiB.
[12 Mar 2009 14:12]
Jonathan Miller
This is probably due to a resource shortage. Can you please try the GA version 5.1.30_6.3.20?
[13 Mar 2009 14:49]
Michael Neubert
Hello, I'm sorry, but we don't use the Cluster Storage Engine anymore for the mentioned project, so no further information or tests are possible. Best wishes Michael
[1 Apr 2009 15:05]
paul miles
Trace and log files
Attachment: ndb.zip (application/x-zip-compressed, text), 146.45 KiB.
[1 Apr 2009 15:05]
paul miles
I'm running the following versions of MySQL Cluster:

MySQL-Cluster-gpl-server-6.3.20-0.rhel5
MySQL-Cluster-gpl-management-6.3.20-0.rhel5
MySQL-Cluster-gpl-tools-6.3.20-0.rhel5
MySQL-Cluster-gpl-devel-6.3.20-0.rhel5
MySQL-Cluster-gpl-storage-6.3.20-0.rhel5
MySQL-Cluster-gpl-client-6.3.20-0.rhel5

I'm experiencing what looks like an identical issue to Michael. Please see above trace and log files.

Thanks,
Paul
[20 May 2009 10:48]
Nathan Thera
Hi,

I am also experiencing the copyfrag 744 error. I have one disconnected node that is unable to start. Passing the initial flag into ndbd results in the same error. The cluster is a 4 node, 2 replica setup running mysql-5.1.32 ndb-7.0.5-beta. Please see below for output from ndb_mgm and the node error log. The trace files and a snippet of the out log are attached.

Nathan

From ndb_mgm:

Node 6: Forced node shutdown completed. Occured during startphase 5. Caused by error 2303: 'System error, node killed during node restart by other node(Internal error, programming error or missing error message, please report a bug). Temporary error, rest

From the error log (ndb_6_error.log):

Time: Wednesday 20 May 2009 - 04:44:06
Status: Temporary error, restart node
Message: System error, node killed during node restart by other node (Internal error, programming error or missing error message, please report a bug)
Error: 2303
Error data: Killed by node 6 as copyfrag failed, error: 744
Error object: NDBCNTR (Line: 260) 0x00000008
Program: ndbd
Pid: 10113
Trace: /var/lib/mysql-cluster/ndb_6_trace.log.15
Version: mysql-5.1.32 ndb-7.0.5-beta
***EOM***
[20 May 2009 10:50]
Nathan Thera
Snippet of ndb_6_out.log
Attachment: ndb_6_out.log (application/octet-stream, text), 21.66 KiB.
[20 May 2009 10:53]
Nathan Thera
zip file containing the trace log for the node
Attachment: ndb_6_trace.log.zip (application/x-zip-compressed, text), 97.41 KiB.
[2 Aug 2009 0:18]
Nathan Thera
Error still happening in mysql-5.1.34 ndb-7.0.6. Logs below. Let me know if any additional logs are needed.

Nathan

Mgm:

2009-08-01 18:13:39 [MgmSrvr] ALERT -- Node 5: Forced node shutdown completed. Occured during startphase 5. Caused by error 2303: 'System error, node killed during node restart by other node(Internal error, programming error or missing error message, please report a bug). Temporary error, rest

Error file:

Time: Saturday 1 August 2009 - 18:12:29
Status: Temporary error, restart node
Message: System error, node killed during node restart by other node (Internal error, programming error or missing error message, please report a bug)
Error: 2303
Error data: Killed by node 5 as copyfrag failed, error: 744
Error object: NDBCNTR (Line: 260) 0x00000008
Program: ndbd
Pid: 14476
Trace: /var/lib/mysql-cluster/ndb_5_trace.log.10
Version: mysql-5.1.34 ndb-7.0.6
***EOM***
[2 Aug 2009 0:19]
Nathan Thera
Trace file attached to specific error
Attachment: ndb_5_trace.log.zip (application/x-zip-compressed, text), 92.16 KiB.
[2 Aug 2009 0:20]
Nathan Thera
snippet of the node out log
Attachment: ndb_5_out.log.snippet (text/plain), 85.23 KiB.
[23 Sep 2009 19:55]
Matthew Bilek
I get this error all of the time too. See bug #46985 for configuration information.
[24 Oct 2009 12:51]
Nathan Thera
Still in 7.0.8a. It just happened for a new machine that was added to the existing cluster (all nodes running 7.0.8a).
Attachment: ndb_5_logs-7.0.8a.zip (application/x-zip-compressed, text), 91.78 KiB.
[16 Sep 2014 8:45]
Gustaf Thorslund
Old and pre-GA version, so closing the bug. If the same error shows up with a newer version, please open a new bug.
[30 Aug 2018 14:44]
James Mo
Time: Thursday 30 August 2018 - 21:31:32
Status: Temporary error, restart node
Message: System error, node killed during node restart by other node (Internal error, programming error or missing error message, please report a bug)
Error: 2303
Error data: Killed by node 21 as copyfrag failed, error: 0
Error object: NDBCNTR (Line: 295) 0x00000004
Program: ndbmtd
Pid: 167431 thr: 0
Version: mysql-5.7.21 ndb-7.5.9
Trace file name: ndb_21_trace.log.14
Trace file path: /app/mysql/ndbd//ndb_21_trace.log.14 [t1..t5]
***EOM***