| Bug #15182 | Error 2310 when starting up the cluster | | |
|---|---|---|---|
| Submitted: | 23 Nov 2005 13:18 | Modified: | 10 Feb 2006 14:08 |
| Reporter: | Chris Kennedy | Email Updates: | |
| Status: | No Feedback | Impact on me: | |
| Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S1 (Critical) |
| Version: | 5.0.15 | OS: | Linux (Red Hat Enterprise Linux) |
| Assigned to: | | CPU Architecture: | Any |
[23 Nov 2005 13:18]
Chris Kennedy
[24 Nov 2005 7:27]
Tomas Ulin
We would need your logs and filesystem to analyze this: all ndb_* files and directories.

Also, did the restart of the first node really complete? You should not have gotten the "Node shutdown would cause system crash" error in that case; it indicates that the first node either failed to restart or had not finished restarting.

Regarding your later "filesystem" error, it is recoverable under certain conditions. The error message states "Ndbd file system error, restart node initial", i.e. if the other node has an intact filesystem, this node can recover from it by being started with "ndbd --initial". However, to determine whether that is possible in this case, we would need the logs mentioned above. We would also like to see your filesystem before you do this, to try to find out what the problem is.

BR, Tomas
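A minimal sketch of the recovery path described above, assuming the other data node in the node group is running with an intact filesystem; the node IDs and any connect string come from the cluster's own configuration and are only illustrative here:

```sh
# From the management client, confirm that the healthy data node is running
# and that only the failed node is down.
ndb_mgm -e "show"

# On the host of the failed data node only: wipe its local NDB filesystem and
# let it copy the data back from the surviving node in its node group.
ndbd --initial

# Watch the node move through the start phases until it reports started.
ndb_mgm -e "all status"
```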
[24 Nov 2005 8:26]
Chris Kennedy
files from node showing the error
Attachment: 20051123.tar.gz (application/x-gzip-compressed, text), 44.10 KiB.
[24 Nov 2005 8:33]
Chris Kennedy
ndb_mgm reported the node restart was complete:

    inm_mgr@jabba1.vfl.vodafone> ndb_mgm
    -- NDB Cluster -- Management Client --
    ndb_mgm> show
    Connected to Management Server at: localhost:1186
    Cluster Configuration
    ---------------------
    [ndbd(NDB)]     2 node(s)
    id=3    @127.0.0.1    (Version: 5.0.15, Nodegroup: 0)
    id=4    @10.15.1.172  (Version: 5.0.15, Nodegroup: 0, Master)

    [ndb_mgmd(MGM)] 1 node(s)
    id=9    @127.0.0.1    (Version: 5.0.15)

    [mysqld(API)]   11 node(s)
    id=20   @10.15.1.171  (Version: 5.0.15)
    id=21 (not connected, accepting connect from any host)
    id=22 (not connected, accepting connect from any host)
    id=23 (not connected, accepting connect from any host)
    id=24 (not connected, accepting connect from any host)
    id=25 (not connected, accepting connect from any host)
    id=26 (not connected, accepting connect from any host)
    id=27 (not connected, accepting connect from any host)
    id=28 (not connected, accepting connect from any host)
    id=29 (not connected, accepting connect from any host)
    id=30 (not connected, accepting connect from any host)

    ndb_mgm> 4 stop
    Node 4: Node shutdown aborted
    Shutdown failed.
    *  2002: Stop failed
    *        Node shutdown would cause system crash

I am attaching log and error files from the systems. I am afraid it will not be possible at this time to give you access to the file system, but I can forward any information you might need from it.
[24 Nov 2005 8:43]
Chris Kennedy
files from second node
Attachment: 20051123_a.tar.gz (application/x-gzip-compressed, text), 55.31 KiB.
[25 Nov 2005 8:45]
Jörg Nowak
I have the same bug in version 5.1.2-drop5p5 on SuSE 64-bit. In detail: I was doing some tests with the replication feature. For that I built two clusters, configured one mysqld in the first cluster as master and one mysqld in the second cluster as slave, and put some load on the cluster with the master. After some time, both ndbd nodes on the computer running the slave crashed. I was able to restart one of them, but the other does not restart:

    2005-11-25 09:11:17 [MgmSrvr] ALERT -- Node 4: Forced node shutdown completed. Occured during startphase 5. Initiated by signal 0. Caused by error 2809: 'Temporary on access to file(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'

I tried ndbd -d and ndbd --initial. Both fail.
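For context, a minimal sketch of the kind of cluster-to-cluster replication setup described above, using standard MySQL replication commands between one mysqld in each cluster; the hostnames, replication account, and server IDs below are assumptions for illustration, not details from this report:

```sh
# Master-side mysqld (first cluster): enable binary logging in my.cnf,
# e.g.  server-id=1  and  log-bin=mysql-bin,  then restart mysqld.

# Slave-side mysqld (second cluster): point it at the master and start replicating.
mysql -u root -e "CHANGE MASTER TO MASTER_HOST='cluster1-sql', MASTER_USER='repl', MASTER_PASSWORD='secret';"
mysql -u root -e "START SLAVE;"

# Verify replication is running before putting load on the master side.
mysql -u root -e "SHOW SLAVE STATUS\G"
```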
[10 Jan 2006 14:08]
Hartmut Holzgraefe
Jörg, can you add the node's error log and trace log files, too? The error message from the cluster log alone is not sufficient for further investigation, as it is missing some information.
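For reference, a sketch of how the requested files could be collected, assuming the default per-node file layout in the data node's DataDir; the DataDir path is an assumption, and the node ID (4) is taken from the log excerpt above:

```sh
# On the host of the failed data node, collect the per-node output, error,
# and trace files from its DataDir.
cd /var/lib/mysql-cluster   # assumed DataDir
tar czf ndb_node4_logs.tar.gz ndb_4_out.log ndb_4_error.log ndb_4_trace.log.*

# The cluster log written by the management node
# (ndb_<mgm-node-id>_cluster.log) is useful as well.
```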
[11 Feb 2006 0:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".