| Bug #69959 | unable to start data node | ||
|---|---|---|---|
| Submitted: | 8 Aug 2013 3:56 | Modified: | 8 Feb 2014 4:59 |
| Reporter: | Firzen Le | Email Updates: | |
| Status: | No Feedback | Impact on me: | |
| Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S3 (Non-critical) |
| Version: | 7.3.2 | OS: | Linux (centos 5.x) |
| Assigned to: | | CPU Architecture: | Any |
[8 Aug 2013 8:07]
Hartmut Holzgraefe
It would be interesting to see a core dump for this ... But if your node has already been down for several days, then doing an --initial restart of that node does no harm: its local checkpoints will be so out of date by now that it would do a full sync anyway ...
[10 Aug 2013 3:59]
Firzen Le
Hi Hartmut, thanks for replying. I'm going to try to enable the OS core dump file and then paste it in here. By the way, may I ask: if one day I have to use ndbd --initial on all data nodes, do I have any chance of getting my data back?
[10 Aug 2013 6:27]
Hartmut Holzgraefe
If you do a full initial restart ... well, better have a recent backup that you can restore from then, and binlogs enabled on mysqld nodes so that you can do point-in-time recovery from these for the span between time of the backup and time when cluster stopped working ...
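As a rough illustration of the "span between time of the backup and time when cluster stopped working", the sketch below builds a `mysqlbinlog` command line limited to that window. It only assembles the arguments and runs nothing. The binlog file name and both timestamps are hypothetical placeholders, and selecting the window by datetime is a simplifying assumption: a real recovery may need to work from binlog positions recorded at backup time instead.

```python
from datetime import datetime

def binlog_replay_command(binlog_files, backup_time, stop_time):
    """Build a mysqlbinlog argument list covering backup_time..stop_time.

    The output of the resulting command would normally be piped into the
    mysql client on an SQL node after the backup has been restored.
    """
    if stop_time <= backup_time:
        raise ValueError("stop_time must be later than backup_time")
    fmt = "%Y-%m-%d %H:%M:%S"
    return [
        "mysqlbinlog",
        "--start-datetime=" + backup_time.strftime(fmt),
        "--stop-datetime=" + stop_time.strftime(fmt),
        *binlog_files,
    ]

# Hypothetical values, for illustration only.
command = binlog_replay_command(
    ["binlog.000001"],
    backup_time=datetime(2013, 8, 1, 0, 0, 0),
    stop_time=datetime(2013, 8, 8, 3, 56, 0),
)
print(" ".join(command))
```

Building the command as a list keeps the datetime values, which contain spaces, as single arguments if the list is later passed to `subprocess.run`.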
[8 Jan 2014 4:59]
MySQL Verification Team
Hello Firzen, Thank you for the report. I couldn't reproduce the reported issue. Is this issue still repeatable (with the latest GA, 7.3.3)? Also, could you please attach the cluster logs? Preferably using the ndb_error_reporter utility: http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-programs-ndb-error-reporter.html Thanks, Umesh
[9 Feb 2014 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".

Description:
I have a cluster with 4 data nodes, 2 replicas, 2 management nodes and 1 SQL node. I stopped a data node for a few days and kept the cluster running. After that I tried to start the node and got this error:

"Node 13: Forced node shutdown completed. Occured during startphase 5. Initiated by signal 11. Caused by error 6000: 'Error OS signal received(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'."

If I use ndbd --initial, the node can start, but as far as I know --initial causes data loss and is not safe except in some special cases (e.g. a version upgrade).

Here's my config.ini:

    [ndbd default]
    # Options affecting ndbd processes on all data nodes:
    datadir=/var/lib/mysql_ndbd/data # Directory for this data node's data files
    NoOfReplicas=2 # Number of replicas
    DataMemory=2000M # How much memory to allocate for data storage
    IndexMemory=512M # How much memory to allocate for index storage
    # For DataMemory and IndexMemory, we have used the
    # default values. Since the "world" database takes up
    # only about 500KB, this should be more than enough for
    # this example Cluster setup.
    MaxNoOfTables=3000
    MaxNoOfAttributes=30000
    DiskPageBufferMemory=128M
    MaxNoOfOrderedIndexes= 512
    MaxNoOfConcurrentOperations=250000 # 800000
    MaxNoOfLocalOperations=270000

    [tcp default]
    # TCP/IP options:
    portnumber=2202 # This the default; however, you can use any
    # port that is free for all the hosts in the cluster
    # Note: It is recommended that you do not specify the port
    # number at all and simply allow the default value to be used
    # instead

    [ndb_mgmd]
    # Management process options:
    NodeId=1
    hostname=192.168.1.51 # Hostname or IP address of MGM node
    datadir=/var/lib/mysql-cluster # Directory for MGM node log files
    LogDestination=FILE:filename=cluster.log,maxsize=1000000,maxfiles=6

    [ndb_mgmd]
    # Management process options:
    NodeId=2
    hostname=192.168.1.55 # Hostname or IP address of MGM node
    datadir=/var/lib/mysql-cluster # Directory for MGM node log files
    LogDestination=FILE:filename=cluster.log,maxsize=1000000,maxfiles=6

    [ndbd]
    NodeId=11
    NodeGroup=0
    hostname=192.168.1.51 # Hostname or IP address

    [ndbd]
    NodeId=12
    NodeGroup=0
    hostname=192.168.1.52 # Hostname or IP address

    [ndbd]
    NodeId=13
    NodeGroup=1
    hostname=192.168.1.54 # Hostname or IP address

    [ndbd]
    NodeId=14
    NodeGroup=1
    hostname=192.168.1.59 # Hostname or IP address

    [mysqld]
    # SQL node options:
    NodeId=41
    hostname=192.168.1.51 # Hostname or IP address

    [mysqld]
    NodeId=42
    hostname=192.168.1.56

    [mysqld]

Does anyone have any idea about this? Thanks in advance.

How to repeat:
Stop a data node for a few days. Continue running the cluster (including importing new data). Start the node again.
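To make the layout in the config.ini easier to see, here is a minimal sketch that reads the `[ndbd]` sections, groups the data nodes by `NodeGroup`, and reports which node in the same group holds the other replica while node 13 is down. The embedded text is a trimmed copy of the reporter's config. The function names are made up for this illustration, and the small hand-written parser is there because the file repeats the `[ndbd]` section name, which Python's `configparser` rejects in its default strict mode.

```python
from collections import defaultdict

# Trimmed copy of the config.ini from the description (data node parts only).
CONFIG_TEXT = """
[ndbd default]
NoOfReplicas=2 # Number of replicas

[ndbd]
NodeId=11
NodeGroup=0

[ndbd]
NodeId=12
NodeGroup=0

[ndbd]
NodeId=13
NodeGroup=1

[ndbd]
NodeId=14
NodeGroup=1
"""

def parse_sections(text):
    """Return a list of (section_name, options) pairs, keeping repeated sections."""
    sections = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1].strip().lower(), {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip().lower()] = value.strip()
    return sections

def node_groups(sections):
    """Map each NodeGroup number to the list of data node IDs in it."""
    groups = defaultdict(list)
    for name, options in sections:
        if name == "ndbd":
            groups[int(options["nodegroup"])].append(int(options["nodeid"]))
    return dict(groups)

def surviving_peers(groups, down_node):
    """Return the other nodes in the same node group as down_node."""
    for members in groups.values():
        if down_node in members:
            return [node for node in members if node != down_node]
    raise ValueError(f"node {down_node} is not a data node in this config")

sections = parse_sections(CONFIG_TEXT)
replicas = int(dict(sections)["ndbd default"]["noofreplicas"])
groups = node_groups(sections)

print(groups)
print(surviving_peers(groups, 13))
```

With this config, each node group has exactly `NoOfReplicas` members, and node 14 is the only other member of node 13's group, so it is the node that node 13 would have to resynchronize from when it rejoins.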