Bug #56545 "start process -a" caused all data nodes to die and fail to restart
Submitted: 3 Sep 2010 17:18
Modified: 6 Oct 2010 8:08
Reporter: Andrew Morgan
Status: Can't repeat
Category: MySQL Cluster Manager: Agent
Severity: S2 (Serious)
Version: 1.1
OS: Linux
Assigned to:
CPU Architecture: Any

[3 Sep 2010 17:18] Andrew Morgan
Description:
Note that this test was run on 6 Fedora VMs under VirtualBox, all hosted on a single Fedora machine.

I ran these commands while traffic was running:

mysql> add process --processhosts=ndbd@192.168.0.35,ndbd@192.168.0.36 mycluster;
+------------------------------+
| Command result               |
+------------------------------+
| Processes added successfully |
+------------------------------+
1 row in set (2 min 16.18 sec)

mysql>  show status -r mycluster;
+------+----------+--------------+---------+-----------+
| Id   | Process  | Host         | Status  | Nodegroup |
+------+----------+--------------+---------+-----------+
| 1    | ndb_mgmd | 192.168.0.31 | running |           |
| 2    | ndbd     | 192.168.0.33 | running | 0         |
| 3    | ndbd     | 192.168.0.34 | running | 0         |
| 4    | mysqld   | 192.168.0.31 | running |           |
| 5    | mysqld   | 192.168.0.32 | running |           |
| 6    | ndbd     | 192.168.0.35 | added   | n/a       |
| 7    | ndbd     | 192.168.0.36 | added   | n/a       |
+------+----------+--------------+---------+-----------+
7 rows in set (0.90 sec)

mysql> start process -a mycluster;
ERROR 7006 (00MGR): Process id 6 (ndbd) terminated unexpectedly: proc_info->exitstatus=65280 proc_info->lastoutput=`2010-09-03 18:05:10 [ndbd] INFO     -- Error handler shutdown completed - exiting
' proc_report->errormsg=`process ospid 1938 exited, exitcode 65280'
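
A note on reading the error above: exitstatus=65280 is 0xFF00, which looks like a raw POSIX wait status rather than a plain exit code. That is an assumption; the report does not say how MCM fills proc_info->exitstatus. If it holds, ndbd exited normally (not killed by a signal) with exit code 255. A minimal Python sketch of the decoding, using the usual Linux bit layout and a hypothetical helper name, ignoring the stopped/continued cases:

```python
def decode_wait_status(status: int) -> dict:
    """Split a 16-bit wait status into its parts (Linux layout assumed).

    Hypothetical helper for illustration; not part of MCM or MySQL Cluster.
    """
    signal = status & 0x7F             # low 7 bits: terminating signal, 0 if none
    core_dumped = bool(status & 0x80)  # bit 7: core dump flag (signal case only)
    exit_code = (status >> 8) & 0xFF   # next 8 bits: code passed to exit()

    if signal == 0:
        # Normal termination via exit() or return from main()
        return {"exited": True, "exit_code": exit_code,
                "signal": None, "core_dumped": False}
    # Terminated by a signal; the exit code field is not meaningful
    return {"exited": False, "exit_code": None,
            "signal": signal, "core_dumped": core_dumped}


if __name__ == "__main__":
    # Value taken from the ERROR 7006 message in this report
    print(decode_wait_status(65280))
```

On Linux, Python's os.WEXITSTATUS() and os.WIFEXITED() perform the same decoding.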

I will attach the log files. Core dumps are available if required.

How to repeat:
Follow the steps above.

Suggested fix:
Identify whether the issue lies in Cluster or in MCM and fix it, or confirm that the environment is at fault.
[3 Sep 2010 17:40] Andrew Morgan
The machine that was running the VMs seems to have a disk problem; this bug should be suspended until it is reconfirmed on a healthy system.
[3 Oct 2010 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".