Bug #67034 Error reporting when node attempting to join NDB cluster
Submitted: 1 Oct 2012 11:57
Reporter: Brian Hobson
Status: Open
Category: MySQL Server: Errors
Severity: S4 (Feature request)
Version: 7.x
OS: Linux (RHEL 5/6)
Assigned to:
CPU Architecture: Any
Tags: ndb error state

[1 Oct 2012 11:57] Brian Hobson
Description:
In my cluster I have occasionally seen NDBD crash and then fail to start. The only ways I have of knowing that an error occurred are noticing that the node never gets past the "Starting" state in ndb_mgm -e show, or tailing ndb_4_out.log, where I occasionally see something similar to "Error 2341: Internal program error".

It would be very useful if there were a way for NDBD to indicate that an error has occurred during startup. I have tried tools such as ndb_waiter and ndb_mgm, which do a great job of showing the state of the nodes at a given time. However, I am trying to script something that will alert someone when an NDBD error is occurring, not just that a node is "Starting" or "Not Connected". An "Error" state or something similar would make it easier to automate error handling and recovery when a node fails to start.
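
Until such a state exists, one workaround I can imagine is to treat "has been Starting for too long" as a proxy for an error. The following is only a sketch under assumptions: the function name, the sample format, and the threshold are hypothetical, the state strings are the ones quoted above, and how the samples are collected (for example by polling ndb_mgm -e show) is left out because it depends on output parsing not covered here.

```python
def stuck_nodes(samples, max_starting_seconds):
    """Return the ids of nodes that look stuck in the "Starting" state.

    samples: iterable of (timestamp_seconds, node_id, state) tuples in
             chronological order (hypothetical format, collected elsewhere).
    A node counts as stuck if, as of its latest sample, it has been
    continuously "Starting" for longer than max_starting_seconds.
    """
    starting_since = {}  # node_id -> timestamp of first consecutive "Starting"
    last_seen = {}       # node_id -> timestamp of the latest sample
    for ts, node_id, state in samples:
        last_seen[node_id] = ts
        if state == "Starting":
            # Keep the earliest timestamp of the current "Starting" streak.
            starting_since.setdefault(node_id, ts)
        else:
            # Any other state ends the streak.
            starting_since.pop(node_id, None)
    return sorted(
        node_id
        for node_id, since in starting_since.items()
        if last_seen[node_id] - since > max_starting_seconds
    )
```

This only says that a node is taking suspiciously long; it cannot tell a slow start from a failed one, which is why a real "Error" state would still be preferable.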

Parsing the ndb_?_out.log file is the only way I've found to detect the specific error, and it is not really feasible to monitor it constantly.
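
For reference, the kind of log parsing I mean looks roughly like the sketch below. It assumes error lines have the shape "Error <number>: <message>", modelled only on the message quoted in the description; the real log format may differ, and the function name is hypothetical.

```python
import re

# Assumed line shape, based solely on the quoted "Error 2341: Internal program error".
ERROR_RE = re.compile(r"Error (\d+): (.+)")


def find_ndbd_errors(lines):
    """Return (code, message) pairs for every line that looks like an NDBD error."""
    found = []
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            found.append((int(match.group(1)), match.group(2).strip()))
    return found
```

A script would have to open each node's ndb_?_out.log, pass the lines to find_ndbd_errors, and keep track of what it has already reported, which is exactly the bookkeeping I would like to avoid.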

How to repeat:
N/A

Suggested fix:
Add a way to monitor for an error condition on nodes that are part of a cluster.