MySQL Bugs: #54594: Data node can get in a hung state

Bug #54594	Data node can get in a hung state
Submitted:	17 Jun 2010 19:22	Modified:	12 Nov 2016 21:03
Reporter:	Andrew Hutchings
Status:	Can't repeat
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	mysql-5.1-telco-6.3	OS:	Any
Assigned to:	Assigned Account	CPU Architecture:	Any

Description:
It is possible for a node to get into a hung state during shutdown.  This has been observed when using StopOnError=0 and a node fails with an "Internal node state conflict" error.

The fix for bug#53246 may serve as a workaround, but the root cause probably still needs to be found.

How to repeat:
.

Spent some time trying to reproduce with StopOnError = 0, error-injected Node state failure return code etc.

Going by the output available in the captured logs etc, it seems that the Node state failure return code was recorded in the Node's error log, but not in the stdout or cluster logs.

Based on the ErrorReporter class, this would suggest some sort of problem between the writeMessage() call which writes the Error log and Trace files, and the following g_eventLogger->info() calls. The error log and trace files seem to be successfully written, but there are no console entries. This suggests that perhaps some part of the g_eventLogger->info() call failed.

One possibility is that the logger mutex (See Logger.cpp, m_mutex) is permanently held by some other thread. This could result in the symptoms seen :
- Error and trace file written, no console log entry
- No watchdog warnings (Watchdog also uses EventLogger to output warnings and calls same NdbShutdown path when killing ndbd)

The Logger class encapsulates its use of the mutex reasonably well using block-structured locking via the Guard class.
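As a rough illustration of what block-structured locking means here, the following is a minimal sketch and not the NDB source. ScopedGuard and SketchLogger are hypothetical stand-ins, and std::mutex is used in place of the NDB mutex type. The point is that the lock is released whenever the block is left, so a permanently held mutex implies that some call inside a guarded block never returned.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the Guard idea: lock in the constructor,
// unlock in the destructor, so every exit path from the block unlocks.
class ScopedGuard {
public:
    explicit ScopedGuard(std::mutex& m) : m_mutex(m) { m_mutex.lock(); }
    ~ScopedGuard() { m_mutex.unlock(); }
    ScopedGuard(const ScopedGuard&) = delete;
    ScopedGuard& operator=(const ScopedGuard&) = delete;
private:
    std::mutex& m_mutex;
};

// Hypothetical logger; it only records lines in memory for illustration.
class SketchLogger {
public:
    void info(const std::string& msg) {
        ScopedGuard g(m_mutex);            // held only for this block
        m_lines.push_back("INFO " + msg);  // anything that blocks here keeps the lock held
    }
    std::size_t lineCount() {
        ScopedGuard g(m_mutex);
        return m_lines.size();
    }
private:
    std::mutex m_mutex;
    std::vector<std::string> m_lines;
};
```

With this structure the mutex cannot be leaked by a forgotten unlock, which is why the remaining suspects are a blocked call made while the guard is alive, or a lifetime problem with the mutex itself.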

This suggests that either :
1) Some internal call made by the logger has blocked, causing this lock to be held onto
- Perhaps ConsoleLogHandler's use of ndbout, or FileLogHandler's use of the FS.
- Where are the ndbd node's output files stored - locally or over some sort of shared storage?
- e.g. NFS could feasibly block the calling thread for some time in error scenarios
2) Perhaps some startup/shutdown dependency has been broken
- e.g.
- trying to lock a Mutex which has not yet been initialised?
- trying to lock a Mutex which has been deleted?
I would expect these to result in a SEGV etc. However, if the failure occurs in signal-handling or error-handling code, then a SEGV may not in itself resolve matters.
Looking at the g_eventLogger initialisation, it's part of ndb_init() which ndbd calls as its first step, so it seems that it cannot be uninitialised when QMGR decides to call ProgError.
One alternative is that it has already been released, but it's not clear how that could happen.
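Possibility 1 can be illustrated with a small sketch, under the assumption that a handler is stuck in a blocking write while the logger mutex is held. g_log_mutex and both function names are hypothetical, and std::timed_mutex is used only so that the second thread can give up and report the result instead of hanging as a plain lock would.

```cpp
#include <chrono>
#include <future>
#include <mutex>

// Hypothetical stand-in for the logger mutex; not the NDB Logger class.
std::timed_mutex g_log_mutex;

// Tries, from a separate thread, to take the logger mutex within `wait`.
// Returns true only if that thread acquired it.
bool other_thread_can_log(std::chrono::milliseconds wait) {
    auto attempt = std::async(std::launch::async, [wait] {
        if (g_log_mutex.try_lock_for(wait)) {
            g_log_mutex.unlock();
            return true;
        }
        return false;
    });
    return attempt.get();
}

// Simulates a log handler that is stuck (e.g. in a blocking write) while
// holding the mutex, and reports whether another thread is shut out.
bool stalled_handler_blocks_others() {
    std::lock_guard<std::timed_mutex> held(g_log_mutex);  // the "stuck" handler
    return !other_thread_can_log(std::chrono::milliseconds(50));
}
```

In the real process the other thread would use an untimed lock and simply never return, which would match the observation that neither the console entries nor the watchdog warnings appeared.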

Suggested next steps :
1) Find out if there's anything weird about where the ndbds log events (syslog, files, ?). Nothing obvious in config.ini.
2) Suggest that if this can be observed again then we need to see the stack traces of the frozen processes to debug further. Not sure how to get the stack trace of a running process on Solaris.
3) In the meantime we really should methodically ensure that the Watchdog has as few shared dependencies with the rest of the code as possible. I think it would be better to interleave writes to log files, or always log to console, if it avoids the Watchdog falling into the same pit as the ndbd. Friendly log messages can be maintained as long as they do not impede the progress of the Watchdog towards killing the ndbd. Additionally, the case where the watchdog detects ndbd failure during ndbd-shutdown should probably be treated as a special case.
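One way step 3 could look is sketched below. This is an assumption about a possible design, not how the Watchdog is implemented: g_shared_log_mutex, LogPath, watchdog_log and the message text are all hypothetical. The watchdog waits only briefly for the shared logger and otherwise writes straight to stderr, accepting interleaved output so that it can carry on to its kill step.

```cpp
#include <chrono>
#include <cstdio>
#include <future>
#include <mutex>
#include <string>

// Hypothetical stand-in for the shared logger mutex (not NDB's m_mutex).
std::timed_mutex g_shared_log_mutex;

enum class LogPath { SharedLogger, DirectStderr };

// Watchdog-side logging: wait at most `max_wait` for the shared logger,
// then fall back to an unlocked write to stderr and return to the caller.
LogPath watchdog_log(const std::string& msg, std::chrono::milliseconds max_wait) {
    std::unique_lock<std::timed_mutex> lock(g_shared_log_mutex, max_wait);
    if (lock.owns_lock()) {
        std::fprintf(stdout, "[watchdog] %s\n", msg.c_str());
        return LogPath::SharedLogger;
    }
    // Output may interleave with other writers, but the watchdog is not stuck.
    std::fprintf(stderr, "[watchdog, logger busy] %s\n", msg.c_str());
    return LogPath::DirectStderr;
}

// Simulates the hung case: this thread holds the logger mutex while the
// watchdog runs on another thread.
LogPath watchdog_log_while_logger_stuck() {
    std::lock_guard<std::timed_mutex> stuck(g_shared_log_mutex);
    return std::async(std::launch::async, watchdog_log,
                      std::string("ndbd not responding"),
                      std::chrono::milliseconds(20)).get();
}
```

The trade-off is the one described in the step itself: the fallback message is less tidy, but it does not impede the watchdog's progress towards killing the ndbd.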

I cannot reproduce this any more with any of the modern, supported MySQL Cluster versions. Big changes have been made to the start and stop procedures since 6.3 in all 7.x versions, especially 7.3 and 7.4. A major fix for nodes getting into a hung state on start/stop was also added in 7.1, so this is not reproducible in any of the current versions.

all best
Bogdan