MySQL Bugs: #74503: Dict operations during TAKEOVER may crash new master

Bug #74503	Dict operations during TAKEOVER may crash new master
Submitted:	22 Oct 2014 11:29	Modified:	4 Nov 2014 18:32
Reporter:	Ole John Aske
Status:	Closed
Category:	MySQL Cluster: Disk Data	Severity:	S1 (Critical)
Version:	7.1.33	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
When a node acting as a DICT master fails, the arbitrator will select another node to take over the DICT master responsibility. The takeover procedure involves cleaning up any schema transactions which were still open when the master failed.

During this takeover period the outcome of the still open schema transaction is decided: it would normally be rolled back, but if it has completed a sufficient amount of a 'commit' request, the new master will complete the commit processing. Until the fate of the transaction has been decided, we have to hold back any TRANS_END_REQs from the clients.
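
The hold-back rule described above can be sketched as a small state machine. This is an illustrative model only, not the NDB kernel code: 'Fate', 'decide_fate' and 'TakeoverGate' are hypothetical names, and "a sufficient amount of a commit" is reduced to a single boolean because the report does not say where the threshold lies.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Illustrative model only; none of these names exist in the NDB sources.
enum class Fate { Undecided, RollBack, RollForward };

// Assumption: commit progress is collapsed into one flag.
inline Fate decide_fate(bool commit_far_enough) {
  return commit_far_enough ? Fate::RollForward : Fate::RollBack;
}

class TakeoverGate {
public:
  // Returns true if the request was handled at once, false if it was held back.
  bool on_trans_end_req(int client) {
    if (fate_ == Fate::Undecided) {
      held_.push_back(client);
      return false;
    }
    handled_.push_back(client);
    return true;
  }

  // Called once the new master has decided; releases held requests in arrival order.
  void on_fate_decided(Fate fate) {
    assert(fate != Fate::Undecided);
    fate_ = fate;
    while (!held_.empty()) {
      handled_.push_back(held_.front());
      held_.pop_front();
    }
  }

  Fate fate() const { return fate_; }
  const std::vector<int>& handled() const { return handled_; }

private:
  Fate fate_ = Fate::Undecided;
  std::deque<int> held_;      // TRANS_END_REQs waiting for the decision
  std::vector<int> handled_;  // requests processed so far, in order
};
```

In this model a request that arrives before the decision is queued rather than rejected, so the client sees only a delay.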

Furthermore, the DICT implementation does not support multiple concurrent schema transactions. Thus, the above takeover cleanup has to be completed before any new transactions can be started.

A similar restriction also applies to any schema operations which are done in the scope of an open schema transaction: The transaction's 'SafeCounter m_counter' is used to coordinate the different schema operation steps across all nodes. It is used both during the takeover processing and when executing any 'non local' schema operations. Thus, starting a schema operation while its schema transaction is in the takeover phase will cause m_counter to be garbled by the two concurrent users, and the outcome is unpredictable.
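
The hazard with a shared reply counter can be illustrated with a counter that admits only one user at a time. This is a sketch under assumptions, not the real SafeCounter interface: 'ExclusiveCounter' and 'CounterUser' are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a per-transaction reply counter; not the NDB SafeCounter.
enum class CounterUser { None, Takeover, SchemaOp };

class ExclusiveCounter {
public:
  // Start waiting for `nodes` replies. Fails if another user already owns the counter.
  bool begin(CounterUser who, std::uint32_t nodes) {
    if (user_ != CounterUser::None || nodes == 0) return false;
    user_ = who;
    outstanding_ = nodes;
    return true;
  }

  // Record one reply. Returns true when the last reply arrives and frees the counter.
  bool reply() {
    if (user_ == CounterUser::None) return false;
    if (--outstanding_ == 0) {
      user_ = CounterUser::None;
      return true;
    }
    return false;
  }

  bool busy() const { return user_ != CounterUser::None; }

private:
  CounterUser user_ = CounterUser::None;
  std::uint32_t outstanding_ = 0;
};
```

With such a guard, a schema operation that arrives while takeover owns the counter is refused up front instead of silently overwriting the number of outstanding replies.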

The scenarios described above are normally hidden by a pseudo-random ~100ms delay in the retry logic in NdbDictInterface::dictSignal() when it recovers from a node failure. Normally this is sufficient to let the takeover complete without any new requests arriving in the vulnerable phase. However, there are no guarantees without explicit checking for this, and we are seeing random failures in this code from time to time (AutoTest).

How to repeat:
Reduce/remove the retry delay in ::dictSignal():

=== modified file 'storage/ndb/src/ndbapi/NdbDictionaryImpl.cpp'
--- storage/ndb/src/ndbapi/NdbDictionaryImpl.cpp	revid:ole.john.aske@oracle.com-20141021143311-wsphxp6k4rtj2bus
+++ storage/ndb/src/ndbapi/NdbDictionaryImpl.cpp	2014-10-22 08:18:52 +0000
@@ -2358,7 +2358,7 @@
 
   for(Uint32 i = 0; i<RETRIES; i++)
   {
-    if (i > 0)
+    if (i > 50)
     {
       Uint32 t = sleep + 10 * (rand() % mod);

Then run the AutoTest 'testDict -n schemaTrans'
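
To see why this patch exposes the race, the delay arithmetic and the loop guard from the diff can be isolated into two small functions. This is a sketch, not the code in NdbDictionaryImpl.cpp: the function names are hypothetical, and the random value is passed in as a parameter so the result is deterministic.

```cpp
#include <cassert>
#include <cstdint>

// Same arithmetic as `sleep + 10 * (rand() % mod)` in the diff.
// `rnd` stands in for rand(); `mod` must be non-zero.
inline std::uint32_t retry_delay_ms(std::uint32_t sleep_ms, std::uint32_t mod,
                                    std::uint32_t rnd) {
  return sleep_ms + 10 * (rnd % mod);
}

// True if retry attempt `i` sleeps before sending, given the guard `i > threshold`.
inline bool sleeps_before_attempt(std::uint32_t i, std::uint32_t threshold) {
  return i > threshold;
}
```

With the original guard (threshold 0) every retry after the first attempt sleeps. With the patched guard (threshold 50) attempts 1 through 50 are sent back to back, so requests can reach the new master while takeover is still running. Any concrete values passed for 'sleep_ms' and 'mod' are placeholders; the report does not state them.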

Posted by developer:
 
Note: A 4-node config is required in order to run all the testcases in 'testDict -n schemaTrans'

Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

Documented fix in the NDB 7.1.34, 7.2.19, and 7.3.8 changelogs, as follows:

When a node acting as a DICT master fails, the arbitrator
selects another node to take over in place of the failed node.
During the takeover procedure, which includes cleaning up any
schema transactions which are still open when the master failed,
the disposition of the uncommitted schema transaction is
decided. Normally this transaction is rolled back, but if it has
completed a sufficient portion of a commit request, the new
master finishes processing the commit. Until the fate of the
transaction has been decided, no new TRANS_END_REQ messages from
clients can be processed. In addition, since multiple concurrent
schema transactions are not supported, takeover cleanup must be
completed before any new transactions can be started.

A similar restriction applies to any schema operations which are
performed in the scope of an open schema transaction. The
counter used to coordinate schema operations across all nodes is
employed both during takeover processing and when executing any
non-local schema operations. This means that starting a schema
operation while its schema transaction is in the takeover phase
causes this counter to be overwritten by concurrent uses, with
unpredictable results.

The scenarios just described were previously handled using a
pseudo-random delay when recovering from a node failure. Now we
check before the new master has rolled forward or backwards any
schema transactions remaining after the failure of the previous
master and avoid starting new schema transactions or performing
operations using old transactions until takeover processing has
cleaned up after the abandoned transaction.

Closed.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

http://dev.mysql.com/doc/en/installing-source.html