MySQL Bugs: #73015: Node restarts while another node is already starting may crash 'master'

Bug #73015	Node restarts while another node is already starting may crash 'master'
Submitted:	16 Jun 2014 11:32	Modified:	20 Jun 2014 15:19
Reporter:	Ole John Aske	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S1 (Critical)
Version:	7.3.6	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:

This is a regression introduced by the fix for bug#16007980

DATA NODE STUCK IN PHASE 1 WHEN OTHER NODE LOSES NETWORK

This regression introduced a situation where a node might
crash while handling the Ndbcntr::execNODE_FAILREP signal:

Thread 1 (Thread 0x7f614cfc8700 (LWP 6950)):
#0  clear (this=0x1e3a790, signal=0x7f614cfbd240) at /space/autotest/build/clone-mysql-5.5-cluster-7.2-2014-06-15.18159/storage/ndb/include/util/Bitmask.hpp:332
#1  clear (this=0x1e3a790, signal=0x7f614cfbd240) at /space/autotest/build/clone-mysql-5.5-cluster-7.2-2014-06-15.18159/storage/ndb/include/util/Bitmask.hpp:1178
#2  clear (this=0x1e3a790, signal=0x7f614cfbd240) at /space/autotest/build/clone-mysql-5.5-cluster-7.2-2014-06-15.18159/storage/ndb/include/util/Bitmask.hpp:1185
#3  Ndbcntr::execNODE_FAILREP (this=0x1e3a790, signal=0x7f614cfbd240) at /space/autotest/build/clone-mysql-5.5-cluster-7.2-2014-06-15.18159/storage/ndb/src/kernel/blocks/ndbcntr/NdbcntrMain.cpp:2071
#4  0x00000000006fd78a in executeFunction (selfptr=0x7f614e23b360, q=<value optimized out>, h=<value optimized out>, r=<value optimized out>, sig=0x7f614cfbd240, max_signals=100, signalIdCounter=0x7f614cfc7dbc) at /space/autotest/build/clone-mysql-5.5-cluster-7.2-2014-06-15.18159/storage/ndb/src/kernel/vm/SimulatedBlock.hpp:1069
#5  execute_signals (selfptr=0x7f614e23b360, q=<value optimized out>, h=<value optimized out>, r=<value optimized out>, sig=0x7f614cfbd240, max_signals=100, signalIdCounter=0x7f614cfc7dbc) at /space/autotest/build/clone-mysql-5.5-cluster-7.2-2014-06-15.18159/storage/ndb/src/kernel/vm/mt.cpp:3689
#6  0x00000000006fdb57 in run_job_buffers (selfptr=0x7f614e23b360, sig=0x7f614cfbd240, signalIdCounter=0x7f614cfc7dbc) at /space/autotest/build/clone-mysql-5.5-cluster-7.2-2014-06-15.18159/storage/ndb/src/kernel/vm/mt.cpp:3774
#7  0x0000000000700138 in mt_job_thread_main (thr_arg=0x7f614e23b360) at /space/autotest/build/clone-mysql-5.5-cluster-7.2-2014-06-15.18159/storage/ndb/src/kernel/vm/mt.cpp:4500
#8  0x00000000006aad9e in ndb_thread_wrapper (_ss=0x1ca0960) at /space/autotest/build/clone-mysql-5.5-cluster-7.2-2014-06-15.18159/storage/ndb/src/common/portlib/NdbThread.c:201
#9  0x00007f61503af851 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f614f33b11d in clone () from /lib64/libc.so.6

---> signal
NdbcntrMain.cpp      02941 

--------------- Signal ----------------
r.bn: 251 "NDBCNTR", r.proc: 2, r.sigId: 227112 gsn: 26 "NODE_FAILREP" prio: 1
s.bn: 252 "QMGR", s.proc: 2, s.sigId: 227108 length: 5 trace: 8 #sec: 0 fragInf: 0
 H'00000003 H'00000002 H'00000002 H'00000009 H'00000000

The crash happens in the following code:

...........
   Uint32 nodeId = 0;
   while(!allFailed.isclear()){
     nodeId = allFailed.find(nodeId+1);
     allFailed.clear(nodeId);   << Crash
     signal->theData[1] = nodeId;
     sendSignal(CMVMI_REF, GSN_EVENT_REP, signal, 3, JBB);
   }//for
..........

Further debugging shows that the bit corresponding to nodeId=0
has been set in the allFailed BitMask which is sent as part
of the NODE_FAILREP signal. This is an illegal nodeId, and 
should never have been set (garbage?).
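
One plausible reading of why a set bit 0 breaks this loop, sketched with toy types rather than the real NDB classes: the search always starts at nodeId+1, so bit 0 can never be found or cleared, isclear() never becomes true, and find() eventually reports "not found", a value that is then handed to clear(). ToyNodeMask, kMaxNodes, kNotFound and replayLoop are hypothetical names used only for illustration; the behaviour of the real Bitmask.hpp is assumed here, not quoted.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; not the real NdbNodeBitmask API.
constexpr uint32_t kMaxNodes = 8;
constexpr uint32_t kNotFound = ~0u;  // assumed "no bit found" marker

struct ToyNodeMask {
  std::bitset<kMaxNodes> bits;

  bool isclear() const { return bits.none(); }

  // First set bit at position >= from, or kNotFound.
  uint32_t find(uint32_t from) const {
    for (uint32_t i = from; i < kMaxNodes; ++i)
      if (bits.test(i)) return i;
    return kNotFound;
  }

  // Precondition: n is a valid bit position.
  void clear(uint32_t n) {
    assert(n < kMaxNodes);
    bits.reset(n);
  }
};

// Replays the loop from the report on a copy of the mask.
// Returns the node ids that would have been reported, and sets
// hitInvalid when the loop would have called clear(kNotFound).
std::vector<uint32_t> replayLoop(ToyNodeMask allFailed, bool& hitInvalid) {
  std::vector<uint32_t> reported;
  hitInvalid = false;
  uint32_t nodeId = 0;
  while (!allFailed.isclear()) {
    nodeId = allFailed.find(nodeId + 1);
    if (nodeId == kNotFound) {  // the original loop has no such check
      hitInvalid = true;
      break;
    }
    allFailed.clear(nodeId);
    reported.push_back(nodeId);
  }
  return reported;
}
```

With bits 0 and 3 set, the sketch reports node 3 and then flags the invalid clear(); with only bits 2 and 3 set, it terminates normally.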

How to repeat:
./testNodeRestart -n Bug42422 -l 1 T1

Posted by developer:
 
Adding some text from the commit comments that may be helpful
when later documenting this fix:
...............

      A regression was introduced in the fix for bug#16007980
                                                                                     
        DATA NODE STUCK IN PHASE 1 WHEN OTHER NODE LOSES NETWORK                     
                                                                                     
      That fix added the following code to Qmgr::failReportLab()                     
                                                                                     
      +  /**                                                                         
      +   * If any node is starting now (c_start.startNode != 0)                     
      +   *   sendPrepFailReq to that too                                            
      +   */                                                                         
      +  if (c_start.m_startNode != 0)                                               
      +  {                                                                           
      +    jam();                                                                    
      +    cfailedNodes[cnoFailedNodes++] = c_start.m_startNode;                     
      +    c_start.reset();                                                          
      +  }                                                                           
                                                                                     
      However, we could already have been notified about the                         
      failure of the same node through any of the other 'channels'                   
      which handle failures or disconnects. Thus, we could end                       
      up with duplicates of the same NodeId in cfailedNodes[].                       
                                                                                     
      Later, this 'list of nodes' is converted into a BitMask                        
      used in the PREP_FAILREQ-signal, and converted back into                       
      a 'list of nodes' by Qmgr::execPREP_FAILREQ(). During this                     
      BitMask conversion, the duplicated NodeId is eliminated.                       
      However, **the 'noOfNodes' count is kept unchanged**.                          
      Thus we end up with a materialized 'list of nodes' where                       
      the size is off-by-one, and the last item contains garbage.
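
The round trip described in the commit comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the Qmgr code: packNodes, unpackTrustingCount, unpackUsingMaskCount and kMaxNodes are hypothetical names, the unfilled slot is zero-initialised here where real code would hold garbage, and the last function shows only one possible guard, which is not necessarily how the actual fix works.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint32_t kMaxNodes = 8;  // hypothetical, for illustration only

// Sender side: pack the list of failed nodes into a bitmask.
// A duplicated node id collapses into a single bit.
std::bitset<kMaxNodes> packNodes(const std::vector<uint32_t>& nodes) {
  std::bitset<kMaxNodes> mask;
  for (uint32_t n : nodes) {
    assert(n < kMaxNodes);
    mask.set(n);
  }
  return mask;
}

// Receiver side: rebuild the list while trusting the separately
// transmitted count. If the count is larger than the number of set
// bits, the trailing slots are never filled (0 here, garbage in
// uninitialised storage).
std::vector<uint32_t> unpackTrustingCount(const std::bitset<kMaxNodes>& mask,
                                          uint32_t noOfNodes) {
  std::vector<uint32_t> out(noOfNodes, 0);
  uint32_t idx = 0;
  for (uint32_t n = 0; n < kMaxNodes && idx < noOfNodes; ++n)
    if (mask.test(n)) out[idx++] = n;
  return out;
}

// One possible guard: derive the count from the mask itself.
std::vector<uint32_t> unpackUsingMaskCount(const std::bitset<kMaxNodes>& mask) {
  return unpackTrustingCount(mask, static_cast<uint32_t>(mask.count()));
}
```

For the list {3, 3}, the mask has a single bit set, yet unpacking with the original count of 2 yields a two-element list whose last entry was never written. Deriving the count from the mask yields just {3}.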

Documented fix as follows in the NDB 7.1.32, 7.2.17, and 7.3.6 changelogs:

        Processing a NODE_FAILREP signal that contained an invalid node
        ID could cause a data node to fail. Regression of BUG#16007980.
      

Closed.

Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html