Bug #81510 Data Node crashes when attempting to restart with error 2341
Submitted: 19 May 2016 15:03 Modified: 14 Nov 2016 22:44
Reporter: Andrew Blackmore
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S1 (Critical)
Version: 7.4.11 OS: Ubuntu (14.04)
Assigned to: MySQL Verification Team CPU Architecture: Any

[19 May 2016 15:03] Andrew Blackmore
Description:
I have a MySQL Cluster with 2 data nodes. I stopped one of the data nodes to perform some system upgrades. When I attempt to restart it, the node completes most of the restart process and then crashes when it is almost done.

I have tried restarting several times, including with the --initial option. The data nodes are running ndbmtd. The crashing data node writes the following to its error log:

Time: Thursday 19 May 2016 - 09:08:50
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbtcMain.cpp
Error object: DBTC (Line: 19392) 0x00000002
Program: ndbmtd
Pid: 1853 thr: 8
Version: mysql-5.6.29 ndb-7.4.11
Trace: /usr/local/mysql/data/ndb_2_trace.log.7 [t1..t11]

How to repeat:
N/A
[24 May 2016 4:46] MySQL Verification Team
Line number indicates the following section of code:

void
Dbtc::executeFKChildTrigger(Signal* signal,
                            TcDefinedTriggerData* definedTriggerData,
                            TcFiredTriggerData* firedTriggerData,
                            ApiConnectRecordPtr* transPtr,
                            TcConnectRecordPtr* opPtr)
{
  Ptr<TcFKData> fkPtr;
  // TODO make it a pool.getPtr() instead
  // by also adding fk_ptr_i to definedTriggerData
  ndbrequire(c_fk_hash.find(fkPtr, definedTriggerData->fkId));  // <-- error raised here

  switch (firedTriggerData->triggerEvent) {
  case(TriggerEvent::TE_INSERT):
    jam();
    /**
     * Check that after values exists in parent table
     */
    fk_readFromParentTable(signal, firedTriggerData, transPtr, opPtr, fkPtr.p);
    break;
  case(TriggerEvent::TE_UPDATE):
    jam();
    /**
     * Check that after values exists in parent table
     */
    fk_readFromParentTable(signal, firedTriggerData, transPtr, opPtr, fkPtr.p);
    break;
  default:
    ndbrequire(false);
  }
}
[24 May 2016 4:48] MySQL Verification Team
This appears to be a known bug, as described below.

The fix is documented as follows in the NDB 7.3.14, 7.4.12, and 7.5.2 changelogs:
 
    During a node restart, re-creation of internal triggers used to
    verify the referential integrity of foreign keys was not
    reliable, due to the fact that not all distributed TC and LDM
    instances agreed on trigger identities. To fix this problem, an
    extra step is added to the node restart sequence, during which
    the trigger identities are determined from the current master
    node.
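
To make the changelog entry concrete, here is a toy C++ model of the failure mode. It is only a sketch under stated assumptions and is not NDB code: Instance, canHandle, and adoptTriggerIdentities are hypothetical names, and the two containers stand in loosely for the per-instance trigger definitions and for c_fk_hash. It shows how a node that numbers its triggers differently from the master fails the lookup that ndbrequire guards, and how taking the identities from the master makes the lookup succeed.

```cpp
#include <cstdint>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-ins for illustration only; these are not NDB types.
using TriggerId = std::uint32_t;
using FkId = std::uint32_t;

struct Instance {
  // Which foreign key each locally defined trigger belongs to.
  std::unordered_map<TriggerId, FkId> triggerToFk;
  // Foreign keys this instance knows about (loosely, the role of c_fk_hash).
  std::unordered_set<FkId> knownFks;

  // True if a fired trigger can be resolved to a known foreign key.
  // In the real code path, a failed lookup trips ndbrequire and the node exits.
  bool canHandle(TriggerId fired) const {
    auto it = triggerToFk.find(fired);
    return it != triggerToFk.end() && knownFks.count(it->second) > 0;
  }
};

// Models the extra restart step from the changelog: the restarting node
// takes its trigger identities from the master instead of assigning its own.
void adoptTriggerIdentities(Instance& restarting, const Instance& master) {
  restarting.triggerToFk = master.triggerToFk;
}
```

In this model, a restarted instance that registered the trigger under a different identity than the master cannot resolve a trigger fired under the master's identity until adoptTriggerIdentities has been called.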
[9 Sep 2016 12:29] Filip Kryspin
Hi,

I'm running MySQL Cluster: MySQL-Cluster-server-gpl-7.4.8-1.el7.x86_64

I have the same error, but it happens 5 minutes after the node has started.
Please see the log below:

2016-09-09 14:14:29 [ndbd] INFO     -- Start phase 101 completed
2016-09-09 14:14:29 [ndbd] INFO     -- Phase 101 was used by SUMA to take over responsibility for sending some of the asynchronous ch
2016-09-09 14:14:29 [ndbd] INFO     -- Node started
2016-09-09 14:20:15 [ndbd] INFO     -- /export/home2/pb2/build/sb_0-16730888-1444652131.27/rpm/BUILD/mysql-cluster-gpl-7.4.8/mysql-cluster-gpl-7.4.8/storage/ndb/src/kernel/blocks/dbtc/DbtcMain.cpp
2016-09-09 14:20:15 [ndbd] INFO     -- DBTC (Line: 19292) 0x00000002
2016-09-09 14:20:15 [ndbd] INFO     -- Error handler shutting down system
2016-09-09 14:20:15 [ndbd] INFO     -- Error handler shutdown completed - exiting
2016-09-09 14:20:15 [ndbd] DEBUG    -- Angel got child 45083
2016-09-09 14:20:15 [ndbd] DEBUG    -- error: 2341, signal: 0, sphase: 255
2016-09-09 14:20:15 [ndbd] ALERT    -- Node 4: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
[9 Sep 2016 12:56] Andrew Blackmore
Filip,

I would recommend upgrading to the newest version of MySQL Cluster. As Jonathon stated, the issue that I encountered in 7.4.11 was fixed in 7.4.12.
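
For anyone checking whether a given release already carries the fix, the version thresholds from the changelog (7.3.14, 7.4.12, 7.5.2) can be expressed as a small C++ helper. This is a rough sketch: NdbVersion, kFirstFixed, and hasTriggerIdentityFix are names made up for illustration, and the handling of release series not named in the changelog is an assumption, not something the changelog states.

```cpp
#include <array>

// {major, minor, build}, for example {7, 4, 11} for NDB 7.4.11.
using NdbVersion = std::array<int, 3>;

// First release in each series that carries the fix, per the changelog
// quoted earlier in this report.
constexpr NdbVersion kFirstFixed[] = {{7, 3, 14}, {7, 4, 12}, {7, 5, 2}};

bool hasTriggerIdentityFix(const NdbVersion& v) {
  for (const NdbVersion& fixed : kFirstFixed) {
    // Same release series: compare the build number against the threshold.
    if (v[0] == fixed[0] && v[1] == fixed[1]) {
      return v[2] >= fixed[2];
    }
  }
  // ASSUMPTION: series newer than 7.5 include the fix and series older
  // than 7.3 do not. The changelog does not say this explicitly.
  return v > kFirstFixed[2];
}
```

With these thresholds, both versions mentioned in this report (7.4.11 and 7.4.8) come out as not fixed, while 7.4.12 comes out as fixed.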
[14 Nov 2016 22:44] MySQL Verification Team
Thank you for your bug report. This issue has already been fixed in the latest released version of that product, which you can download at

  http://www.mysql.com/downloads/