MySQL Bugs: #69201: ndbmtd fails to start with Error 2341 after ThreadConfig changed

Bug #69201	ndbmtd fails to start with Error 2341 after ThreadConfig changed
Submitted:	10 May 2013 20:57	Modified:	19 May 2016 12:06
Reporter:	Justin Ryan
Status:	Can't repeat
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	7.2.12	OS:	Linux (CentOS 6.4 x64)
Assigned to:	MySQL Verification Team	CPU Architecture:	Any
Tags:	2341, ndbrequire, threadconfig

Description:
Starting ndbmtd with --initial and new ThreadConfig and NoOfFragmentLogParts settings fails with:

Forced node shutdown completed. Occured during startphase 5. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

I have tried the following workarounds:

 - upgraded the whole cluster from 7.2.10 to 7.2.12
 - restarted different data nodes
 - started ndbmtd as root instead of as the mysql user
 - started ndbmtd with and without `numactl --interleave=all`
 - stopped the data node through ndb_mgm, then restarted it from the data node
 - shut down the cluster fully, restarted with the original config, then changed the config and attempted a rolling restart
 - have NOT tried a full shutdown, initial restart, and restore from backup

How to repeat:
1. Cluster running with MaxNoOfExecutionThreads = 8 and default NoOfFragmentLogParts (4)

2. Edit config.ini with the following changes, then restart the mgmd services:

-MaxNoOfExecutionThreads = 8
+ThreadConfig = ldm={count=12,cpubind=0,1,2,8,9,10,16,17,18,24,25,26},tc={count=7,cpubind=3,4,11,12,19,20,21},send={count=3,cpubind=5,13,22},recv={count=3,cpubind=6,14,23},main={cpubind=27},io={cpubind=27},rep={cpubind=28}
+NoOfFragmentLogParts = 12

3. Restart one data node with `14 restart -i` in ndb_mgm. The data node log then shows:

2013-05-10 16:24:05 [ndbd] INFO     -- /pb2/build/sb_0-8660699-1363118778.75/rpm/BUILD/mysql-cluster-gpl-7.2.12/mysql-cluster-gpl-7.2.12/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp
2013-05-10 16:24:05 [ndbd] INFO     -- DBLQH (Line: 8960) 0x00000006
2013-05-10 16:24:05 [ndbd] INFO     -- Error handler shutting down system
2013-05-10 16:24:05 [ndbd] INFO     -- Error handler shutdown completed - exiting
2013-05-10 16:24:19 [ndbd] ALERT    -- Node 14: Forced node shutdown completed. Occured during startphase 5. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
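
Before suspecting the server, it can help to rule out a simple mismatch in the new settings. The following is a minimal sketch, not an NDB tool: it parses the ThreadConfig value from step 2 and applies two assumed rules of thumb, namely that each `cpubind` list has one CPU per thread and that NoOfFragmentLogParts equals the LDM thread count. Both rules, the default of one thread when `count` is omitted, and the function names are assumptions made for illustration; check them against the manual for the version in use. Only plain comma-separated CPU IDs, as used in this report, are handled.

```python
import re

# ThreadConfig value copied from step 2 of this report.
THREAD_CONFIG = (
    "ldm={count=12,cpubind=0,1,2,8,9,10,16,17,18,24,25,26},"
    "tc={count=7,cpubind=3,4,11,12,19,20,21},"
    "send={count=3,cpubind=5,13,22},"
    "recv={count=3,cpubind=6,14,23},"
    "main={cpubind=27},io={cpubind=27},rep={cpubind=28}"
)


def parse_thread_config(value):
    """Return {thread_type: {"count": int, "cpubind": [int, ...]}}."""
    result = {}
    for name, body in re.findall(r"(\w+)=\{([^}]*)\}", value):
        params = {}
        key = None
        for token in body.split(","):
            token = token.strip()
            if "=" in token:
                # "count=12" or "cpubind=0" starts a new parameter.
                key, _, first = token.partition("=")
                params[key] = [first]
            elif key is not None:
                # A bare number continues the previous parameter's list.
                params[key].append(token)
        # Assumption: a thread type without an explicit count has one thread.
        count = int(params["count"][0]) if "count" in params else 1
        cpus = [int(c) for c in params.get("cpubind", [])]
        result[name] = {"count": count, "cpubind": cpus}
    return result


def check_thread_config(value, log_parts):
    """Return a list of human-readable warnings (empty if none)."""
    issues = []
    cfg = parse_thread_config(value)
    for name, entry in cfg.items():
        # Heuristic only: one bound CPU per thread.
        if entry["cpubind"] and len(entry["cpubind"]) != entry["count"]:
            issues.append(
                f"{name}: count={entry['count']} but "
                f"{len(entry['cpubind'])} CPUs in cpubind"
            )
    ldm = cfg.get("ldm", {}).get("count", 0)
    # Assumed rule: NoOfFragmentLogParts should equal the LDM thread count.
    if ldm and log_parts != ldm:
        issues.append(f"NoOfFragmentLogParts={log_parts} but ldm count={ldm}")
    return issues


if __name__ == "__main__":
    for problem in check_thread_config(THREAD_CONFIG, 12) or ["no issues found"]:
        print(problem)
```

Under these assumptions the settings from step 2 pass both checks, which is consistent with the failure being in the data node rather than in the configuration.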

Suggested fix:
Unknown

Attached the error, out, mgmd, and trace logs.

Attachment: ndb_logs-69201.tgz (application/octet-stream, text), 869.61 KiB.

The attachment also contains config.ini.

The failing ndbrequire is at DblqhMain.cpp:8960:

  ndbrequire(tcPtr->activeCreat == Fragrecord::AC_NORMAL);
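
The file and line quoted here come from the data node log lines in "How to repeat". As a rough sketch of pulling those fields out of such a log automatically, the following matches only the three line shapes that appear in this report. The regular expressions and the `summarize_failure` helper are illustrative assumptions, not part of any NDB utility, and other versions may format these lines differently.

```python
import posixpath
import re

# Log lines copied from this report.
SAMPLE_LOG = """\
2013-05-10 16:24:05 [ndbd] INFO     -- /pb2/build/sb_0-8660699-1363118778.75/rpm/BUILD/mysql-cluster-gpl-7.2.12/mysql-cluster-gpl-7.2.12/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp
2013-05-10 16:24:05 [ndbd] INFO     -- DBLQH (Line: 8960) 0x00000006
2013-05-10 16:24:05 [ndbd] INFO     -- Error handler shutting down system
2013-05-10 16:24:05 [ndbd] INFO     -- Error handler shutdown completed - exiting
2013-05-10 16:24:19 [ndbd] ALERT    -- Node 14: Forced node shutdown completed. Occured during startphase 5. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
"""

# A log line whose message is just a source file path.
SOURCE_RE = re.compile(r"-- (\S+\.(?:cpp|hpp))\s*$")
# Kernel block name and line number, e.g. "DBLQH (Line: 8960)".
BLOCK_RE = re.compile(r"-- (\w+) \(Line: (\d+)\)")
# The message spells "Occured" this way, so the pattern must too.
SHUTDOWN_RE = re.compile(
    r"Node (\d+): Forced node shutdown completed\. "
    r"Occured during startphase (\d+)\. Caused by error (\d+)"
)


def summarize_failure(log_text):
    """Collect source file, block, line, node, start phase and error code."""
    summary = {}
    for line in log_text.splitlines():
        match = SOURCE_RE.search(line)
        if match:
            summary["source_file"] = posixpath.basename(match.group(1))
            continue
        match = BLOCK_RE.search(line)
        if match:
            summary["block"] = match.group(1)
            summary["line"] = int(match.group(2))
            continue
        match = SHUTDOWN_RE.search(line)
        if match:
            summary["node"] = int(match.group(1))
            summary["start_phase"] = int(match.group(2))
            summary["error"] = int(match.group(3))
    return summary


if __name__ == "__main__":
    print(summarize_failure(SAMPLE_LOG))
```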

Reproduced with the provided config on 7.2.12. Cannot reproduce on up-to-date versions of MySQL Cluster.