MySQL Bugs: #71008: MySql cluster data node stopped and does not start

Bug #71008	MySql cluster data node stopped and does not start
Submitted:	26 Nov 2013 10:44	Modified:	22 Mar 2016 12:40
Reporter:	Janick Bernet
Status:	Not a Bug
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	7.3.5	OS:	Linux (Ubuntu 12.04)
Assigned to:	MySQL Verification Team	CPU Architecture:	Any

Description:
It regularly happens that one of the data nodes stops and is unable to rejoin a simple cluster with 2 data nodes (2 mysql nodes, 1 mgmt node). The log shows the following:

Time: Monday 25 November 2013 - 20:09:08
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbtupRoutines.cpp
Error object: DBTUP (Line: 728) 0x00000002
Program: ndbmtd
Pid: 1930 thr: 2
Version: mysql-5.6.11 ndb-7.3.2
Trace: /data/mysqlcluster//ndb_4_trace.log.1 [t1..t4]
***EOM***

Although the error is reported as temporary, the node cannot be started again except by wiping it with --initial.

How to repeat:
I don't know the exact circumstances, but it happens regularly.

Ubuntu version is 12.04 LTS.

For those who might not see it: there is a private NDB error report attached.

The issue still persists after upgrading to Cluster 7.3.5.

Check your data node memory usage. In practice it is significantly greater than DataMemory + IndexMemory, and a normal restart seems to use more memory than an initial start.

Check the following Ndbd_mem_manager message in the data node's out log, although the figure it reports also appears to be understated. Allow at least 1 GB of additional memory for the OS and ndbd on top of the "initial" value.

grep Ndbd_mem_manager /usr/local/mysql/data/ndb_1_out.log
2014-06-10 21:49:10 [ndbd] INFO     -- Ndbd_mem_manager::init(1) min: 6892Mb initial: 7020Mb
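
The headroom rule above can be sketched in Python. This is only an illustration, not part of any MySQL tooling: the function names parse_mem_manager_line and suggested_min_ram_mb are hypothetical, the regular expression assumes the log line has exactly the shape shown above, and the 1024 MB default is simply the "minimum of 1GB" suggested in this comment.

```python
import re

# Matches the "min: <N>Mb initial: <N>Mb" part of an Ndbd_mem_manager::init line.
# The pattern is an assumption based on the single log line quoted above.
_MEM_RE = re.compile(
    r"Ndbd_mem_manager::init\(\d+\)\s+min:\s*(\d+)Mb\s+initial:\s*(\d+)Mb"
)


def parse_mem_manager_line(line):
    """Return (min_mb, initial_mb) from a log line, or None if it does not match."""
    match = _MEM_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def suggested_min_ram_mb(initial_mb, headroom_mb=1024):
    """Add headroom for the OS and ndbd on top of the reported 'initial' value."""
    return initial_mb + headroom_mb


line = ("2014-06-10 21:49:10 [ndbd] INFO     -- "
        "Ndbd_mem_manager::init(1) min: 6892Mb initial: 7020Mb")
min_mb, initial_mb = parse_mem_manager_line(line)
print(suggested_min_ram_mb(initial_mb))  # 7020 + 1024 = 8044
```

For the line quoted above this gives 8044 MB as a rough lower bound for physical RAM on the data node host. Since the reported figure seems understated, treat the result as a floor rather than a target.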

Use the following setting to keep the data node's memory from being swapped out. You may then have to add RAM or decrease DataMemory to avoid the process being killed with signal 9 by the OOM killer.

LockPagesInMainMemory=1
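
For context, a minimal sketch of where this parameter would normally live, assuming the usual layout in which data node defaults are kept in the [ndbd default] section of the cluster's config.ini on the management node. The rest of the file is omitted, and your section layout may differ.

```
# config.ini (management node), fragment only
[ndbd default]
# 1 = lock the data node process memory into RAM after it has been allocated
LockPagesInMainMemory=1
```

A change like this typically only takes effect once the management server has reloaded its configuration and the data nodes have been restarted.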

Hi,
Sorry for the very late reply, but it looks like your cluster is simply poorly configured for your hardware. If you are still having the problem, let us know. It is best to contact MySQL Cluster support for proper configuration of your system, but if you have new crash logs you can upload them here.

kind regards
Bogdan Kecman