MySQL Bugs: #72513: Full cluster crash during normal usage

Bug #72513	Full cluster crash during normal usage
Submitted:	2 May 2014 9:23	Modified:	1 Mar 2016 14:59
Reporter:	auesifnbiu sosdfnosinf
Status:	Not a Bug
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S1 (Critical)
Version:	mysql-5.6.14 ndb-7.3.3	OS:	Linux (CentOS 6.4 - 2.6.32-358.23.2.el6.x86_64)
Assigned to:	MySQL Verification Team	CPU Architecture:	Any

Description:
We see one or both data nodes crash whenever we run a big table import (2m+ records). This happens every single time.

We also recently saw both data nodes suddenly shut down during normal usage.

Our app uses the (deprecated) PHP 'mysql' module, and for management we use the latest MySQL Workbench.

How to repeat:
Import a database where one table has 2m+ records.

partial logs and configs

Attachment: 2014-05-02.txt (text/plain), 16.16 KiB.

Hi,

I only see 3 types of crashes in your log:

Message: Memory allocation failure, please decrease some configuration parameters (Configuration error)
Error: 2327
Error data: DBTUP could not allocate memory for Fragrecord
Error object: Requested: 288x2147511844 = 618483411072 bytes

Message: Fatal error due to end of REDO log. Increase NoOfFragmentLogFiles or FragmentLogFileSize (Resource configuration error)
Error: 2354
Error data: Tail met in REDO log, logpart: 0 file: 8 mbyte: 9 state: 1 log-problem: 1

These are due to misconfiguration of the cluster. You did not assign enough resources to the cluster, so when you try to load a big chunk of data you exceed its capacity. You need to either load the data in smaller chunks or reconfigure your cluster. The Oracle MySQL Support team will happily help you do that.
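To make the two remedies above more concrete, here is a minimal Python sketch (not part of the original report). It assumes the documented NDB defaults of NoOfFragmentLogFiles=16, FragmentLogFileSize=16M and 4 REDO log parts, and estimates the total REDO log space as files x parts x file size. It also checks the arithmetic in the "Requested:" line of error 2327 and shows one generic way to split a large import into batches. The function names and the batch size of 10,000 rows are hypothetical, chosen only for illustration, and the actual values in the reporter's config.ini are not known from this thread.

```python
from itertools import islice

MIB = 1024 * 1024

def redo_log_capacity_bytes(no_of_fragment_log_files=16,
                            fragment_log_file_size=16 * MIB,
                            log_parts=4):
    """Estimate REDO log space per data node.

    Assumes files are allocated in sets of `log_parts` (4 by default),
    each file being `fragment_log_file_size` bytes.
    """
    return no_of_fragment_log_files * log_parts * fragment_log_file_size

def chunked(rows, size):
    """Yield successive lists of at most `size` items from `rows`."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

if __name__ == "__main__":
    # REDO log space with the assumed default settings.
    print(redo_log_capacity_bytes() // MIB, "MiB of REDO log")

    # The allocation DBTUP attempted according to error 2327.
    requested = 288 * 2147511844
    print(requested, "bytes, about", round(requested / 1024**3), "GiB")

    # Splitting a 2m-row import into smaller batches (hypothetical size).
    batches = list(chunked(range(2_000_000), 10_000))
    print(len(batches), "batches of up to 10000 rows")
```

Each batch would be inserted and committed on its own, so less REDO log has to be held at once than with a single large load. How small the batches need to be depends on the cluster's actual configuration.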

The other crash I see in your log (only one, on both data nodes) is:

Time: Wednesday 30 April 2014 - 16:36:31
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: SimulatedBlock.cpp
Error object: DBSPJ (Line: 1299) 0x00000002
Program: ndbmtd
Pid: 3675 thr: 0
Version: mysql-5.6.14 ndb-7.3.3
Trace: /var/lib/mysql-cluster/1//ndb_1_trace.log.8 [t1..t5]
***EOM***

For this one, I need to check the logs more carefully.

all best
Bogdan Kecman

Hi,

The second crash is also due to misconfiguration of the cluster.
You need to size your cluster properly.

best regards
Bogdan Kecman

P.S. This issue is rather old; I hope you have upgraded to the latest 7.3 or 7.4 by now.