Bug #72513 Full cluster crash during normal usage
Submitted: 2 May 2014 9:23
Modified: 1 Mar 2016 14:59
Reporter: auesifnbiu sosdfnosinf
Status: Not a Bug
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S1 (Critical)
Version: mysql-5.6.14 ndb-7.3.3
OS: Linux (CentOS 6.4 - 2.6.32-358.23.2.el6.x86_64)
Assigned to: MySQL Verification Team
CPU Architecture: Any

[2 May 2014 9:23] auesifnbiu sosdfnosinf
Description:
One or both data nodes crash whenever we run a big table import (2M+ records). This happens every single time.

We recently also saw both data nodes suddenly shut down during normal usage.

Our app uses the (deprecated) PHP 'mysql' module, and for management we use the latest MySQL Workbench.

How to repeat:
Import a database where one table has 2m+ records.
[2 May 2014 9:25] auesifnbiu sosdfnosinf
partial logs and configs

Attachment: 2014-05-02.txt (text/plain), 16.16 KiB.

[1 Mar 2016 14:36] MySQL Verification Team
Hi,

I only see 3 types of crashes in your log:

Message: Memory allocation failure, please decrease some configuration parameters (Configuration error)
Error: 2327
Error data: DBTUP could not allocate memory for Fragrecord
Error object: Requested: 288x2147511844 = 618483411072 bytes

Message: Fatal error due to end of REDO log. Increase NoOfFragmentLogFiles or FragmentLogFileSize (Resource configuration error)
Error: 2354
Error data: Tail met in REDO log, logpart: 0 file: 8 mbyte: 9 state: 1 log-problem: 1
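
As a rough sanity check on the figures in the two messages above, the arithmetic can be sketched in Python. The factor of 4 reflects that NDB allocates redo log files in sets of four per NoOfFragmentLogFiles unit; the parameter values passed to redo_log_bytes are illustrative assumptions, not values read from the attached config.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def requested_bytes(record_size: int, record_count: int) -> int:
    """Total bytes for record_count records of record_size bytes each."""
    return record_size * record_count


def redo_log_bytes(no_of_fragment_log_files: int, fragment_log_file_size: int) -> int:
    """Redo log capacity per data node: files come in sets of 4."""
    return no_of_fragment_log_files * 4 * fragment_log_file_size


# Figures copied from the "Error object" line of error 2327.
fragrecord_request = requested_bytes(288, 2147511844)
print(fragrecord_request)                  # 618483411072
print(round(fragrecord_request / GIB, 1))  # 576.0 (GiB)

# Illustrative values only (assumed, not from the reporter's config.ini).
example_redo = redo_log_bytes(16, 16 * MIB)
print(example_redo // MIB)                 # 1024 (MiB)
```

A single request of roughly 576 GiB for Fragrecord storage suggests the configured limits are far larger than the host's memory, while a redo log of about 1 GiB (under the assumed values) can fill up quickly during a multi-million-row import.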

These are due to misconfiguration of the cluster. You did not assign enough resources to the cluster, so when you try to load a big chunk of data you exceed its capacity. You need to either load the data in smaller chunks or reconfigure your cluster. The Oracle MySQL Support team will happily help you do that.
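
Loading in smaller chunks could look roughly like the Python sketch below. It only shows the batching step; the helper name in_batches, the batch size, and the insert/commit calls mentioned afterwards are hypothetical and would need to be adapted to whatever client library performs the import.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def in_batches(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size rows (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Tiny demonstration: 5 rows split into batches of 2.
print(list(in_batches(range(5), 2)))  # [[0, 1], [2, 3], [4]]
```

In an actual import, each batch would be inserted and committed on its own (for example with a batch size of a few thousand rows, an assumed figure) so that no single transaction has to cover the whole 2M+ row table.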

The other crash I see in your log (only one, on both data nodes) is:

Time: Wednesday 30 April 2014 - 16:36:31
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: SimulatedBlock.cpp
Error object: DBSPJ (Line: 1299) 0x00000002
Program: ndbmtd
Pid: 3675 thr: 0
Version: mysql-5.6.14 ndb-7.3.3
Trace: /var/lib/mysql-cluster/1//ndb_1_trace.log.8 [t1..t5]
***EOM***

For this one I need to check the logs more carefully.

all best
Bogdan Kecman
[1 Mar 2016 14:59] MySQL Verification Team
Hi,

The second crash is also due to misconfiguration of the cluster.
You need to properly size your cluster.

best regards
Bogdan Kecman

p.s. This issue is rather old; I hope you have upgraded to the latest 7.3 or 7.4 by now.