Bug #73691 ndbd crash with segfault when missing DATAFILE
Submitted: 22 Aug 2014 22:55
Modified: 28 Nov 2015 10:25
Reporter: Arturo Garcia
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 7.3.5
OS: Linux (Red Hat EL 6.4)
Assigned to: MySQL Verification Team
CPU Architecture: Any
Tags: cluster, Datafile, ndbd, segfault

[22 Aug 2014 22:55] Arturo Garcia
Description:
I just changed all my tables from STORAGE MEMORY to STORAGE DISK (Disk Data). I created a LOGFILE GROUP, a TABLESPACE and DATAFILEs.
After I reorganized all partitions and rebooted the cluster, one node could not start and reported a signal 11 error.

Time: Friday 22 August 2014 - 18:38:31
Status: Temporary error, restart node
Message: Error OS signal received (Internal error, programming error or missing error message, please report a bug)
Error: 6000
Error data: Signal 11 received; Segmentation fault
Error object: /pb2/build/sb_0-11918329-1396368453.7/rpm/BUILD/mysql-cluster-com-7.3.5/mysql-cluster-com-7.3.5/storage/ndb/src/kernel/ndbd.cpp
Program: ndbd
Pid: 7317
Version: mysql-5.6.17 ndb-7.3.5
Trace: /u01/mysql-cluster/ndb_1_trace.log.6

Time: Friday 22 August 2014 - 19:00:13
Status: Temporary error, restart node
Message: Error OS signal received (Internal error, programming error or missing error message, please report a bug)
Error: 6000
Error data: Signal 11 received; Segmentation fault
Error object: /pb2/build/sb_0-11918329-1396368453.7/rpm/BUILD/mysql-cluster-com-7.3.5/mysql-cluster-com-7.3.5/storage/ndb/src/kernel/ndbd.cpp
Program: ndbd
Pid: 10508
Version: mysql-5.6.17 ndb-7.3.5
Trace: /u01/mysql-cluster/ndb_1_trace.log.7
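
The two error log entries above carry the same fields, so they are easy to compare programmatically. The following is a minimal sketch, assuming every entry begins with a "Time:" line and every field is a "Key: value" pair as shown in this report; the function name parse_ndb_error_log is hypothetical and not part of any MySQL tool.

```python
def parse_ndb_error_log(text):
    """Split error-log text into a list of dicts, one per entry.

    Assumption: each entry starts with a 'Time:' line, and every
    other line is a 'Key: value' pair belonging to that entry.
    """
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and anything without a field separator
        # Split on the first colon only, so timestamps keep their colons.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Time":
            current = {}
            entries.append(current)
        if current is not None:
            current[key] = value
    return entries


# Sample text copied from the entries in this report (some fields omitted).
sample = """\
Time: Friday 22 August 2014 - 18:38:31
Status: Temporary error, restart node
Error: 6000
Error data: Signal 11 received; Segmentation fault
Program: ndbd
Pid: 7317
Trace: /u01/mysql-cluster/ndb_1_trace.log.6
Time: Friday 22 August 2014 - 19:00:13
Status: Temporary error, restart node
Error: 6000
Error data: Signal 11 received; Segmentation fault
Program: ndbd
Pid: 10508
Trace: /u01/mysql-cluster/ndb_1_trace.log.7
"""

entries = parse_ndb_error_log(sample)
for entry in entries:
    print(entry["Time"], entry["Pid"], entry["Error data"])
```

Both entries report the same error code and signal, which suggests the node hits the same failure on every restart attempt.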

I realized that a datafile was not created on the node that can't start; it exists on the other nodes but not on the node with the issue.
I don't know why the DATAFILE couldn't be created on that node, but maybe that's the problem.

How to repeat:
Scenario:
MySQL Cluster -> 6 nodes
A set of 202 tables using Disk Data storage.
Delete one DATAFILE on a node and try to restart the ndbd process. It will crash with a segfault.
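
A missing datafile like the one described above can be detected before ndbd is restarted. The following is a minimal sketch, assuming the Disk Data files sit in the default ndb_<node id>_fs directory under the data node's DataDir (this only holds when the files were created with relative names and no separate Disk Data path is configured). The DataDir and node id are inferred from the trace path in this report; the file names are placeholders, since the report does not list them, and the function name missing_datafiles is hypothetical.

```python
from pathlib import Path


def missing_datafiles(data_dir, node_id, expected_names):
    """Return the expected Disk Data file names that are absent on this node.

    Assumption: the files live in <data_dir>/ndb_<node_id>_fs.
    """
    fs_dir = Path(data_dir) / f"ndb_{node_id}_fs"
    return [name for name in expected_names if not (fs_dir / name).is_file()]


# Hypothetical values: the path and node id come from the trace file name
# in the report; "data_1.dat" and "undo_1.log" are placeholder file names.
missing = missing_datafiles("/u01/mysql-cluster", 1, ["data_1.dat", "undo_1.log"])
print("missing files:", missing)
```

Running the same check on every data node and comparing the results would show which node lacks a file that the others have.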
[22 Aug 2014 23:00] Arturo Garcia
Trace Log File

Attachment: ndb_1_trace.log.7 (application/octet-stream, text), 1.09 MiB.

[28 Oct 2015 10:25] MySQL Verification Team
Hi,
To solve the problem, start the failing node with --initial.

As for how the problem happened in the first place (where the missing datafile went), we could look at the logs from the time when you created the tablespace and ran the ALTER. It's most probably some system (non-cluster) problem that got you into this situation.

The attached trace just shows that the filesystem is corrupt (a missing datafile will do that to you), so --initial solves the problem, as the datadir on the node will be rebuilt from scratch.

all best
Bogdan Kecman
[29 Nov 2015 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".