Bug #39318 NDB data node crashes
Submitted: 8 Sep 2008 15:43
Modified: 10 Sep 2008 10:49
Reporter: Matthew Robinson
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 5.0.51
OS: Linux (Debian Etch)
Assigned to:
CPU Architecture: Any
Tags: crash, data node

[8 Sep 2008 15:43] Matthew Robinson
Description:
Since this morning, one data node has crashed 3 times with the following error:

Time: Monday 8 September 2008 - 16:08:27
Status: Temporary error, restart node
Message: Temporary on access to file (Internal error, programming error or missing error message, please report a bug)
Error: 2809
Error data: DBACC: File system open failed. OS errno: 5
Error object: DBACC (Line: 1819) 0x0000000e
Program: /usr/sbin/ndbd
Pid: 568
Trace: /var/lib/mysql-cluster/ndb_3_trace.log.7
Version: Version 5.0.51
***EOM***

I have done an initial start both times, which temporarily fixed the issue.

How to repeat:
I don't know how to repeat it.

Suggested fix:
I don't know how to fix it.
[9 Sep 2008 5:09] Bernd Ocklin
Hi Matthew,

Can you attach the config file, the cluster logs produced by the management server, the out files, and the full trace files? You can use the ndb_error_reporter command to collect all those files.
[9 Sep 2008 8:17] Matthew Robinson
Uploaded to the Debian FTP site as:

ndb_error_report_20080909091521.tar.bz2
[9 Sep 2008 8:17] Matthew Robinson
Sorry, I meant the MySQL FTP site:

ftp://ftp.mysql.com/pub/mysql/upload/
[9 Sep 2008 8:20] Matthew Robinson
It seems I was premature in telling you I had uploaded it to the FTP site; I had some problems with that. Anyway, it can be downloaded from our web server:

https://www.fone-me.com/ndb_error_report_20080909091521.tar.bz2
[9 Sep 2008 15:24] Matthew Robinson
Update: no crashes since the bug was reported. Our usage hasn't changed (we've been running the same setup for a couple of months).
[10 Sep 2008 10:18] Bernd Ocklin
Quite likely you have/had a file system or disk problem, as OS errno 5 is an I/O error (EIO) and is not related to Cluster in the first place.
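
For anyone who wants to check what an OS errno in the "Error data" line means, here is a minimal sketch, assuming Python 3 is available on the data node host. The helper name describe_os_errno is hypothetical and not part of any NDB tool; it only pulls the number out of the line quoted in the report and looks it up in the operating system's own errno table.

```python
import errno
import os
import re

# "Error data" line copied from the error report in this bug.
ERROR_DATA = "Error data: DBACC: File system open failed. OS errno: 5"


def describe_os_errno(line):
    """Return (number, symbolic name, message) for an 'OS errno: N' field.

    Hypothetical helper for illustration; raises ValueError if the
    line carries no 'OS errno' field.
    """
    match = re.search(r"OS errno:\s*(\d+)", line)
    if match is None:
        raise ValueError("no 'OS errno' field in line")
    number = int(match.group(1))
    # errno.errorcode maps numeric codes to names such as 'EIO'.
    name = errno.errorcode.get(number, "UNKNOWN")
    # os.strerror returns the platform's message text for the code.
    return number, name, os.strerror(number)


if __name__ == "__main__":
    number, name, message = describe_os_errno(ERROR_DATA)
    print(f"errno {number} = {name}: {message}")
```

If the lookup reports EIO, the kernel log (for example via dmesg) and the health of the disk under the data node's file system are probably the first places to look.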

I will close the bug for now as "not a bug", but you are welcome to reopen it if you have other findings.
[10 Sep 2008 10:49] Matthew Robinson
Ok, thanks for looking.