Bug #31385 Ndbd file system inconsistency error, please report a bug
Submitted: 4 Oct 2007 0:45
Modified: 19 Apr 2008 9:01
Reporter: Oleg Baranov
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S1 (Critical)
Version: 5.1.21 (beta)
OS: FreeBSD (6.2_STABLE)
Assigned to:
CPU Architecture: Any

[4 Oct 2007 0:45] Oleg Baranov
Description:
After running for about two weeks, one of the cluster data nodes crashed.
The other one crashed as well when I tried to restart the first.

-----------------------error log -----------------------
Time: Thursday 4 October 2007 - 03:31:58
Status: Temporary error, restart node
Message: Assertion (Internal error, programming error or missing error message, please report a bug)
Error: 2301
Error data: ArrayPool<T>::getPtr
Error object: ../../../../../storage/ndb/src/kernel/vm/ArrayPool.hpp line: 441 (block: DBTUP)
Program: /usr/local/mysql/libexec/ndbd
Pid: 6560
Trace: /data/ndbdata/ndb_3_trace.log.2
Time: Thursday 4 October 2007 - 03:31:58
***EOM***

Time: Thursday 4 October 2007 - 03:34:27
Status: Ndbd file system error, restart node initial
Message: Error while reading the REDO log (Ndbd file system inconsistency error, please report a bug)
Error: 2310
Error data: Error while reading REDO log. from 15287
D=11, F=1 Mb=108 FP=3487 W1=8185 W2=7 : end of log wo/ having found last GCI
Error object: DBLQH (Line: 15393) 0x0000000a
Program: /usr/local/mysql/libexec/ndbd
Pid: 8536
Trace: /data/ndbdata/ndb_3_trace.log.3
Version: Version 5.1.21 (bet

all logs from the crashed node are posted here:
http://ol.homeunix.org/~ol/ndb_failure_01/
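
For anyone comparing their own failures against the entries above, here is a minimal sketch of how such an error log could be split into records and checked for the restart hint in the Status line. It assumes the layout seen in this excerpt ("Key: value" lines, entries closed by ***EOM***, lines without a key continuing the previous field); the function names parse_error_log and needs_initial_restart are hypothetical helpers, not part of any MySQL Cluster tool.

```python
import re

# Field names taken from the excerpt in this report; other lines are
# treated as continuations of the previous field.
KNOWN_FIELDS = {"Time", "Status", "Message", "Error", "Error data",
                "Error object", "Program", "Pid", "Trace", "Version"}
FIELD_RE = re.compile(r"^([A-Za-z][A-Za-z ]*?):\s*(.*)$")


def parse_error_log(text):
    """Split error-log text into one dict per entry (assumed format)."""
    entries, current, last_key = [], {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "***EOM***":
            # End-of-entry marker: store what we have and start over.
            if current:
                entries.append(current)
            current, last_key = {}, None
            continue
        match = FIELD_RE.match(line)
        if match and match.group(1) in KNOWN_FIELDS:
            key, value = match.groups()
            # "Time" appears twice per entry; keep the first occurrence.
            current.setdefault(key, value)
            last_key = key
        elif last_key is not None:
            # Continuation line, e.g. the "D=11, F=1 ..." detail line.
            current[last_key] += " " + line
    if current:
        # Last entry may be truncated and lack the ***EOM*** marker.
        entries.append(current)
    return entries


def needs_initial_restart(entry):
    """True if the Status line asks for 'restart node initial'."""
    return "restart node initial" in entry.get("Status", "")


SAMPLE = """\
Time: Thursday 4 October 2007 - 03:31:58
Status: Temporary error, restart node
Error: 2301
Error data: ArrayPool<T>::getPtr
Time: Thursday 4 October 2007 - 03:31:58
***EOM***

Time: Thursday 4 October 2007 - 03:34:27
Status: Ndbd file system error, restart node initial
Error: 2310
Error data: Error while reading REDO log. from 15287
D=11, F=1 Mb=108 FP=3487 W1=8185 W2=7 : end of log wo/ having found last GCI
"""

for entry in parse_error_log(SAMPLE):
    print(entry["Error"], needs_initial_restart(entry))
```

On the shortened sample this separates the assertion failure (error 2301, plain node restart suggested) from the REDO log failure (error 2310, initial restart suggested), which is the distinction that matters when deciding whether two reports describe the same problem.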

How to repeat:
I issued an innocent command like "update objects set name = 'lalala'" on a small NDB table with just 1 or 2 rows.
There was no heavy activity at the time, only a small application inserting one row per second into another table.
[4 Oct 2007 1:43] Oleg Baranov
My symptoms are similar to Bug #25924 (although 25924 is rather old).

All of the corrupted file system data is backed up and can be provided on demand.
I tried to re-initialize the nodes. Both behave similarly: when I run --initial on one of them, the other one reports that bloody "... file system inconsistency error...".
So I've lost all cluster data.
[18 Oct 2007 18:54] Shivani Goyal
I ran a similar update command and got the same error; the data node was forced down. Please help, this is serious.
[14 Jan 2008 15:10] Robby Dermody
I am getting this problem as well. Quite serious...
[14 Jan 2008 16:55] Tomas Ulin
If you are getting this error, please upload logs. The more logs we get, the better the chance that we can find the cause of this.

BR,

Tomas
[20 Feb 2008 9:23] xu rongzhong
We can reproduce this bug. I am in China; if you need the logs, please contact me!
[20 Feb 2008 23:57] xu rongzhong
log

Attachment: mysql-cluster-mgm.tar.rar (application/octet-stream, text), 302.21 KiB.

[19 Mar 2008 9:01] Tomas Ulin
Sorry for not coming back sooner on this. The latest log says 5.1.17 is used; from the log we cannot tell whether this bug has been fixed in a later release, nor what the bug is.

May we ask you to reproduce with 5.1.23, the latest release? We would prefer a reproducible test case rather than logs, if possible.

Also, there have been several posts to this bug; most likely they are all different bugs.

Anyone who is seeing similar problems, please post all your logs. Otherwise we cannot see whether the problem is the same; the error message as such is _not_ sufficient.

Thanks,

Tomas
[19 Apr 2008 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[17 Oct 2008 21:24] Brian Moon
I've reproduced it on 5.1.23. Let me know if any other log files are needed.

Thanks!