MySQL Bugs: #51056: Node can not restart after crash error 2341

Bug #51056	Node can not restart after crash error 2341
Submitted:	10 Feb 2010 11:21	Modified:	16 Apr 2010 6:06
Reporter:	Christian Loebbert
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S1 (Critical)
Version:	mysql-5.1-telco-7.0	OS:	Linux (SuSE Linux 11.1 2.6.27.39-0.2-xen)
Assigned to:	Pekka Nousiainen	CPU Architecture:	Any
Tags:	7.0.7, crash, error 2341, node, restart

Description:
For some reason one of my 2 ndb nodes crashed. Since then, I have not been able to bring the crashed node back into a working state. It always crashes with error 2341.

What can I do to get it working again?
Where can I upload the ndb_error_reporter file?

How to repeat:
Please try to reproduce this from the ndb_error_reporter output, because I don't know why the node crashed.

Christian,

So, node 3 is up and trying to start node 4 with --initial
causes node 4 to crash.

The apparent cause is that node 4 gets a duplicate table name
from node 3.  In the Feb 8 log this is UserTable with id 3888.
This indicates that node 3's dictionary is corrupt.

Look at the ndb_show_tables output.  If possible, upload the output
and directories D1,D2 from node 3, e.g.:

tar cf node3fs.tar ndb_3_fs/D[12]
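
One way to check the ndb_show_tables output for the duplicate
table name described above is a small script like the following.
This is only a sketch: it assumes the usual whitespace-separated
column layout (id, type, state, logging, database, schema, name),
and the helper name find_duplicate_tables and the sample rows are
made up for illustration, not taken from this cluster.

```python
from collections import defaultdict


def find_duplicate_tables(show_tables_output):
    """Return {(database, schema, name): [ids]} for UserTable names listed more than once.

    Assumes columns: id type state logging database schema name.
    """
    ids_by_name = defaultdict(list)
    for line in show_tables_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric object id; skip the header and blank lines.
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        obj_id, obj_type = int(fields[0]), fields[1]
        if obj_type != "UserTable":
            continue
        database, schema, name = fields[4], fields[5], fields[6]
        ids_by_name[(database, schema, name)].append(obj_id)
    # Keep only names that map to more than one dictionary object id.
    return {key: ids for key, ids in ids_by_name.items() if len(ids) > 1}


# Hypothetical sample output, for illustration only.
SAMPLE = """\
id    type                 state    logging database     schema   name
10    UserTable            Online   Yes     shop         def      orders
11    UserTable            Online   Yes     shop         def      orders
12    UserTable            Online   Yes     shop         def      customers
"""

if __name__ == "__main__":
    for (database, schema, name), ids in find_duplicate_tables(SAMPLE).items():
        print(f"{database}.{name} (schema {schema}) appears with ids {ids}")
```

If the same database/schema/name combination shows up under two
ids, that would be consistent with the corrupt dictionary suspected
on node 3.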

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

Christian,

The output of:

 ndb_show_tables

described at:

 http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-programs-ndb-show-tables.html

could be useful.

/Gustaf

After several weeks I did a mysqldump of all databases and restarted both cluster nodes with --initial.
Both nodes started without problems, but of course all my databases were lost. I re-imported them from the mysqldump and everything is fine. Therefore I guess the output from ndb_show_tables is no longer useful for you. Nevertheless I have attached that output and hope you can find something that would let me bring the cluster back to work without using a backup and a long downtime.

I guess you didn't see my response from Feb 15
which was mistakenly hidden from public.

The problem obviously was dictionary corruption
and we are unlikely to find out the reason.  Good
that you got the db up.  I'll close this bug.