Bug #54406 ndbd cannot start due to error 721
Submitted: 10 Jun 2010 17:42
Modified: 12 Oct 2010 13:44
Reporter: Nicholas Hill
Status: Duplicate
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S1 (Critical)
Version: mysql-5.1-telco-7.1
OS: Linux (CentOS 5 x86_64)
Assigned to: Magnus Blåudd
CPU Architecture: Any
Tags: 7.1.3, 721, dbdict, error 721, MySQL Cluster, ndb, ndb-7.1.3, ndbd, redo

[10 Jun 2010 17:42] Nicholas Hill
Description:
After a complete cluster shutdown to increase the number of virtual CPUs in VMware ESXi, I was unable to bring the ndbd nodes back up because ndbd reported the following error:

2010-06-10 13:23:16 [ndbd] INFO     -- Failure to recreate object during restart, error 721 Please follow instructions from 'perror --ndb 721'
2010-06-10 13:23:16 [ndbd] INFO     -- DBDICT (Line: 4237) 0x00000002
error=2355
2010-06-10 13:23:16 [ndbd] INFO     -- Error handler startup shutting down system
2010-06-10 13:23:16 [ndbd] INFO     -- Error handler shutdown completed - exiting
sphase=4
exit=-1
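
For anyone triaging similar logs, here is a minimal Python sketch that pulls the NDB error code out of the "Failure to recreate object during restart" line. The function name and the regular expression are assumptions modelled only on the excerpt above; other ndbd versions may word the line differently.

```python
import re

# Pattern modelled on the log line quoted in this report (an assumption,
# not an official log format definition).
FAILURE_RE = re.compile(
    r"Failure to recreate object during restart, error (\d+)"
)


def find_recreate_failures(log_text):
    """Return the error codes from every matching failure line, in order."""
    return [int(m.group(1)) for m in FAILURE_RE.finditer(log_text)]


# Two lines copied from the excerpt above.
sample = (
    "2010-06-10 13:23:16 [ndbd] INFO     -- Failure to recreate object "
    "during restart, error 721 Please follow instructions from "
    "'perror --ndb 721'\n"
    "2010-06-10 13:23:16 [ndbd] INFO     -- DBDICT (Line: 4237) 0x00000002\n"
)

print(find_recreate_failures(sample))
```

A non-empty result means the node stopped while recreating a dictionary object during restart, which is the symptom reported here.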

How to repeat:
After a complete (clean) cluster shutdown, I restarted the ndbd nodes and received the previous error.

Suggested fix:
A fix was supposed to be pushed through for version 7.1.3
[10 Jun 2010 17:43] Nicholas Hill
ndb_error_report log

Attachment: ndb_error_report_20100610133239.tar.bz2 (application/octet-stream, text), 393.99 KiB.

[10 Jun 2010 17:45] Nicholas Hill
This is similar to Bug #52135
[10 Jun 2010 17:45] Andrew Hutchings
The error looks like bug #52135, but this is happening on a version that already contains that fix.
[10 Jun 2010 19:06] Nicholas Hill
After running ndbd --initial on each of my ndb nodes, the cluster ran properly once again.

I then started a rolling restart by shutting down one of the nodes through ndb_mgm and restarting it. I was presented with the same error, and I am no longer able to restart that node, even with the --initial switch.

ndb_error_report to follow
[10 Jun 2010 19:13] Nicholas Hill
I have uploaded file ndb_error_report_54406.tar.bz2 to the write only FTP.
[10 Jun 2010 19:20] Nicholas Hill
Here is a mysqldump -A --no-data dump.

Attachment: mysqldump_54406.sql (application/octet-stream, text), 45.35 KiB.

[10 Jun 2010 21:07] Andrew Hutchings
I haven't been able to reproduce this with these schemas yet.
[12 Oct 2010 13:06] Axel Schwenke
This might be related to bug #54651. That bug leaves the cluster dictionary invalid, and any node restart after that will fail with error 721.
[12 Oct 2010 13:44] Magnus Blåudd
Analyzing the attached trace files shows that this is a duplicate of Bug#54651, which allows a table to be altered to the same name as an already existing table. The duplicate table name is not detected until the next node or system restart, at which point it causes the above-mentioned error message to be printed.

Since this problem can happen as part of an upgrade from a version where Bug#54651 has not yet been fixed, we will modify the error message printouts for this case to be more helpful and avoid referring to "perror --ndb 721", since that is not very helpful.
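
The duplicate-name condition behind this error can be illustrated with a short Python sketch that flags any table name appearing more than once in a list. The function name and the sample names are hypothetical, and how the list is obtained from a given cluster is left open; this is not how DBDICT itself performs the check.

```python
from collections import Counter


def duplicate_table_names(names):
    """Return, in sorted order, every name that occurs more than once."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical names, for illustration only.
tables = ["test/def/t1", "test/def/t2", "test/def/t1"]

print(duplicate_table_names(tables))
```

Any name in the result would need to be resolved before the next node or system restart, since that is when the duplicate is detected.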