Bug #11217 Mysqld not connected to cluster error message misleading (4009)
Submitted: 9 Jun 2005 18:42 Modified: 23 Oct 2008 4:24
Reporter: Jonathan Miller
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S3 (Non-critical)
Version: 4.1 OS: Linux (Linux)
Assigned to: Martin Skold CPU Architecture: Any

[9 Jun 2005 18:42] Jonathan Miller
Description:
Having just restarted the cluster while leaving the mysqld process running, I wanted to see what error message I would get if I tried to create a table using the NDB engine.

The error message inside the mysql client was:
mysql> create table t1 (c1 int, PRIMARY KEY(c1))ENGINE=NDB;
ERROR 1005 (HY000): Can't create table './test/t1.frm' (errno: 4009)

Not bad, but being a good MySQLer I wanted to know what a 4009 was:

./perror --ndb 4009
OS error code 4009:  Cluster Failure: Unknown result: Unknown result error

This error message is not what I would expect. I would expect an error stating that there was no connection to NDBCLUSTER, or that the send to NDB failed.

This would lead me to believe that my cluster had failed, but that is not the case.
Once the mysqld process is restarted all is fine.

How to repeat:
Restart the cluster while leaving the mysqld process running.
Log in to one of the mysqld processes.
create table t1 (c1 int, PRIMARY KEY(c1))ENGINE=NDB;
bin/perror --ndb 4009

Suggested fix:
An error stating that there is no connection to NDBCLUSTER, or that the send to NDB failed.
[16 Jun 2005 21:44] Tomas Ulin
Assigning to Martin for him to decide what to do about it.
[2 Nov 2005 15:55] Martin Skold
Using only the NDB API there is no way of determining why the cluster
does not reply. If it is being restarted, the management server (ndb_mgmd)
knows about it, so using the management client interface (already linked into
mysqld together with the NDB API), one could check with the management server
whether a restart is in progress and return a different error code in that case.
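
A minimal sketch of the decision described above, written in Python purely for illustration. It is not mysqld or MGM API code: `NodeStatus`, `classify_no_reply` and the two result strings are hypothetical names, and the list of statuses is assumed to have been fetched from the management server by some other means.

```python
from enum import Enum


class NodeStatus(Enum):
    """Hypothetical data-node states, as a management server might report them."""
    STARTED = "started"
    STARTING = "starting"
    RESTARTING = "restarting"
    NO_CONTACT = "no_contact"


def classify_no_reply(node_statuses):
    """Choose an error category after the cluster failed to reply.

    node_statuses: iterable of NodeStatus values (assumed to come from
    the management server). Returns "restart_in_progress" if any node
    is starting or restarting, otherwise the generic "cluster_failure".
    """
    restarting = {NodeStatus.STARTING, NodeStatus.RESTARTING}
    if any(status in restarting for status in node_statuses):
        return "restart_in_progress"
    # No evidence of a restart (or no status information at all):
    # fall back to the generic failure, as error 4009 does today.
    return "cluster_failure"
```

With this split, the SQL node could map "restart_in_progress" to a more specific message while keeping the existing behaviour for every other case.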
[16 Oct 2006 13:07] Roland Bouman
I'm wondering: can't the message returned by perror be a little more descriptive? It could at least give a hint that there is something wrong with the communication between the SQL node and the cluster.
[8 Aug 2007 9:30] Hartmut Holzgraefe
Now reports: ERROR 157 (HY000): Could not connect to storage engine
Not sure whether there already was a changelog entry for this, so
setting to "Documenting" for now ...
[9 Aug 2007 7:17] Jon Stephens
Any idea when the change took place?
[23 Oct 2008 3:43] Martin Skold
Fixed in
Bug #18676 Missleading error message when trying to create table when cluster is down
Pushed into 5.1.19-beta
[23 Oct 2008 4:24] Jon Stephens
Tagged Bug#18676 changelog entry to indicate that fix resolved this bug also.

(According to developer notes for that bug, user-facing change took place in 5.0.40/5.1.18 - even though further internal changes were made in 5.1.41/5.1.19, the changelog entry is tagged for the releases where the user-visible fix took place.)