Bug #8293 ndb_mgmd does not recognize available node ids after node crash
Submitted: 3 Feb 2005 16:11
Modified: 3 Feb 2005 19:10
Reporter: Jörg Nowak
Status: Won't fix
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 4.1.9
OS: Linux (Suse 9.1 64 bit Version)
Assigned to: Assigned Account
CPU Architecture: Any

[3 Feb 2005 16:11] Jörg Nowak
Description:
Bug in ndb_mgmd:

Scenario: a cluster with 2 computers (2 database nodes on each computer, number of replicas: 2).
computer 1: Node 2 (master), 4 
computer 2: Node 3, 5 + Management node

I rebooted computer 1, which crashed nodes 2 and 4. After the reboot I tried to restart nodes 2 and 4 (outside of ndb_mgm).
An error appears in the log file: "Nodeid 2 is allocated by another node". I am not able to restart the crashed nodes.
When I kill ndb_mgmd and restart it, I am able to restart the crashed nodes.

It seems that ndb_mgmd keeps all node ids occupied (including the ids of the crashed nodes). This looks like a bug to me.

 

How to repeat:
Reboot one of the 2 database computers in the cluster and try to restart the crashed nodes.

Suggested fix:
ndb_mgmd should not keep the ids of crashed (inactive) nodes occupied.
[3 Feb 2005 19:10] Jonas Oreland
Hi,

This can happen in some situations.
There are, however, some workarounds:
1) Instead of restarting ndb_mgmd, you can issue the command "purge stale sessions" in the ndb_mgm client.

2) You can bypass the problem entirely by specifying the node id in each node's connectstring and starting the management server with "ndb_mgmd --no-nodeid-checks".
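
For illustration, the two workarounds above might look roughly like this on the command line. This is only a sketch: the host name "computer2" and node id 2 are placeholders taken from the scenario in the report, the `connectstring` and `run` helpers are hypothetical, and the `-c` and `-e` option spellings follow later MySQL Cluster releases, so they may differ in 4.1.9. The script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Placeholders (assumptions, not from the report's actual configuration).
MGM_HOST="${MGM_HOST:-computer2}"   # host running ndb_mgmd
NODE_ID="${NODE_ID:-2}"             # id of the crashed data node

# Hypothetical helper: build a connectstring that carries an explicit node id.
connectstring() {
    printf 'nodeid=%s,%s' "$1" "$2"
}

# Hypothetical helper: print a command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Workaround 1: release stale node id reservations without restarting ndb_mgmd.
run ndb_mgm -c "$MGM_HOST" -e "purge stale sessions"

# Workaround 2: start the management server without node id checks
# (add whatever configuration options your installation already uses) ...
run ndb_mgmd --no-nodeid-checks
# ... and start each data node with its id fixed in the connectstring.
run ndbd -c "$(connectstring "$NODE_ID" "$MGM_HOST")"
```

Run as is, the script lists the three commands without touching the cluster; workaround 1 alone should be enough if restarting the management server is undesirable.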

/Jonas

P.S. We're also working on a "real" fix, but I dare not guess when it will be ready.
[13 Mar 2014 13:33] Omer Barnir
This bug is not scheduled to be fixed at this time.