MySQL Bugs: #8293: ndb_mgmd does not recognize available node ids after node crash

Bug #8293	ndb_mgmd does not recognize available node ids after node crash
Submitted:	3 Feb 2005 16:11	Modified:	3 Feb 2005 19:10
Reporter:	Jörg Nowak
Status:	Won't fix
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	4.1.9	OS:	Linux (Suse 9.1 64 bit Version)
Assigned to:	Assigned Account	CPU Architecture:	Any

Description:
Bug in ndb_mgmd:

Scenario: a cluster with 2 computers (2 database nodes on each computer, number of replicas: 2).
Computer 1: nodes 2 (master) and 4
Computer 2: nodes 3 and 5, plus the management node

I rebooted computer 1 (nodes 2 and 4 crashed). After rebooting, I tried to restart nodes 2 and 4 (outside of ndb_mgm).
An error appears in the log file: "Nodeid 2 is allocated by another node". I am not able to restart the crashed nodes.
When I kill ndb_mgmd and restart it, I am able to restart the crashed nodes.

It seems that ndb_mgmd keeps all node IDs occupied (including those of the crashed nodes). This looks like a bug to me.

 

How to repeat:
Reboot one of the two database computers in the cluster and try to restart the crashed nodes.

Suggested fix:
ndb_mgmd should not keep the IDs of crashed (inactive) nodes occupied.

Hi,

This can occur in some situations.
There are, however, some workarounds.
1) Instead of restarting ndb_mgmd, you can issue the command "purge stale sessions" in the management client.

2) You can bypass the problem entirely by specifying the node ID in each node's connectstring and starting the management server with "ndb_mgmd --no-nodeid-checks".
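
As a rough illustration of the two workarounds, here is a minimal shell sketch. The host name "mgmhost", the node IDs, and the helper function names are placeholders invented for this example and do not come from the report. The -e and -c options of ndb_mgm and the -c option of ndbd are assumed to be available in the installed version; check the --help output of your binaries before relying on them.

```shell
#!/bin/sh
# Sketch only: MGM_HOST and all function names are hypothetical.

MGM_HOST="mgmhost"   # assumed host running ndb_mgmd

# Workaround 1: ask the running ndb_mgmd to release node IDs that are
# still held by sessions of crashed nodes (no ndb_mgmd restart needed).
purge_stale_sessions() {
  ndb_mgm -c "$MGM_HOST" -e "PURGE STALE SESSIONS"
}

# Workaround 2: build a connectstring that pins the node ID, so the
# node does not depend on dynamic ID allocation.
# Arguments: node ID, management server host.
build_connectstring() {
  printf 'nodeid=%s,%s' "$1" "$2"
}

# Start a data node with a fixed node ID.
# Argument: node ID. Assumes ndb_mgmd was started with --no-nodeid-checks.
start_data_node() {
  ndbd -c "$(build_connectstring "$1" "$MGM_HOST")"
}
```

After sourcing the script, "purge_stale_sessions" followed by "start_data_node 2" on computer 1 would correspond to workaround 1, while calling "start_data_node 2" against a management server started with --no-nodeid-checks corresponds to workaround 2.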

/Jonas

P.S. We're also working on a "real" fix, but I dare not guess when that will be ready.

This bug is not scheduled to be fixed at this time.