Bug #17677 ndbd-registration fails with a segmentation fault
Submitted: 23 Feb 2006 20:32 Modified: 7 Jul 2006 16:04
Reporter: Jens Schanz
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S3 (Non-critical)
Version: 5.0.18 OS: Linux (SuSE-10.0)
Assigned to: Assigned Account CPU Architecture: Any

[23 Feb 2006 20:32] Jens Schanz
Description:
If you try to register a data node (ndbd) that is not defined in config.ini, the ndbd process crashes with a segmentation fault. The management node reports the following:

2006-02-23 21:27:17 [MgmSrvr] WARNING  -- Allocate nodeid (0) failed. Connection from ip 10.2.22.42. Returned error string "No free node id found for ndbd(NDB)."
2006-02-23 21:27:17 [MgmSrvr] INFO     -- Mgmt server state: node id's  2 3 connected but not reserved
2006-02-23 21:27:17 [MgmSrvr] INFO     -- Mgmt server state: node id's  1 not connected but reserved
2006-02-23 21:27:20 [MgmSrvr] WARNING  -- Allocate nodeid (0) failed. Connection from ip 10.2.22.42. Returned error string "No free node id found for ndbd(NDB)."
2006-02-23 21:27:20 [MgmSrvr] INFO     -- Mgmt server state: node id's  2 3 connected but not reserved
2006-02-23 21:27:20 [MgmSrvr] INFO     -- Mgmt server state: node id's  1 not connected but reserved
2006-02-23 21:27:23 [MgmSrvr] WARNING  -- Allocate nodeid (0) failed. Connection from ip 10.2.22.42. Returned error string "No free node id found for ndbd(NDB)."
2006-02-23 21:27:23 [MgmSrvr] INFO     -- Mgmt server state: node id's  2 3 connected but not reserved
2006-02-23 21:27:23 [MgmSrvr] INFO     -- Mgmt server state: node id's  1 not connected but reserved
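
The repeated warnings above can be condensed into a count of failed node-id allocations per connecting address. The following is only a minimal sketch: the function name count_alloc_failures is hypothetical, and it assumes the log lines have exactly the format shown in the excerpt above.

```shell
# Count "Allocate nodeid (N) failed" warnings per connecting IP address.
# Reads management-server log lines on stdin. The line layout is assumed
# to match the excerpt in this report; count_alloc_failures is a
# hypothetical helper name, not part of MySQL Cluster.
count_alloc_failures() {
    awk '/Allocate nodeid \([0-9]+\) failed/ {
             # The address is the field right after the word "ip".
             for (i = 1; i < NF; i++) {
                 if ($i == "ip") {
                     ip = $(i + 1)
                     sub(/\.$/, "", ip)   # drop the sentence-ending dot
                     n[ip]++
                 }
             }
         }
         END { for (ip in n) print ip, n[ip] }'
}
```

It would be fed the cluster log of the management node on stdin, for example `count_alloc_failures < cluster.log`, where the log file name is a placeholder for wherever the management server writes its log.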

How to repeat:
Define no entry for the data node in config.ini on the management node, then start ndbd on that node.

Suggested fix:
Maybe this behaviour is expected, but I think a clean error message would be better.
[23 Feb 2006 20:41] Jens Schanz
select(4, [3], NULL, NULL, {50, 0})     = 1 (in [3], left {50, 0})
recv(3, ".", 1, 0)                      = 1
select(4, [3], NULL, NULL, {50, 0})     = 1 (in [3], left {50, 0})
recv(3, "\n", 1, 0)                     = 1
select(4, [3], NULL, NULL, {50, 0})     = 1 (in [3], left {50, 0})
recv(3, "\n", 1, 0)                     = 1
open("./ndb_pid2452_error.log", O_RDWR|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("./ndb_pid2452_error.log", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40017000
time([1140727257])                      = 1140727257
open("/etc/localtime", O_RDONLY)        = 5
fstat64(5, {st_mode=S_IFREG|0644, st_size=837, ...}) = 0
fstat64(5, {st_mode=S_IFREG|0644, st_size=837, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40018000
read(5, "TZif\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\10\0\0\0\10"..., 4096) = 837
close(5)                                = 0
munmap(0x40018000, 4096)                = 0
write(4, "Current byte-offset of file-poin"..., 568) = 568
_llseek(4, 0, [568], SEEK_CUR)          = 0
_llseek(4, 40, [40], SEEK_SET)          = 0
write(4, "568", 3)                      = 3
close(4)                                = 0
munmap(0x40017000, 4096)                = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
[19 May 2006 8:30] Jonas Oreland
prio: 29 
This has lingered a while.
Customers keep getting it...
It's annoying...
[21 May 2006 17:20] Jonas Oreland
It's verified...
[7 Jun 2006 16:04] Stewart Smith
It looks like we're crashing (SEGV) before we reach the normal exit path, which is exit() or abort() depending on whether it's a release build or not.

Could you please provide the following:
- exact command line used for ndbd
- exact console output from ndbd
- backtrace from core file

Getting a backtrace:
First make sure your ulimit isn't preventing core files:
$ ulimit -c unlimited
$ /usr/local/mysql/bin/ndbd
(there should now be a core file)
$ gdb /usr/local/mysql/bin/ndbd
(gdb) core core
(gdb) bt
(gdb) q

Then add the output to this bug report.
If any of the above is unclear, let me know.
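
The gdb steps above can also be run without the interactive prompt, which makes the output easier to paste into the report. This is a sketch only: backtrace_from_core is a hypothetical helper name, it assumes a gdb that supports the -batch and -ex options, and it assumes the core file is written as "core" in the current directory (some systems name it differently, for example with the pid appended).

```shell
# Print a backtrace from a core file non-interactively.
# backtrace_from_core is a hypothetical helper; it assumes gdb is on
# PATH and accepts -batch and -ex.
#   $1 = path to the binary that crashed
#   $2 = path to the core file
backtrace_from_core() {
    bin=$1
    core=$2
    if [ ! -x "$bin" ]; then
        echo "binary not found or not executable: $bin" >&2
        return 2
    fi
    if [ ! -f "$core" ]; then
        echo "core file not found: $core" >&2
        return 3
    fi
    # Equivalent to typing "bt" at the (gdb) prompt, then quitting.
    gdb -batch -ex bt "$bin" "$core"
}

# Possible usage, with the paths from the comment above:
#   ulimit -c unlimited
#   /usr/local/mysql/bin/ndbd
#   backtrace_from_core /usr/local/mysql/bin/ndbd core > ndbd_backtrace.txt
```

Checking for the binary and the core file first gives a clear message when no core was produced, instead of a less obvious gdb error.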
[7 Jul 2006 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".