Bug #32896 ndb_mgmd (and therefore the whole cluster) does not work properly on FreeBSD 7
Submitted: 1 Dec 2007 17:53 Modified: 19 Feb 2009 20:02
Reporter: Attila Nagy
Status: Closed
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S1 (Critical)
Version:5.1.22-rc OS:FreeBSD (7.0)
Assigned to: Hartmut Holzgraefe CPU Architecture:Any
Tags: cluster, freebsd, FreeBSD 7, MySQL, ndb, ndb_mgmd

[1 Dec 2007 17:53] Attila Nagy
Description:
I have installed MySQL 5.1.22-rc to four machines:
- 1x NDB management
- 2x NDB data nodes
- 1x SQL server

The problem is that even the management node fails to work correctly.

The symptom is:
- I start the management daemon
boot00a# /usr/local/libexec/ndb_mgmd --nodaemon -f /data/config.ini
NDB Cluster Management Server. Version 5.1.22 (rc)
Id: 1, Command port: 1186
setEventReportingLevelImpl: failed 2!

and try to connect from the same host:
boot00a# ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> SHOW
Connected to Management Server at: localhost:1186
Could not get status
*    60: Error
*        Time out talking to management server
ndb_mgm>

Also, the NDB data nodes and the MySQL server do not work correctly.

BTW, if I install a FreeBSD 6-STABLE environment on the same machine (in a chroot), compile the same version of MySQL server (from FreeBSD ports) in that environment, and run the resulting ndb_mgmd binary from the (outside) 7.x environment, everything works fine (not just ndb_mgm, but the whole cluster: the NDB data nodes and the SQL node):

boot00a# /data/fbsd6/usr/local/libexec/ndb_mgmd --nodaemon -f /data/config.ini
NDB Cluster Management Server. Version 5.1.22 (rc)
Id: 1, Command port: 1186
setEventReportingLevelImpl: failed 2!

boot00a# ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> SHOW
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2 (not connected, accepting connect from 172.27.9.8)
id=3 (not connected, accepting connect from 172.27.9.9)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @172.27.9.2  (Version: 5.1.22)

[mysqld(API)]   1 node(s)
id=4 (not connected, accepting connect from 172.27.9.4)

ndb_mgm> 

So basically, MySQL Cluster works on FreeBSD 6 (at least it starts and can provide basic services, like creating a table and inserting and selecting some rows; I haven't yet tested it further), but to run it on FreeBSD 7, you have to use an ndb_mgmd from a previous (6.x) build. Other components, like ndb_mgm, the NDB data nodes and the SQL nodes, can run in a full 7.x environment.

boot00a# ldd /data/fbsd6/usr/local/libexec/ndb_mgmd (6.x binary, the old libraries are available in the 7.x OS)
/data/fbsd6/usr/local/libexec/ndb_mgmd:
        libreadline.so.6 => /lib/libreadline.so.6 (0x8006f9000)
        libncurses.so.6 => /lib/libncurses.so.6 (0x800837000)
        libcrypt.so.3 => /lib/libcrypt.so.3 (0x800994000)
        libstdc++.so.5 => /usr/lib/libstdc++.so.5 (0x800aad000)
        libm.so.4 => /lib/libm.so.4 (0x800ca4000)
        libpthread.so.2 => /lib/libpthread.so.2 (0x800dc0000)
        libc.so.6 => /lib/libc.so.6 (0x800eeb000)
boot00a# ldd /usr/local/libexec/ndb_mgmd (7.x binary)
/usr/local/libexec/ndb_mgmd:
        libreadline.so.7 => /lib/libreadline.so.7 (0x8006fa000)
        libncurses.so.7 => /lib/libncurses.so.7 (0x800837000)
        libcrypt.so.4 => /lib/libcrypt.so.4 (0x800993000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x800aac000)
        libm.so.5 => /lib/libm.so.5 (0x800cab000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x800dc5000)
        libthr.so.3 => /lib/libthr.so.3 (0x800ed2000)
        libc.so.7 => /lib/libc.so.7 (0x800fe8000)
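
The two ldd listings above can be compared mechanically. The following is only a minimal sketch, not part of the original report: the function names (parse_ldd, compare_ldd) are made up for illustration, and the sample data is copied from the listings above. It groups libraries by base name and reports which ones exist on only one side.

```python
import re

# Matches lines such as "libc.so.7 => /lib/libc.so.7 (0x800fe8000)".
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")

# Sample data copied from the two ldd listings in this report.
OLD_LDD = """/data/fbsd6/usr/local/libexec/ndb_mgmd:
        libreadline.so.6 => /lib/libreadline.so.6 (0x8006f9000)
        libncurses.so.6 => /lib/libncurses.so.6 (0x800837000)
        libcrypt.so.3 => /lib/libcrypt.so.3 (0x800994000)
        libstdc++.so.5 => /usr/lib/libstdc++.so.5 (0x800aad000)
        libm.so.4 => /lib/libm.so.4 (0x800ca4000)
        libpthread.so.2 => /lib/libpthread.so.2 (0x800dc0000)
        libc.so.6 => /lib/libc.so.6 (0x800eeb000)
"""

NEW_LDD = """/usr/local/libexec/ndb_mgmd:
        libreadline.so.7 => /lib/libreadline.so.7 (0x8006fa000)
        libncurses.so.7 => /lib/libncurses.so.7 (0x800837000)
        libcrypt.so.4 => /lib/libcrypt.so.4 (0x800993000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x800aac000)
        libm.so.5 => /lib/libm.so.5 (0x800cab000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x800dc5000)
        libthr.so.3 => /lib/libthr.so.3 (0x800ed2000)
        libc.so.7 => /lib/libc.so.7 (0x800fe8000)
"""


def parse_ldd(output):
    """Map each library's base name (e.g. 'libc') to its soname (e.g. 'libc.so.7')."""
    libs = {}
    for line in output.splitlines():
        match = LDD_LINE.match(line)
        if match:  # the "binary:" header line does not match and is skipped
            soname = match.group(1)
            libs[soname.split(".so")[0]] = soname
    return libs


def compare_ldd(old_output, new_output):
    """Report libraries present on one side only, and version bumps on both."""
    old, new = parse_ldd(old_output), parse_ldd(new_output)
    return {
        "only_old": sorted(old[k] for k in old.keys() - new.keys()),
        "only_new": sorted(new[k] for k in new.keys() - old.keys()),
        "changed": sorted(
            (old[k], new[k]) for k in old.keys() & new.keys() if old[k] != new[k]
        ),
    }
```

Calling compare_ldd(OLD_LDD, NEW_LDD) singles out the threading library as the one structural difference besides libgcc_s: the 6.x binary links libpthread.so.2, while the 7.x binary links libthr.so.3. The remaining libraries differ only in version number.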

How to repeat:
Try to use a natively compiled ndb_mgmd on a FreeBSD 7 OS.

OS: FreeBSD/amd64 7.0-BETA3

The differences between the old (6.x) and the new (7.x) OS are significant (new malloc, new default threading library, new gcc, etc.), but I haven't yet found the exact cause of this.
[4 Dec 2007 0:22] Marek Biela
I have a similar problem when attempting to run a 3-node MySQL cluster with MySQL 5.1.22-rc on 64-bit FreeBSD 7.0-BETA3-p1 amd64.

I compiled MySQL with this set of flags:
BUILD_OPTIMIZED=yes WITH_NDB=yes WITH_OPENSSL=yes WITH_CHARSET=utf8

The management node initially starts fine. I am able to start the first NDB node and see it connect to the management node (via ndb_mgm run from either the management node or the NDB node). However, once the second NDB node starts (apparently successfully), I lose the ability to use the ndb_mgm tool.
I get a timeout error from the ndb_mgm CLI:
 
"Unable to get status from management server
60: error
Time out talking to management server"

I also find the "setEventReportingLevelImpl: failed 2!" error in the management server's ndb_1_out.log. The other log, ndb_1_cluster.log, contains no errors, warnings or alerts, but I can certainly submit it if needed.

I have no connectivity problems between the servers, and I am able to telnet to the default port 1186 on the management server.
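
The telnet check above can also be scripted. This is only a rough sketch, not something from the original report: the function name can_connect and the host name in the usage comment are hypothetical. Note that it only tests whether a TCP connection can be opened; as this report shows, the port can accept connections while the management protocol itself still times out.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Hypothetical usage against the default management port:
#     can_connect("mgm-host.example", 1186)
```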

The MySQL node fails to start with the "ndbcluster" option, stating:
"Configuration error: Error : Timeout talking to management node
Plugin 'ndbcluster' init funtion return error
Plugin 'ndbcluster' registration as a STORAGE ENGINE failed"
and I see no "ndbcluster" entry when issuing the MySQL "show engines;" query.

MySQL Cluster worked well after I downgraded MySQL to 5.0.45_1. I used the same OS version and cluster configuration files. The only difference was the MySQL compilation flags, which were "BUILD_OPTIMIZED=yes WITH_NDB=yes".

Regards,
Marek Biela
[13 Dec 2007 14:49] Hartmut Holzgraefe
Verified on FreeBSD 7.
Maybe another duplicate of http://bugs.mysql.com/bug.php?id=31761
due to GCC 4.2.x optimizations gone wild on the server code ...
Testing with different compile flags now ...
[14 Dec 2007 12:13] Attila Nagy
No, it's not a duplicate. I've verified that neither setting different compiler flags nor switching back to gcc 3.4 helps.
BTW, I've found the cause: FreeBSD 7 has a new default threading library (called libthr, while the previous one was libkse).

If I compile libkse (it doesn't get built automatically) and map libthr to it for ndb_mgmd, everything works fine.

So the following entry in /etc/libmap.conf helps:
[/usr/local/libexec/ndb_mgmd]
libthr.so.3    libkse.so.3

Of course this is not a real fix, just a silly workaround.
[19 Feb 2009 20:02] Hartmut Holzgraefe
Latest MySQL Cluster 6.3.22 works just fine on FreeBSD 7.0 with libthr.so.3