Bug #24193 MySQL server 5.0.27 won't start properly with ndbcluster configured
Submitted: 10 Nov 2006 14:47 Modified: 20 May 2007 8:32
Reporter: Alexander List
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 5.0.27-max
OS: Linux (CentOS 4.4 (i386))
Assigned to:
CPU Architecture: Any
Tags: cluster

[10 Nov 2006 14:47] Alexander List
When ndbcluster is configured in /etc/my.cnf, the MySQL-max server only partially starts.

The processes are there (mysqld_safe and mysqld-max), but it takes ~3 minutes for the PID file and the socket to appear. This prevents the initscript from starting and shutting down MySQL cleanly.

After starting up, ndb is not available.

When I comment out all the ndb* variables from /etc/my.cnf, the server starts up without trouble.
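
To see which lines are affected before commenting them out, something like the following could be used. This is only a sketch: the helper name list_ndb_options is made up for illustration, and it assumes every NDB-related setting starts with "ndb" at the beginning of its line, as the "ndb*" wording above suggests.

```shell
#!/bin/sh
# Hypothetical helper: print the active (not commented-out) lines of a
# config file that begin with "ndb", prefixed with their line numbers.
# Returns non-zero if no such line is found (grep's usual behaviour).
list_ndb_options() {
    grep -n -i '^[[:space:]]*ndb' "$1"
}

# Example (commented out so that sourcing this file has no side effects):
# list_ndb_options /etc/my.cnf
```

Lines that start with "#" are not matched, so settings that are already commented out do not show up.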

How to repeat:
* install CentOS 4.4
* yum install perl-DBI
* download MySQL generic RPMs 
* install the RPMs
rpm -i MySQL-client-5.0.27-0.glibc23.i386.rpm  MySQL-Max-5.0.27-0.glibc23.i386.rpm  MySQL-server-5.0.27-0.glibc23.i386.rpm  MySQL-shared-compat-5.0.27-0.glibc23.i386.rpm

* install an /etc/my.cnf config file with ndbcluster present:

##### CLUSTER #####

* try to start MySQL
# date ; /etc/init.d/mysql start ; date
Fri Nov 10 15:30:56 CET 2006
Starting MySQL...................................          [FAILED]
Fri Nov 10 15:31:31 CET 2006

The processes are there:

# ps aux |grep mysql
root      1011  0.0  0.0  5084 1116 pts/6    S    15:30   0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --pid-file=/var/run/mysqld/mysqld.pid
mysql     1074  0.0  0.0 75968 3024 pts/6    S    15:30   0:00 /usr/sbin/mysqld-max --basedir=/ --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock

# ps 1011
 1011 pts/6    S      0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --pid-file=/var/run/mysqld/mysqld.pid

# ps 1074
 1074 pts/6    S      0:00 /usr/sbin/mysqld-max --basedir=/ --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock

At this time, no connection to mysql is possible:

# mysql -p
Enter password: 
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)

There are no files in /var/run/mysqld (permissions of the directory are OK).

A little more than two minutes later, the socket and the pid file appear:

while true; do date; ls -l /var/run/mysqld; sleep 1; done

Fri Nov 10 15:33:37 CET 2006
total 0
Fri Nov 10 15:33:38 CET 2006
total 0
Fri Nov 10 15:33:39 CET 2006
total 4
-rw-rw----  1 mysql mysql 5 Nov 10 15:33 mysqld.pid
srwxrwxrwx  1 mysql mysql 0 Nov 10 15:33 mysqld.sock
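
The same check can be scripted with a timeout, to measure how long the socket takes to appear instead of watching the listing by hand. This is a minimal sketch, not part of the stock initscript: the function name wait_for_path and the 300-second limit are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: poll once per second until a path exists.
# Prints the number of seconds waited on success; returns 1 if the
# path has still not appeared after the given timeout (in seconds).
wait_for_path() {
    path=$1
    timeout=$2
    elapsed=0
    while [ ! -e "$path" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "$elapsed"
}

# Example (commented out so that sourcing this file does not block):
# wait_for_path /var/run/mysqld/mysqld.sock 300 || echo "socket never appeared"
```

The elapsed time is counted in sleep intervals, so it is only accurate to about a second.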

Now it is possible to connect to MySQL, but the ndbcluster engine is not available.

# mysql -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2 to server version: 5.0.27-max-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> show status like 'ndb%';

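+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Ndb_cluster_node_id         | 0     |
| Ndb_config_from_host        |       |
| Ndb_config_from_port        | 0     |
| Ndb_number_of_storage_nodes | 0     |
+-----------------------------+-------+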
4 rows in set (0.00 sec)

After downgrading to 5.0.22-max, the problem disappears. Network connectivity problems to the cluster mgm/data nodes have been ruled out as the cause.
[12 Nov 2006 14:06] Hartmut Holzgraefe
Can't reproduce on SuSE 10.1 with self-compiled binaries.
Maybe this is related to the actual build and/or OS?
[25 Nov 2006 7:29] Valeriy Kravchuk
Please try running the same version, 5.0.27, on all nodes and report the results.
[27 Nov 2006 9:40] Alexander List
This is a production system. An upgrade to 5.0.27 is currently not possible.
[20 Apr 2007 8:32] Valeriy Kravchuk
Is it possible for you to upgrade to a newer version, 5.0.37, now?
[20 May 2007 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".