Bug #25280 temporary bug in MySQL 5.1.11 cluster
Submitted: 26 Dec 2006 11:42
Modified: 24 Apr 2007 3:09
Reporter: Mitul Savani
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 5.1.14, 5.1.11-0
OS: Linux (Linux AS release 4 (Nahant))
Assigned to:
CPU Architecture: Any
Tags: NDB Cluster

[26 Dec 2006 11:42] Mitul Savani
Description:
Hi,

I have a cluster with the following configuration:

1) API node 512MB RAM and P4 cpu (IP: 172.18.1.138)
2) MGM node 512MB RAM and P4 cpu (IP: 172.18.1.139)
3) Data node 2GB RAM and P4 cpu (IP: 172.18.1.140)
4) Data node  2GB RAM and P4 cpu (IP: 172.18.1.141)

The configuration file on each server is as follows:

===============================
[MGM Node]# cat config.ini
================================
[NDBD DEFAULT]
NoOfReplicas= 2
RedoBuffer=64M
TimeBetweenLocalCheckpoints=6
NoOfFragmentLogFiles=32

[MYSQLD DEFAULT]

# Management Server
[NDB_MGMD]
Id=1
HostName= 172.18.1.139

[NDBD]
Id=3
HostName=172.18.1.140
DataDir= /var/lib/mysql-cluster
#DataMemory = 1400M
#IndexMemory = 200M
MaxNoOfConcurrentTransactions = 500
MaxNoOfConcurrentOperations = 250000
MaxNoOfOrderedIndexes = 57000
MaxNoOfTables = 9000
MaxNoOfAttributes = 25000

[NDBD]
Id=4
HostName=172.18.1.141
DataDir= /var/lib/mysql-cluster
#DataMemory = 1400M
#IndexMemory = 200M
MaxNoOfConcurrentTransactions = 500
MaxNoOfConcurrentOperations = 250000
MaxNoOfOrderedIndexes = 57000
MaxNoOfTables = 9000
MaxNoOfAttributes = 25000

# TCP/IP Connections
[TCP]
NodeId1=3
NodeId2=4
HostName1=172.18.1.140
HostName2=172.18.1.141

# SQL Node
[MYSQLD]
Id=2
HostName=172.18.1.138

==========================
Data node my.cnf file:
=========================
[mysqld]
ndbcluster
# IP address of the cluster management node
ndb-connectstring=172.18.1.139
[mysql_cluster]
# IP address of the cluster management node
ndb-connectstring=172.18.1.139

=======================
SQL API node config
=======================
[MYSQLD]                        
ndbcluster                      # run NDB engine
ndb-connectstring=172.18.1.139  # location of MGM node

# Options for ndbd process:
[MYSQL_CLUSTER]                 
ndb-connectstring=172.18.1.139  # location of MGM node

===========================

Now, when I start the cluster (management node), I receive the following error:

[root@rhel4mysql2 mysql-cluster]# ndb_mgmd -d
[root@rhel4mysql2 mysql-cluster]# ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @172.18.1.140  (Version: 5.1.11, starting, Nodegroup: 0)
id=4 (not connected, accepting connect from 172.18.1.141)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @172.18.1.139  (Version: 5.1.11)

[mysqld(API)]   1 node(s)
id=2 (not connected, accepting connect from 172.18.1.138)

ndb_mgm> Node 4: Forced node shutdown completed. Occured during startphase 1. Initiated by signal 0. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error
Node 3: Forced node shutdown completed. Occured during startphase 1. Initiated by signal 0. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'.

Can anyone suggest the correct parameters, or is this a bug?

How to repeat:
Configure a cluster as per the configuration above.

Suggested fix:
Need help
[26 Dec 2006 11:55] Valeriy Kravchuk
Thank you for the problem report. Please try a newer version, 5.1.14, and let us know the results. If you see a similar problem, please send the error logs from the failed nodes.
[26 Dec 2006 11:57] Mitul Savani
OK, fine, I will test it with the latest version.

Do you see any problem with the configuration above?

Thanks,
[26 Dec 2006 12:11] Valeriy Kravchuk
With those lines in place:

#DataMemory = 1400M
#IndexMemory = 200M

you surely had an out-of-memory problem.

I do not know how much free memory you have on your machines before starting the cluster... So please do as I asked in my previous comment, and let us know the results.
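
As a rough illustration of the memory concern, the sketch below adds up the two commented-out values and compares them with the 2GB of RAM on each data node. It only does the arithmetic; the helper names are hypothetical, and it does not account for RedoBuffer, the operating system, or other ndbd allocations, which would reduce the remaining figure further.

```python
def parse_mb(value: str) -> int:
    """Convert a config.ini size such as '1400M' or '2G' to megabytes."""
    value = value.strip().upper()
    if value.endswith("G"):
        return int(value[:-1]) * 1024
    if value.endswith("M"):
        return int(value[:-1])
    raise ValueError(f"unsupported size: {value!r}")


def memory_budget(data_memory: str, index_memory: str, ram_mb: int):
    """Return (requested_mb, remaining_mb) for one data node.

    remaining_mb is what is left for everything else on the machine;
    a small or negative value suggests the settings are too large.
    """
    requested = parse_mb(data_memory) + parse_mb(index_memory)
    return requested, ram_mb - requested


# Values from the commented-out lines, on a data node with 2GB of RAM.
requested, remaining = memory_budget("1400M", "200M", ram_mb=2048)
print(f"requested={requested}MB remaining={remaining}MB")
```

Comparing the remaining figure with the output of `free -m` on the data node before starting ndbd would show whether the requested memory is actually available.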
[27 Dec 2006 8:19] Mitul Savani
Hi,

I have upgraded the cluster version, and I am receiving the same type of error:

=====================
ndb_mgm> show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3 (not connected, accepting connect from 172.18.1.140)
id=4    @172.18.1.141  (Version: 5.0.27, starting, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @172.18.1.139  (Version: 5.1.14)

[mysqld(API)]   1 node(s)
id=2 (not connected, accepting connect from 172.18.1.138)

ndb_mgm> Node 4: Forced node shutdown completed. Occured during startphase 1. Initiated by signal 0. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'. - Unknown error code: Unknown result: Unknown error code
Node 3: Forced node shutdown completed. Occured during startphase 1. Initiated by signal 0. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'. - Unknown error code: Unknown result: Unknown error code
=====================
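
Note that in the `show` output above, node 4 reports version 5.0.27 while the management server reports 5.1.14. Whether this is related to the failure is not established here, but a mismatch like this is easy to check for. The following is a minimal sketch that extracts the reported versions from pasted `show` output; the regular expression assumes the line format seen in this report, and the function names are hypothetical.

```python
import re

# Matches connected-node lines such as:
#   id=4    @172.18.1.141  (Version: 5.0.27, starting, Nodegroup: 0)
NODE_LINE = re.compile(r"id=(\d+)\s+@(\S+)\s+\(Version:\s*([\d.]+)")


def node_versions(show_output: str) -> dict[int, str]:
    """Map node id -> reported version for every connected node."""
    return {int(m.group(1)): m.group(3)
            for m in NODE_LINE.finditer(show_output)}


def mismatched(versions: dict[int, str]) -> bool:
    """True if the connected nodes report more than one version."""
    return len(set(versions.values())) > 1


# Lines copied from the output above.
sample = """\
id=3 (not connected, accepting connect from 172.18.1.140)
id=4    @172.18.1.141  (Version: 5.0.27, starting, Nodegroup: 0)
id=1    @172.18.1.139  (Version: 5.1.14)
"""
versions = node_versions(sample)
print(versions)
print("mismatch" if mismatched(versions) else "consistent")
```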

Any help would be appreciated!

Thanks,

Mitul Savani
[27 Dec 2006 9:03] Mitul Savani
Hi,

For your convenience, here is the error log from the failed node:

[root@rhel4mysql3 mysql-cluster]# cat ndb_3_error.log 
Current byte-offset of file-pointer is: 568                       

Time: Wednesday 27 December 2006 - 13:41:12
Status: Temporary error, restart node
Message: Another node failed during system restart, please investigate error(s) on other node(s) (Restart error)
Error: 2308
Error data: Node 4 disconnected
Error object: QMGR (Line: 2554) 0x0000000e
Program: ndbd
Pid: 2788
Trace: /var/lib/mysql-cluster/ndb_3_trace.log.1
Version: Version 5.1.14 (beta)
***EOM***
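
When comparing error logs from several nodes, it can help to turn each entry into a record. The sketch below assumes the layout seen in the excerpt above: a run of `Key: value` lines closed by `***EOM***`. The list of field names is taken from this excerpt only, and the function name is hypothetical.

```python
# Field names as they appear in the ndb_3_error.log excerpt above.
FIELDS = {"Time", "Status", "Message", "Error", "Error data",
          "Error object", "Program", "Pid", "Trace", "Version"}


def parse_error_log(text: str) -> list[dict[str, str]]:
    """Split an error log dump into one dict per entry.

    Lines that are not 'Key: value' with a known key (for example the
    byte-offset header) are ignored.
    """
    entries, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if line == "***EOM***":
            if current:
                entries.append(current)
            current = {}
        elif ": " in line:
            key, value = line.split(": ", 1)
            if key in FIELDS:
                current[key] = value
    return entries


# A shortened copy of the entry above.
sample = """\
Current byte-offset of file-pointer is: 568

Time: Wednesday 27 December 2006 - 13:41:12
Status: Temporary error, restart node
Error: 2308
Error data: Node 4 disconnected
Error object: QMGR (Line: 2554) 0x0000000e
***EOM***
"""
for entry in parse_error_log(sample):
    print(entry["Error"], "-", entry["Error data"])
```

Since error 2308 on node 3 only says that another node failed, the same parsing applied to the error log on node 4 would show the entry that actually matters here.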
                                                    

Thanks,

Mitul Savani
[28 Jan 2007 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[24 Mar 2007 3:09] Valeriy Kravchuk
Please try to repeat this with a newer version, 5.1.16, and let us know the results.
[24 Apr 2007 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".