Bug #51932 ndbd keeps crashing
Submitted: 10 Mar 2010 20:32
Modified: 25 Mar 2010 7:02
Reporter: Rob Tousain
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S1 (Critical)
Version: mysql-5.1-telco-6.3
OS: Linux (RedHat)
Assigned to: Pekka Nousiainen
CPU Architecture: Any
Tags: mysql-5.1.30 ndb-6.3.20-GA

[10 Mar 2010 20:32] Rob Tousain
Description:
The ndbd process crashed, and it crashes again every time I try to restart it.

Time: Wednesday 10 March 2010 - 20:39:09
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming
 error or missing error message, please report a bug)
Error: 2341
Error data: pgman.cpp
Error object: PGMAN (Line: 1558) 0x0000000a
Program: /usr/sbin/ndbd
Pid: 3426
Trace: /data/mysql/ndb_3_trace.log.25
Version: mysql-5.1.30 ndb-6.3.20-GA
***EOM***

How to repeat:
ndbd

ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @172.19.4.37  (mysql-5.1.30 ndb-6.3.20, Nodegroup: 0, Master)
id=3    @172.19.4.38  (mysql-5.1.30 ndb-6.3.20, starting, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @172.19.4.39  (mysql-5.1.30 ndb-6.3.20)

[mysqld(API)]   6 node(s)
id=4    @172.19.4.37  (mysql-5.1.30 ndb-6.3.20)
id=5    @172.19.4.38  (mysql-5.1.30 ndb-6.3.20)
id=6 (not connected, accepting connect from any host)
id=7 (not connected, accepting connect from any host)
id=8 (not connected, accepting connect from any host)
id=9 (not connected, accepting connect from any host)

ndb_mgm> Node 3: Forced node shutdown completed. Occured during startphase 5. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

ndb_mgm>
[10 Mar 2010 20:44] Rob Tousain
logfile

Attachment: exportMYSQL.zip (application/x-zip-compressed, text), 284.55 KiB.

[10 Mar 2010 21:43] Pekka Nousiainen
Context:

Pgman::fsreadreq(Signal* signal, Ptr<Page_entry> ptr)
{
  File_map::ConstDataBufferIterator it;
  bool ret = m_file_map.first(it) && m_file_map.next(it, ptr.p->m_file_no);
> ndbrequire(ret);

I think this is a known open issue, though it is seen very rarely.

Btw ndb-6.3.20 is quite old.  There have been other
fixes since then, including PGMAN fixes.
[11 Mar 2010 8:34] Rob Tousain
Hi Pekka,

Any suggestions for getting node 3 running again?
Node 2 is still OK, but I cannot start node 3; it keeps crashing.

Rob
[11 Mar 2010 10:49] Rob Tousain
Hello,

I found that the latest ndbd version is now 6.3.27A. Is this correct?

Can you tell me how to upgrade my 6.3.20 ndbd to 6.3.27A?

Do I have to upgrade all the packages, or is the storage package sufficient?

Currently installed on the data nodes:
MySQL-Cluster-gpl-server-6.3.20-0.rhel5.i386.rpm
MySQL-Cluster-gpl-client-6.3.20-0.rhel5.i386.rpm
MySQL-Cluster-gpl-storage-6.3.20-0.rhel5.i386.rpm

Currently installed on the management node:
MySQL-Cluster-gpl-management-6.3.20-0.rhel5.rpm
MySQL-Cluster-gpl-tools-6.3.20-0.rhel5.i386.rpm

Rob
[11 Mar 2010 13:03] Pekka Nousiainen
About getting node 3 up:

Did you try ndbd --initial?  This clears the node's local
data and copies everything from the other node, which
could take a long time.

As always, make a careful backup first.
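
The steps above can be sketched as follows. This is only a rough outline under some assumptions: that the backup is a native cluster backup taken with START BACKUP from the management client (a copy of the node's data directory would be another option), that ndb_mgm and ndbd find the management server through their default connect string, and that the failing node is id 3 as in this report. Because the commands run on different hosts, the hypothetical helper recovery_plan only prints them as a checklist instead of executing them.

```shell
# Print (do not execute) the recovery steps for one data node.
# recovery_plan is a hypothetical helper name, not part of MySQL Cluster.
# Usage: recovery_plan NODE_ID
recovery_plan() {
  node_id="$1"
  cat <<EOF
# 1. On the management host: back up while the surviving node still has the data.
ndb_mgm -e "START BACKUP"
# 2. On the host of data node $node_id: start with an empty node file system,
#    so the node copies its data from the other node in its node group.
ndbd --initial
# 3. On the management host: repeat until node $node_id reports "started".
ndb_mgm -e "$node_id STATUS"
EOF
}

recovery_plan 3
```

Running each printed command by hand on the right host keeps the backup step from being skipped before the node's local data is wiped.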

I don't know much about packages or what upgrade
paths exist for you.  It should be in the online ref manual.
Our docs guy is offline today.  Maybe some support
engineer can reply here?
[11 Mar 2010 14:52] Rob Tousain
Pekka,

We are running again after the ndbd --initial!

Thanks for your advice. Now we can go to our TEST environment to see how to upgrade to 6.3.27A.

Rob
[12 Mar 2010 6:54] Pekka Nousiainen
The latest seems to be 6.3.32. Maybe there are no RPMs for it?

Looking just at pgman (which is part of Disk Data),
bug#47832 and bug#48910 were fixed after 6.3.27A.
[23 Mar 2010 14:34] Jørgen Austvik
Did the upgrade solve your problem?
[23 Mar 2010 14:40] Rob Tousain
Hello,

The --initial restart did solve the problem.

We want to upgrade the ndbd version.
I have one question:
Is it possible to upgrade only the ndbd package, MySQL-Cluster-gpl-storage-6.3.20,
and leave the other packages as they are?

We are now running:
MySQL-Cluster-gpl-storage-6.3.20-0.rhel5.i386.rpm
MySQL-Cluster-gpl-server-6.3.20-0.rhel5.i386.rpm
MySQL-Cluster-gpl-management-6.3.20-0.rhel5.i386.rpm
MySQL-Cluster-gpl-client-6.3.20-0.rhel5.i386.rpm

Rob
[25 Mar 2010 6:30] Pekka Nousiainen
I'm pretty sure all of them have to be upgraded to the same
version (that is, ndb kernel, ndb api, ndb mgm, mysql).

The ref manual should have more info.
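
One way to check the "same version" condition on a host is to list the distinct versions among the installed MySQL-Cluster-gpl-* packages. The sketch below assumes the package naming shown in this report (MySQL-Cluster-gpl-<component>-<version>-<release>...); cluster_versions is a hypothetical helper name, and it only compares names, it does not say whether a given upgrade path is supported.

```shell
# Read package names on stdin and print each distinct MySQL Cluster version once.
# Lines that do not look like MySQL-Cluster-gpl-<component>-<version>-... are ignored.
cluster_versions() {
  sed -n 's/^MySQL-Cluster-gpl-[a-z]*-\([0-9][0-9.]*[0-9A-Za-z]*\)-.*/\1/p' | sort -u
}

# Example input: the package list quoted in this report.
printf '%s\n' \
  MySQL-Cluster-gpl-storage-6.3.20-0.rhel5.i386.rpm \
  MySQL-Cluster-gpl-server-6.3.20-0.rhel5.i386.rpm \
  MySQL-Cluster-gpl-management-6.3.20-0.rhel5.i386.rpm \
  MySQL-Cluster-gpl-client-6.3.20-0.rhel5.i386.rpm |
  cluster_versions
```

On a real host the input would come from `rpm -qa | cluster_versions`; more than one line of output means the packages are at mixed versions.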
[25 Mar 2010 6:58] Rob Tousain
Hi Pekka,

Thanks for the answers.
You can close this SR now.

Best regards,

Rob Tousain