MySQL Bugs: #20952: Ndb node shuts down after upgrading from 4.1 to 5.0

Bug #20952	Ndb node shuts down after upgrading from 4.1 to 5.0
Submitted:	10 Jul 2006 21:26	Modified:	22 Aug 2006 8:23
Reporter:	Nikolay Grishakin
Status:	Not a Bug
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:		OS:	Linux (Linux)
Assigned to:	Stewart Smith	CPU Architecture:	Any
Tags:	Test blocking

Description:
During an off-line upgrade of a cluster from 4.1 to 5.0, I received:

2006-07-10 22:42:50 [MgmSrvr] ALERT    -- Node 3: Forced node shutdown completed. Occured during startphase 4. Initiated by signal 0. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error
2006-07-10 22:42:50 [MgmSrvr] INFO     -- Node 1: Node 2 Connected
2006-07-10 22:42:50 [MgmSrvr] ALERT    -- Node 2: Forced node shutdown completed. Occured during startphase 4. Initiated by signal 0. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error

On the cluster node, ndb_3_error.log contains the following:

[ndbdev@ndb13 ngrishakin]$ cat /space/run/ndb_3_error.log
Current byte-offset of file-pointer is: 568
Time: Monday 10 July 2006 - 22:42:50
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbaccMain.cpp
Error object: DBACC (Line: 7810) 0x0000000a
Program: /home/ndbdev/ngrishakin/builds/libexec/ndbd
Pid: 18437
Trace: /space/run/ndb_3_trace.log.1
Version: Version 5.0.22
***EOM***

       ?007372 007372 007372 007372 007372 007372 007372 007372

--------------- Signal ----------------
r.bn: 248 "DBACC", r.proc: 3, r.sigId: 1811637 gsn: 262 "FSREADCONF" prio: 1
s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 1811635 length: 1 trace: 0 #sec: 0 fragInf: 0
 UserPointer: 1
--------------- Signal ----------------
r.bn: 252 "QMGR", r.proc: 3, r.sigId: 1811636 gsn: 164 "CONTINUEB" prio: 0
s.bn: 252 "QMGR", s.proc: 3, s.sigId: 1811634 length: 1 trace: 0 #sec: 0 fragInf: 0
 H'00000004
--------------- Signal ----------------
r.bn: 253 "NDBFS", r.proc: 3, r.sigId: 1811635 gsn: 164 "CONTINUEB" prio: 0
s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 1811633 length: 1 trace: 0 #sec: 0 fragInf: 0
 Scanning the memory channel every 10ms
--------------- Signal ----------------
r.bn: 253 "NDBFS", r.proc: 3, r.sigId: 1811632 gsn: 164 "CONTINUEB" prio: 1
s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 1811628 length: 1 trace: 0 #sec: 0 fragInf: 0
 Scanning the memory channel again with no delay
--------------- Signal ----------------
r.bn: 253 "NDBFS", r.proc: 3, r.sigId: 1811631 gsn: 264 "FSREADREQ" prio: 0
s.bn: 248 "DBACC", s.proc: 3, s.sigId: 1811630 length: 8 trace: 0 #sec: 0 fragInf: 0
 UserPointer: 1
 FilePointer: 65
 UserReference: H'00f80003 Operation flag: H'00000000 (No sync, Format=List of pairs)
 varIndex: 1
 numberOfPages: 1
 pageData:  H'00000010, H'00000000

How to repeat:
Installed the 4.1.21 build on each cluster node. ndb18 is the management node; ndb13 and ndb14 are the ndb nodes.
The root directory for the test on each machine is ndbxx:/home/ndbdev/ngrishakin/
The builds are under ndb18:/home/ndbdev/ngrishakin/ud_test/....
To install 4.1.21, copy .../ud_test/4.1.21/builds/ to each machine.
On ndb18:
 [ndbdev@ndb18 ngrishakin]$ . ./shortcut.sh
 [ndbdev@ndb18 ngrishakin]$ ndb_mgmd -f 1nd_50.ini
 [ndbdev@ndb18 ngrishakin]$ mysql_server start

On ndb13&14:
 [ndbdev@ndb1X ngrishakin]$ . ./shortcut.sh 
 [ndbdev@ndb1X ngrishakin]$ ndbd -c ndb18:14000 --initial

On ndb18, load the test data:
 [ndbdev@ndb18 ngrishakin]$ perl load_tpcb_4.pl -so -e ndbcluster
Shut down the cluster and remove the builds directories from each machine.

Installed 5.0.22 the same way from ndb18:/home/ndbdev/ngrishakin/ud_test/5.0.22/builds 

Started: 
[ndbdev@ndb18 ngrishakin]$ ndb_mgmd -f 1nd_50.ini
[ndbdev@ndb18 ngrishakin]$ mysql_server start

[ndbdev@ndb13 ngrishakin]$ ndbd -c ndb18:14000

[ndbdev@ndb14 ngrishakin]$ ndbd -c ndb18:14000

See the error in /space/run/ndb_1_cluster.log

An off-line upgrade from 5.1.11 to 5.1.12 works fine.

AFAIK we only support backup and restore to upgrade from 4.1 to 5.0.

Was a backup and restore done, or was the same file system reused?

If the same file system was reused, we may want a better error message.

This is an offline upgrade from 4.1 to 5.0 and from 5.0 to 5.1 with an existing database that was created with the previous version. No backup was taken.

When upgrading from 4.1 to 5.0, one must do backup / restart initial / restore.
If the "restart initial" step is omitted, the resulting failure is not a bug.

Furthermore, upgrading with a "kept" file system is not generally supported
  between _any_ versions (though it might work between some).
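
For illustration, the backup / restart initial / restore sequence might look roughly like the dry-run sketch below. It only prints the commands and runs nothing against a cluster. The connect string ndb18:14000 comes from the report; the backup id (1), the backup path under /space/run, and the data node ids (2 and 3) are assumptions made for the example and must be replaced with the real values. The function name upgrade_plan is hypothetical.

```shell
#!/bin/sh
# Dry-run sketch of a 4.1 -> 5.0 upgrade via backup / restart initial / restore.
# It prints the commands in order; it does not execute them.
CONNECT="ndb18:14000"                               # connect string from the report
BACKUP_ID=1                                         # assumption: id of the backup taken
BACKUP_PATH="/space/run/BACKUP/BACKUP-$BACKUP_ID"   # assumption: backup location
DATA_NODE_IDS="2 3"                                 # assumption: data node ids

upgrade_plan() {
    # 1. While the 4.1 cluster is still running, take a backup.
    echo "ndb_mgm -c $CONNECT -e 'START BACKUP'"
    # 2. Shut the cluster down, then install the 5.0 binaries and
    #    start ndb_mgmd again (not printed here).
    echo "ndb_mgm -c $CONNECT -e 'SHUTDOWN'"
    # 3. Start every 5.0 data node with --initial so the old 4.1
    #    file system is not reused.
    for id in $DATA_NODE_IDS; do
        echo "ndbd -c $CONNECT --initial    # on the host of data node $id"
    done
    # 4. Restore: metadata (-m) once, with the first node's files,
    #    then data (-r) from every node's backup files.
    first=yes
    for id in $DATA_NODE_IDS; do
        if [ "$first" = yes ]; then meta="-m "; first=no; else meta=""; fi
        echo "ndb_restore -c $CONNECT -n $id -b $BACKUP_ID ${meta}-r --backup_path=$BACKUP_PATH"
    done
}

upgrade_plan
```

Printing the plan first makes it easy to check that the data nodes are started with --initial before any restore is attempted, which is the step that was skipped in this report.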