Bug #20952 Ndb node shuts down after upgrading from 4.1 to 5.0
Submitted: 10 Jul 2006 21:26 Modified: 22 Aug 2006 8:23
Reporter: Nikolay Grishakin
Status: Not a Bug
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S2 (Serious)
Version: OS:Linux (Linux)
Assigned to: Stewart Smith CPU Architecture:Any
Tags: Test blocking

[10 Jul 2006 21:26] Nikolay Grishakin
Description:
During an off-line upgrade of the cluster from 4.1 to 5.0 I received:

2006-07-10 22:42:50 [MgmSrvr] ALERT    -- Node 3: Forced node shutdown completed. Occured during startphase 4. Initiated by signal 0. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error
2006-07-10 22:42:50 [MgmSrvr] INFO     -- Node 1: Node 2 Connected
2006-07-10 22:42:50 [MgmSrvr] ALERT    -- Node 2: Forced node shutdown completed. Occured during startphase 4. Initiated by signal 0. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error

On cluster node - ndb_3_error.log has the following info:

[ndbdev@ndb13 ngrishakin]$ cat /space/run/ndb_3_error.log
Current byte-offset of file-pointer is: 568
Time: Monday 10 July 2006 - 22:42:50
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbaccMain.cpp
Error object: DBACC (Line: 7810) 0x0000000a
Program: /home/ndbdev/ngrishakin/builds/libexec/ndbd
Pid: 18437
Trace: /space/run/ndb_3_trace.log.1
Version: Version 5.0.22
***EOM***

       ?007372 007372 007372 007372 007372 007372 007372 007372

--------------- Signal ----------------
r.bn: 248 "DBACC", r.proc: 3, r.sigId: 1811637 gsn: 262 "FSREADCONF" prio: 1
s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 1811635 length: 1 trace: 0 #sec: 0 fragInf: 0
 UserPointer: 1
--------------- Signal ----------------
r.bn: 252 "QMGR", r.proc: 3, r.sigId: 1811636 gsn: 164 "CONTINUEB" prio: 0
s.bn: 252 "QMGR", s.proc: 3, s.sigId: 1811634 length: 1 trace: 0 #sec: 0 fragInf: 0
 H'00000004
--------------- Signal ----------------
r.bn: 253 "NDBFS", r.proc: 3, r.sigId: 1811635 gsn: 164 "CONTINUEB" prio: 0
s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 1811633 length: 1 trace: 0 #sec: 0 fragInf: 0
 Scanning the memory channel every 10ms
--------------- Signal ----------------
r.bn: 253 "NDBFS", r.proc: 3, r.sigId: 1811632 gsn: 164 "CONTINUEB" prio: 1
s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 1811628 length: 1 trace: 0 #sec: 0 fragInf: 0
 Scanning the memory channel again with no delay
--------------- Signal ----------------
r.bn: 253 "NDBFS", r.proc: 3, r.sigId: 1811631 gsn: 264 "FSREADREQ" prio: 0
s.bn: 248 "DBACC", s.proc: 3, s.sigId: 1811630 length: 8 trace: 0 #sec: 0 fragInf: 0
 UserPointer: 1
 FilePointer: 65
 UserReference: H'00f80003 Operation flag: H'00000000 (No sync, Format=List of pairs)
 varIndex: 1
 numberOfPages: 1
 pageData:  H'00000010, H'00000000

How to repeat:
Installed the 4.1.21 build on each cluster node. ndb18 is the management node; ndb13 and ndb14 are the ndb nodes.
Root directory for test on each machine is ndbxx:/home/ndbdev/ngrishakin/
Builds are under ndb18:/home/ndbdev/ngrishakin/ud_test/....
To install 4.1.21 copy .../ud_test/4.1.21/builds/ to each machine.
On ndb18:
 [ndbdev@ndb18 ngrishakin]$ . ./shortcut.sh
 [ndbdev@ndb18 ngrishakin]$ ndb_mgmd -f 1nd_50.ini
 [ndbdev@ndb18 ngrishakin]$ mysql_server start

On ndb13&14:
 [ndbdev@ndb1X ngrishakin]$ . ./shortcut.sh 
 [ndbdev@ndb1X ngrishakin]$ ndbd -c ndb18:14000 --initial

On ndb18: 
Run [ndbdev@ndb18 ngrishakin]$ perl load_tpcb_4.pl -so -e ndbcluster 
Shut down the cluster and remove the builds directories from each machine.

Installed 5.0.22 the same way from ndb18:/home/ndbdev/ngrishakin/ud_test/5.0.22/builds 

Started: 
[ndbdev@ndb18 ngrishakin]$ ndb_mgmd -f 1nd_50.ini
[ndbdev@ndb18 ngrishakin]$ mysql_server start

[ndbdev@ndb13 ngrishakin]$ ndbd -c ndb18:14000

[ndbdev@ndb14 ngrishakin]$ ndbd -c ndb18:14000

See the error in: /space/run/ndb_1_cluster.log
[10 Jul 2006 23:25] Nikolay Grishakin
Off-line upgrade works fine from 5.1.11 to 5.1.12
[18 Jul 2006 5:17] Stewart Smith
AFAIK we only support backup and restore to upgrade from 4.1 to 5.0.

Was a backup and restore done, or was the same file system used?

If the same file system was used, we may want a better error message.
[18 Jul 2006 5:22] Nikolay Grishakin
This is an offline upgrade from 4.1 to 5.0 and from 5.0 to 5.1 with an existing database that was created with the previous version. No backup was taken.
[22 Aug 2006 8:23] Jonas Oreland
When upgrading from 4.1 to 5.0 one must do backup / restart initial / restore.
If the "restart initial" step is omitted, this is not a bug.

Furthermore, upgrading with a "kept" file system is not generally supported
  between _any_ versions (though it might work between some).
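
For anyone landing here with the same error, the backup / restart initial / restore sequence from the comment above might look roughly like the sketch below. It only prints the commands, it does not run them. The connect string, config file name and node IDs 2 and 3 are taken from this report. The backup ID and backup directory are assumptions (the directory assumes DataDir is /space/run), and the exact ndb_mgm and ndb_restore options should be checked against the manual for the versions actually installed.

```shell
#!/bin/sh
# Dry-run sketch: prints the upgrade steps, executes nothing.
CONNECT="ndb18:14000"                              # management server from this report
BACKUP_ID=1                                        # assumption: ID reported by START BACKUP
BACKUP_DIR="/space/run/BACKUP/BACKUP-$BACKUP_ID"   # assumption: DataDir is /space/run

upgrade_plan() {
    cat <<EOF
# 1. On the running 4.1 cluster: take a backup, then shut the cluster down
ndb_mgm -c $CONNECT -e "START BACKUP"
ndb_mgm -c $CONNECT -e "SHUTDOWN"
# 2. Install 5.0, start the management server, start every data node with --initial
ndb_mgmd -f 1nd_50.ini
ndbd -c $CONNECT --initial
# 3. Restore the backup: metadata (-m) once, data (-r) for each data node
ndb_restore -c $CONNECT -n 2 -b $BACKUP_ID -m -r --backup_path=$BACKUP_DIR
ndb_restore -c $CONNECT -n 3 -b $BACKUP_ID -r --backup_path=$BACKUP_DIR
EOF
}

upgrade_plan
```

The important difference from the steps in "How to repeat" is the --initial flag on ndbd after installing 5.0.22, which discards the 4.1 file system instead of trying to read it.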