Bug #47740 Cluster migration 6.3.x->7.0.x: Restarting old SQL nodes crashes old data nodes
Submitted: 30 Sep 2009 11:47
Modified: 2 Oct 2009 11:57
Reporter: Tino Rachui
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 6.3.26
OS: Any
Assigned to: Jonas Oreland
CPU Architecture: Any
Tags: cluster online upgrade

[30 Sep 2009 11:47] Tino Rachui
Description:
I have a cluster with two data nodes. I am trying to upgrade it online from version 6.3.26 to 7.0.8. When I restart an old SQL node after the management node and the first data node have already been upgraded to the new version, the old data node crashes.
This is the ndb_mgm output before I try to restart the old SQL node:
ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @10.16.46.169  (mysql-5.1.35 ndb-7.0.7, Nodegroup: 0)
id=3    @10.16.46.198  (mysql-5.1.35 ndb-6.3.26, Nodegroup: 0, Master)
[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.16.46.178  (mysql-5.1.35 ndb-7.0.7)
[mysqld(API)]   20 node(s)
id=4 (not connected, accepting connect from any host)
id=5    @10.16.46.198  (mysql-5.1.35 ndb-6.3.26)

Once I try to start an old SQL node I get:
Node 3: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
Node 3: Forced node shutdown completed. Occured during startphase 5. Caused by error 2301: 'Assertion(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

Afterwards the ndb_mgm output looks like this:

ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @10.16.46.169  (mysql-5.1.35 ndb-7.0.7, Nodegroup: 0, Master)
id=3 (not connected, accepting connect from argus-d46-198-ham)
[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.16.46.178  (mysql-5.1.35 ndb-7.0.7)
[mysqld(API)]   20 node(s)
id=4    @10.16.46.169  (mysql-5.1.35 ndb-6.3.26)
id=5    @10.16.46.198  (mysql-5.1.35 ndb-6.3.26)

The old SQL node obviously started fine.
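
The mixed-version state visible in the first 'show' output can also be checked mechanically. The following is a minimal sketch, not part of the original report: it assumes the connected-node line format shown above, and the names NODE_LINE, node_versions and SHOW_BEFORE are made up for illustration. It parses the output and lists the nodes still running a pre-7.0 NDB version.

```python
import re

# Matches connected-node lines as they appear in the report, e.g.:
#   id=3    @10.16.46.198  (mysql-5.1.35 ndb-6.3.26, Nodegroup: 0, Master)
# Lines for nodes that are not connected carry no version and are skipped.
NODE_LINE = re.compile(
    r"^id=(\d+)\s+@(\S+)\s+\(mysql-[\d.]+ ndb-(\d+)\.(\d+)\.(\d+)"
)


def node_versions(show_output):
    """Return {node_id: (major, minor, build)} for all connected nodes."""
    versions = {}
    for line in show_output.splitlines():
        m = NODE_LINE.match(line.strip())
        if m:
            versions[int(m.group(1))] = tuple(int(g) for g in m.group(3, 4, 5))
    return versions


# The 'show' output from the report, taken before the old SQL node restart.
SHOW_BEFORE = """\
[ndbd(NDB)]     2 node(s)
id=2    @10.16.46.169  (mysql-5.1.35 ndb-7.0.7, Nodegroup: 0)
id=3    @10.16.46.198  (mysql-5.1.35 ndb-6.3.26, Nodegroup: 0, Master)
[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.16.46.178  (mysql-5.1.35 ndb-7.0.7)
[mysqld(API)]   20 node(s)
id=4 (not connected, accepting connect from any host)
id=5    @10.16.46.198  (mysql-5.1.35 ndb-6.3.26)
"""

versions = node_versions(SHOW_BEFORE)
# Nodes still on a pre-7.0 version (tuple comparison is lexicographic).
old_nodes = sorted(n for n, v in versions.items() if v < (7, 0, 0))
print("nodes still on a pre-7.0 version:", old_nodes)
```

For the output above this reports the old data node (id=3) and the old SQL node (id=5), which is the combination under which the crash occurred.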

How to repeat:
Follow the instructions above to reproduce the problem.
[30 Sep 2009 11:48] Tino Rachui
Database log files

Attachment: ndb_error_report_20090910125237.tar.bz2 (application/x-bzip, text), 79.80 KiB.

[30 Sep 2009 12:21] Gustaf Thorslund
Tino,

Could you please verify which version you are trying to upgrade to? You say 7.0.8, but the 'show' output says 7.0.7.

Just had a quick look so far, but it looks a bit similar to bug #47542, see:
 http://bugs.mysql.com/bug.php?id=47542

And that one was fixed in 7.0.8.

/Gustaf
[30 Sep 2009 13:19] Tino Rachui
"Could you please verify which version you are trying to upgrade to? You say 7.0.8, but the 'show' output says 7.0.7..."

To be precise, I was trying to upgrade to a self-built 7.0.7 plus the patch for #47542 (a pre-7.0.8, so to speak ;)). So this really seems to be a different issue. :(
[1 Oct 2009 9:32] Jonas Oreland
The problem is that dbtup sends the new signal format to an old DbUtil.
The fix is trivial :-(
[1 Oct 2009 9:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85328

3083 Jonas Oreland	2009-10-01
      ndb - bug#47740 - testcase only (cause bug should be fixed in 7.0)
[1 Oct 2009 11:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85340

3063 Jonas Oreland	2009-10-01
      ndb - bug#47740 - fix so that new tup doesnt send long-signals to old dbutil
[1 Oct 2009 11:20] Bugs System
Pushed into 5.1.39-ndb-7.0.9 (revid:jonas@mysql.com-20091001112043-nbbza8h969y169q9) (version source revid:jonas@mysql.com-20091001112043-nbbza8h969y169q9) (merge vers: 5.1.39-ndb-7.0.9) (pib:11)
[1 Oct 2009 13:25] Bugs System
Pushed into 5.1.39-ndb-7.1.0 (revid:jonas@mysql.com-20091001123013-g9ob2tsyctpw6zs0) (version source revid:jonas@mysql.com-20091001123013-g9ob2tsyctpw6zs0) (merge vers: 5.1.39-ndb-7.1.0) (pib:11)
[2 Oct 2009 11:57] Jon Stephens
Documented bugfix in the NDB-7.0.9 changelog as follows:

      During an upgrade, newer nodes could in some cases attempt to 
      use the long signal format for communication with older nodes that 
      did not understand the newer format.

Closed.
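
For readers unfamiliar with this class of bug, the idea behind the fix (do not use the newer signal format towards a node that predates it) can be illustrated roughly as follows. This is only a sketch under stated assumptions and is not the actual patch: the function choose_signal_format and the constant LONG_SIGNAL_MIN_VERSION are hypothetical, and the 7.0.0 cutoff is an assumption chosen for illustration, not a value taken from the NDB source.

```python
# ASSUMPTION: hypothetical cutoff, the first version taken to understand the
# newer (long) format for the signal in question. Not taken from NDB source.
LONG_SIGNAL_MIN_VERSION = (7, 0, 0)


def choose_signal_format(receiver_version):
    """Pick the signal format based on the receiving node's version.

    receiver_version is a (major, minor, build) tuple. A sender that skips
    this check and always uses the newer format hits the failure described
    in this report when the receiver is an old node.
    """
    if receiver_version >= LONG_SIGNAL_MIN_VERSION:
        return "long"
    return "short"


# Versions from the report: upgraded data node vs. old data node.
for version in [(7, 0, 7), (6, 3, 26)]:
    print(version, "->", choose_signal_format(version))
```

The point of the sketch is that the decision depends on the receiver's version rather than the sender's, which matters in the mixed-version window of an online upgrade.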