Bug #43699 ndbmtd crash due to old mgmt server
Submitted: 17 Mar 2009 13:15 Modified: 16 Apr 2009 19:05
Reporter: Guido Ostkamp
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S2 (Serious)
Version: mysql-5.1-telco-6.* OS: Solaris
Assigned to: Jonas Oreland CPU Architecture: Any

[17 Mar 2009 13:15] Guido Ostkamp
Description:
Hello,

we are using bzr version frazer@mysql.com-20090317011721-fgtlp5cry4rynf6w dated Tue 2009-03-17 01:17:21 +0000 on Solaris 10 sparc.

ndbmtd crashes on startup as follows:

t@5 (l@5) terminated by signal SEGV (no mapping at the fault address)
0xffffffff7de3b670: strlen+0x0050:      ld       [%o2], %o1
Current function is basestring_vsnprintf
   59       int ret= vsnprintf(str, size, format, ap);
(dbx) where
current thread: t@5
  [1] strlen(0x0, 0x53, 0x0, 0x0, 0x0, 0x53), at 0xffffffff7de3b670 
  [2] _ndoprnt(0x1006934bd, 0xffffffff7ccf6fe0, 0xffffffff7dea5fa0, 0x10087a71f, 0x0, 0x1006934bc), at 0xffffffff7dea7d10 
  [3] vsnprintf(0xffffffff7ccf6eac, 0x0, 0x10069347e, 0xffffffff7ccf6fc0, 0x3dfc2, 0xffffffff7ccf76dc), at 0xffffffff7deaa4ec 
=>[4] basestring_vsnprintf(str = 0xffffffff7ccf6eac "Connection attempt from management server id=1 with mysql-5.1.32 ndb-6.4.4 incompatible with my\xff|\xcfp0", size = 96U, format = 0x10069347e "Connection attempt from %s id=%d with %s incompatible with %s%s", ap = 0xffffffff7ccf6fc0), line 59 in "basestring_vsnprintf.c"
  [5] BaseString::vsnprintf(str = 0xffffffff7ccf6eac "Connection attempt from management server id=1 with mysql-5.1.32 ndb-6.4.4 incompatible with my\xff|\xcfp0", size = 96U, format = 0x10069347e "Connection attempt from %s id=%d with %s incompatible with %s%s", ap = 0xffffffff7ccf6fc0), line 468 in "BaseString.cpp"
  [6] SimulatedBlock::infoEvent(this = 0x197698750, msg = 0x10069347e "Connection attempt from %s id=%d with %s incompatible with %s%s", ...), line 1640 in "SimulatedBlock.cpp"
  [7] Qmgr::execAPI_REGREQ(this = 0x197698750, signal = 0xffffffff7ccf7700), line 3137 in "QmgrMain.cpp"
  [8] SimulatedBlock::executeFunction(this = 0x197698750, gsn = 3U, signal = 0xffffffff7ccf7700), line 846 in "SimulatedBlock.hpp"
  [9] execute_signals(selfptr = 0x1007eb318, q = 0x1007ec388, r = 0x1007ec498, sig = 0xffffffff7ccf7700, max_signals = 253890U, signalIdCounter = 0xffffffff7ccf76dc), line 2429 in "mt.cpp"
  [10] run_job_buffers(selfptr = 0x1007eb318, sig = 0xffffffff7ccf7700, signalIdCounter = 0xffffffff7ccf76dc), line 2460 in "mt.cpp"
  [11] mt_job_thread_main(thr_arg = 0x1007eb318), line 2931 in "mt.cpp"
  [12] ndb_thread_wrapper(_ss = 0x100895840), line 145 in "NdbThread.c"
(dbx) frame 7
Current function is Qmgr::execAPI_REGREQ
 3137       infoEvent("Connection attempt from %s id=%d with %s "
(dbx) print extra
extra = (nil)

It seems the printf gets an illegal parameter in storage/ndb/src/kernel/blocks/qmgr/QmgrMain.cpp, line 3137. Since extra is nil, the expression "extra ? extra : 0" passes a null pointer for the final %s:

  if (!compatability_check) {
    jam();
    char buf[NDB_VERSION_STRING_BUF_SZ];
    infoEvent("Connection attempt from %s id=%d with %s "
          "incompatible with %s%s",
          type == NodeInfo::API ? "api or mysqld" : "management server",
          apiNodePtr.i,
          ndbGetVersionString(version, mysql_version, 0,
                                  buf,
                                  sizeof(buf)),
          NDB_VERSION_STRING,
              extra ? extra : 0);
    apiNodePtr.p->phase = ZAPI_INACTIVE;
    sendApiRegRef(signal, ref, ApiRegRef::UnsupportedVersion);
    return;
  }
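
For illustration only, here is a minimal sketch of one way to avoid handing a null pointer to a %s conversion: substitute an empty string before formatting. This is not the committed patch. The names safe_str and format_incompatible are hypothetical and do not exist in the NDB source, and plain snprintf stands in for infoEvent.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the NDB source): map a null C string
// to an empty string so it is always safe to pass for a "%s" conversion.
static const char* safe_str(const char* s) {
  return s ? s : "";
}

// Builds a message with the same format string as the infoEvent() call
// quoted above. Function and parameter names are illustrative assumptions.
std::string format_incompatible(const char* node_type, int node_id,
                                const char* peer_version,
                                const char* my_version,
                                const char* extra) {
  char buf[256];
  // snprintf truncates and null-terminates if the message exceeds buf.
  std::snprintf(buf, sizeof(buf),
                "Connection attempt from %s id=%d with %s "
                "incompatible with %s%s",
                safe_str(node_type), node_id, safe_str(peer_version),
                safe_str(my_version), safe_str(extra));
  return std::string(buf);
}
```

With this approach a null extra simply contributes nothing to the message, so the text ends right after the local version string instead of dereferencing address 0 inside strlen.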

Regards

Guido Ostkamp

How to repeat:
Use an old management server that is incompatible with the current ndbmtd version, then try to start ndbmtd.
[18 Mar 2009 7:48] Sveta Smirnova
Thank you for the report.

Which exact version of the old management server are you using?
[18 Mar 2009 8:38] Guido Ostkamp
Our management server was installed on Feb 26 12:19 from a Bazaar build,
so it was most likely revision id
tomas.ulin@sun.com-20090225160230-u4guch19txy3gcew dated Thu 2009-02-26 10:09:51 +0100.

Regards

Guido Ostkamp
[18 Mar 2009 8:40] Guido Ostkamp
Sorry, I mixed up the date. The correct one is
Wed 2009-02-25 17:02:30 +0100 from

    revno: 2585.42.66
    revision-id: tomas.ulin@sun.com-20090225160230-u4guch19txy3gcew
    parent: frazer@mysql.com-20090225132833-b6amwa2134z392ay
    parent: tomas.ulin@sun.com-20090225160116-1uah23jmubh4cevv
    committer: Tomas Ulin <tomas.ulin@sun.com>
    branch nick: mysql-5.1-telco-6.4
    timestamp: Wed 2009-02-25 17:02:30 +0100
    message:
      merge
[25 Mar 2009 10:35] Jonathan Miller
Workaround: use a newer version of the management server.
[16 Apr 2009 9:13] Bugs System
Pushed into 5.1.32-ndb-6.2.18 (revid:jonas@mysql.com-20090416090858-wnwzl0s08gwnt1tp) (version source revid:jonas@mysql.com-20090416090858-wnwzl0s08gwnt1tp) (merge vers: 5.1.32-ndb-6.2.18) (pib:6)
[16 Apr 2009 9:14] Bugs System
Pushed into 5.1.32-ndb-6.3.25 (revid:jonas@mysql.com-20090416091003-yz7qxh53v2z94zbu) (version source revid:jonas@mysql.com-20090416091003-yz7qxh53v2z94zbu) (merge vers: 5.1.32-ndb-6.3.25) (pib:6)
[16 Apr 2009 9:17] Bugs System
Pushed into 5.1.32-ndb-7.0.6 (revid:jonas@mysql.com-20090416091455-nfi0aqj2phi0urnw) (version source revid:jonas@mysql.com-20090416091114-q9ud1m54v5w47o00) (merge vers: 5.1.32-ndb-7.0.6) (pib:6)
[16 Apr 2009 11:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/72272

2909 Jonas Oreland	2009-04-16
      ndb - bug#43699
        Fix so we don't try to sprintf("%s", 0)
[16 Apr 2009 19:05] Jon Stephens
Documented bugfix in the NDB-6.2.18, 6.3.25, and 7.0.6 changelogs as follows:

        When trying to use a data node with an older version of the
        management server, the data node crashed on startup.