Bug #43974 ndb_mgmd: SIGSEGV in TransporterRegistry::updateWritePtr() when restarting nodes
Submitted: 31 Mar 2009 7:55   Modified: 26 May 2009 8:11
Reporter: Jan Wedvik
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine   Severity: S3 (Non-critical)
Version: mysql-5.1-telco-7.0   OS: Solaris (Sparc/Solaris 5.10)
Assigned to: Magnus Blåudd   CPU Architecture: Any

[31 Mar 2009 7:55] Jan Wedvik
Description:
Stopping and starting nodes causes a segmentation violation in ndb_mgmd. This error was seen on a SPARC server equipped with Niagara (CMT) processors. The database had two single-threaded data nodes running on one server. The load consisted of simple inserts and deletes.

Uint32
TransporterRegistry::updateWritePtr(NodeId node, Uint32 lenBytes, Uint32 prio)
{
  assert(m_use_default_send_buffer);

  SendBuffer *b = m_send_buffers + node;
  SendBufferPage *page = b->m_last_page;
  assert(page != NULL);
  assert(page->m_bytes + lenBytes <= page->max_data_bytes());
  // 'page' is NULL here, so the next line gives a SIGSEGV.
  page->m_bytes += lenBytes;
  b->m_used_bytes += lenBytes;
  return b->m_used_bytes;
}

This error appears when running a script that randomly kills and restarts data nodes.

updateWritePtr() is invoked from TransporterRegistry::prepareSend(). The test was then repeated with some extra checks added to prepareSend(), as shown below:

SendStatus
TransporterRegistry::prepareSend(TransporterSendBufferHandle *sendHandle,
                                 const SignalHeader * const signalHeader,
                                 Uint8 prio,
                                 const Uint32 * const signalData,
                                 NodeId nodeId,
                                 const LinearSectionPtr ptr[3]){

  Transporter *t = theTransporters[nodeId];
  if(t != NULL &&
     (((ioStates[nodeId] != HaltOutput) && (ioStates[nodeId] != HaltIO)) ||
      ((signalHeader->theReceiversBlockNumber == 252) ||
       (signalHeader->theReceiversBlockNumber == 4002)))) {

    if(t->isConnected()){
      Uint32 lenBytes = t->m_packer.getMessageLength(signalHeader, ptr);
      if(lenBytes <= MAX_SEND_MESSAGE_BYTESIZE){
        Uint32 * insertPtr = getWritePtr(sendHandle, nodeId, lenBytes, prio);
        if(insertPtr != 0){
          // This assert did not trigger.
          MY_ASSERT(static_cast<TransporterRegistry*>(sendHandle)->m_send_buffers[nodeId].m_last_page != NULL);
          usleep(20000);
          // But this one did.
          MY_ASSERT(static_cast<TransporterRegistry*>(sendHandle)->m_send_buffers[nodeId].m_last_page != NULL);
          t->m_packer.pack(insertPtr, prio, signalHeader, signalData, ptr);
          updateWritePtr(sendHandle, nodeId, lenBytes, prio);

As shown above, I added two identical asserts, with a usleep() between them. The fact that it is the second assert that fails suggests that there is a race condition with another thread.
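
For illustration, the following is a minimal, self-contained sketch of this probing technique (not NDB code; SharedBuffer, Page and the thread bodies are hypothetical stand-ins). A second thread clears a shared page pointer, playing the role of reset_send_buffer(), while the "sender" checks the pointer twice with a sleep in between; only the second check fails, which is how the sleep exposes the race.

#include <cassert>
#include <atomic>
#include <chrono>
#include <thread>

struct Page { unsigned m_bytes = 0; };

struct SharedBuffer {
  std::atomic<Page*> m_last_page{nullptr};
};

int main() {
  SharedBuffer buf;
  Page page;
  buf.m_last_page.store(&page);

  // Plays the role of the thread calling reset_send_buffer():
  // it clears the page pointer while the "sender" is between its two checks.
  std::thread resetter([&buf] {
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    buf.m_last_page.store(nullptr);
  });

  // Plays the role of prepareSend()/updateWritePtr():
  assert(buf.m_last_page.load() != nullptr);                   // first probe: passes
  std::this_thread::sleep_for(std::chrono::milliseconds(20));  // widen the race window
  assert(buf.m_last_page.load() != nullptr);                   // second probe: fires

  resetter.join();   // not reached when the second assert aborts
  return 0;
}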

Stack dump:
current thread: t@4
  [1] __lwp_kill(0x0, 0x6, 0x0, 0x6, 0xffbffeff, 0x0), at 0x7f5caa58
  [2] raise(0x6, 0x0, 0x7f642fd8, 0x7f5aa378, 0xffffffff, 0x6), at 0x7f565a5c
  [3] abort(0x402ec4, 0x1, 0x2e7420, 0xfcb78, 0x7f6413d8, 0x0), at 0x7f54194c
=>[4] TransporterRegistry::prepareSend(this = 0x3dcc98, sendHandle = 0x3dcc98, signalHeader = 0x7f657e34, prio = '\0', signalData = 0x7f657e50, nodeId = 2U, ptr = (nil)), line 625 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/transporter/TransporterRegistry.cpp"
  [5] TransporterRegistry::prepareSend(this = 0x3dcc98, signalHeader = 0x7f657e34, prio = '\0', signalData = 0x7f657e50, nodeId = 2U, ptr = (nil)), line 279 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/include/transporter/TransporterRegistry.hpp"
  [6] TransporterFacade::sendSignalUnCond(this = 0x46c048, aSignal = 0x7f657e34, aNode = 2U, prio = 0), line 1175 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/ndbapi/TransporterFacade.cpp"
  [7] ClusterMgr::threadMain(this = 0x47e048), line 311 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/ndbapi/ClusterMgr.cpp"
  [8] runClusterMgr_C(me = 0x47e048), line 47 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/ndbapi/ClusterMgr.cpp"
  [9] ndb_thread_wrapper(_ss = 0x33c750), line 145 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/portlib/NdbThread.c"

How to repeat:
Restart nodes randomly. With the extra 'usleep' in place, this error seems easy to reproduce.
[31 Mar 2009 11:30] Jan Wedvik
The dbx transcript below indicates where the race condition can be found. Thread t@12 calls TransporterRegistry::reset_send_buffer() while t@9 is executing in TransporterRegistry::prepareSend(). reset_send_buffer() sets m_send_buffers[x].m_last_page to NULL, which would cause a segmentation violation in updateWritePtr().

t@12 (l@12) terminated by signal ABRT (Abort)
0x7f5caa58: __lwp_kill+0x0008:  bcc,a,pt  %icc,__lwp_kill+0x18  ! 0x7f5caa68
Current function is JWLock::lock
  100       MY_ASSERT(mTId==voidTid || mTId==self);
(dbx) where
current thread: t@12
  [1] __lwp_kill(0x0, 0x6, 0x0, 0x6, 0xffbffeff, 0x0), at 0x7f5caa58
  [2] raise(0x6, 0x0, 0x7f642fd8, 0x7f5aa378, 0xffffffff, 0x6), at 0x7f565a5c
  [3] abort(0xc, 0x1, 0x2e82c0, 0xfcb78, 0x7f6413d8, 0x0), at 0x7f54194c
=>[4] JWLock::lock(this = 0x3dcdc0), line 100 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/include/transporter/TransporterRegistry.hpp"
  [5] Locker::Locker(this = 0x7f4678e0, reg = 0x3dcc98), line 74 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/transporter/TransporterRegistry.cpp"
  [6] TransporterRegistry::reset_send_buffer(this = 0x3dcc98, node = 3U), line 2098 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/transporter/TransporterRegistry.cpp"
  [7] TransporterFacade::reset_send_buffer(this = 0x46c048, node = 3U), line 190 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/ndbapi/TransporterFacade.hpp"
  [8] Transporter::connect_server(this = 0x4030a8, sockfd = STRUCT), line 138 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/transporter/Transporter.cpp"
  [9] TransporterRegistry::connect_server(this = 0x3dcc98, sockfd = STRUCT), line 402 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/transporter/TransporterRegistry.cpp"
  [10] MgmtSrvr::transporter_connect(this = 0x436048, sockfd = STRUCT), line 3705 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/mgmsrv/MgmtSrvr.cpp"
  [11] MgmApiSession::transporter_connect(this = 0xd2f0f8, ctx = CLASS, args = CLASS), line 1712 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/mgmsrv/Services.cpp"
  [12] Parser<MgmApiSession>::run(this = 0xd52350, ctx = CLASS, t = CLASS, stop = (nil)), line 205 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/include/util/Parser.hpp"
  [13] MgmApiSession::runSession(this = 0xd2f0f8), line 387 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/mgmsrv/Services.cpp"
  [14] sessionThread_C(_sc = 0xd2f0f8), line 387 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/util/SocketServer.cpp"
  [15] ndb_thread_wrapper(_ss = 0xd882b0), line 145 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/portlib/NdbThread.c"
(dbx) frame 6
Current function is TransporterRegistry::reset_send_buffer
 2098     Locker scopeLock(this);
(dbx) print this->lock
this->lock = {
    mTId   = 9U
    mDepth = 1
}
(dbx) threads
      t@1  a  l@1   ?()   LWP suspended in  __pollsys()
      t@2  a  l@2   ndb_thread_wrapper()   sleep on 0x351f88  in  __lwp_park()
      t@3  a  l@3   ndb_thread_wrapper()   sleep on 0x351f88  in  __lwp_park()
      t@4  a  l@4   ndb_thread_wrapper()   sleep on 0x351f88  in  __lwp_park()
      t@5  a  l@5   ndb_thread_wrapper()   LWP suspended in  __pollsys()
      t@6  a  l@6   ndb_thread_wrapper()   LWP suspended in  __pollsys()
      t@7  a  l@7   ndb_thread_wrapper()   LWP suspended in  __pollsys()
      t@8  a  l@8   ndb_thread_wrapper()   sleep on 0xd9a248  in  __lwp_park()
      t@9  a  l@9   ndb_thread_wrapper()   LWP suspended in  ___nanosleep()
     t@10  a l@10   ndb_thread_wrapper()   LWP suspended in  __pollsys()
o>   t@12  a l@12   ndb_thread_wrapper()   signal SIGABRT in  __lwp_kill()
(dbx) thread t@9
Current function is TransporterRegistry::prepareSend
  658             usleep(20000);
t@9 (l@9) stopped in ___nanosleep at 0x7f5c9c44
0x7f5c9c44: ___nanosleep+0x0008:        bcc,a,pt  %icc,___nanosleep+0x18       ! 0x7f5c9c54
(dbx) where
current thread: t@9
  [1] ___nanosleep(0x4, 0x0, 0x4e20, 0x25d780, 0x975e0, 0x9c400), at 0x7f5c9c44
  [2] _usleep(0x4e20, 0x7f4c3a00, 0x0, 0x1, 0x1, 0x1), at 0x7f5b971c
=>[3] TransporterRegistry::prepareSend(this = 0x3dcc98, sendHandle = 0x3dcc98, signalHeader = 0x7f487c18, prio = '\001', signalData = 0x7f487c34, nodeId = 2U, ptr = 0x7f487c98), line 658 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/transporter/TransporterRegistry.cpp"
  [4] TransporterRegistry::prepareSend(this = 0x3dcc98, signalHeader = 0x7f487c18, prio = '\001', signalData = 0x7f487c34, nodeId = 2U, ptr = 0x7f487c98), line 279 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/include/transporter/TransporterRegistry.hpp"
  [5] SignalSender::sendSignal(this = 0x7f487cc8, nodeId = 2U, s = 0x7f487c18), line 2033 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/ndbapi/TransporterFacade.cpp"
  [6] MgmtSrvr::setEventReportingLevelImpl(this = 0x436048, nodeId_arg = 2, ll = STRUCT), line 1895 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/mgmsrv/MgmtSrvr.cpp"
  [7] MgmtSrvr::logLevelThreadRun(this = 0x436048), line 145 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/mgmsrv/MgmtSrvr.cpp"
  [8] MgmtSrvr::logLevelThread_C(m = 0x436048), line 96 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/mgmsrv/MgmtSrvr.cpp"
  [9] ndb_thread_wrapper(_ss = 0x33c900), line 145 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/portlib/NdbThread.c"
(dbx)
[31 Mar 2009 13:56] Jan Wedvik
The stack traces in the previous comments were misleading, since the two threads work with different data nodes. Below is another example, where both threads access node 2, meaning that the race condition is real.

So far, I have not been able to reproduce this error on Linux. 

(dbx) where
current thread: t@4
  [1] ___nanosleep(0x4, 0x0, 0x4e20, 0x25d780, 0x975e0, 0x9c400), at 0x7f5c9c44
  [2] _usleep(0x4e20, 0x7f4c1200, 0x0, 0x1, 0x0, 0x1), at 0x7f5b971c
=>[3] TransporterRegistry::prepareSend(this = 0x3dcc98, sendHandle = 0x3dcc98, signalHeader = 0x7f657e34, prio = '\0', signalData = 0x7f657e50, nodeId = 2U, ptr = (nil)), line 679 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/transporter/TransporterRegistry.cpp"
  [4] TransporterRegistry::prepareSend(this = 0x3dcc98, signalHeader = 0x7f657e34, prio = '\0', signalData = 0x7f657e50, nodeId = 2U, ptr = (nil)), line 279 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/include/transporter/TransporterRegistry.hpp"
  [5] TransporterFacade::sendSignalUnCond(this = 0x46c048, aSignal = 0x7f657e34, aNode = 2U, prio = 0), line 1175 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/ndbapi/TransporterFacade.cpp"
  [6] ClusterMgr::threadMain(this = 0x47e048), line 311 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/ndbapi/ClusterMgr.cpp"
  [7] runClusterMgr_C(me = 0x47e048), line 47 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/ndbapi/ClusterMgr.cpp"
  [8] ndb_thread_wrapper(_ss = 0x33c750), line 145 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/portlib/NdbThread.c"
(dbx) thread t@5
t@5 (l@5) stopped in __lwp_kill at 0x7f5caa58
0x7f5caa58: __lwp_kill+0x0008:  bcc,a,pt  %icc,__lwp_kill+0x18  ! 0x7f5caa68
(dbx) where
current thread: t@5
  [1] __lwp_kill(0x0, 0x6, 0x0, 0x6, 0xffbffeff, 0x0), at 0x7f5caa58
  [2] raise(0x6, 0x0, 0x7f642fd8, 0x7f5aa378, 0xffffffff, 0x6), at 0x7f565a5c
  [3] abort(0x5, 0x1, 0x2e8484, 0xfcb78, 0x7f6413d8, 0x0), at 0x7f54194c
  [4] JWLock::lock(this = 0x426078), line 100 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/include/transporter/TransporterRegistry.hpp"
  [5] Locker::Locker(this = 0x7f637c54, reg = 0x3dcc98, node = 2U), line 92 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/transporter/TransporterRegistry.cpp"
=>[6] TransporterRegistry::reset_send_buffer(this = 0x3dcc98, node = 2U), line 2119 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/transporter/TransporterRegistry.cpp"
  [7] TransporterFacade::reset_send_buffer(this = 0x46c048, node = 2U), line 190 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/ndbapi/TransporterFacade.hpp"
  [8] Transporter::doDisconnect(this = 0x402ca0), line 288 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/transporter/Transporter.cpp"
  [9] TransporterRegistry::start_clients_thread(this = 0x3dcc98), line 1640 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/transporter/TransporterRegistry.cpp"
  [10] run_start_clients_C(me = 0x3dcc98), line 1379 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/transporter/TransporterRegistry.cpp"
  [11] ndb_thread_wrapper(_ss = 0xcd4148), line 145 in "/log/huge3/jw159207/mysql/trees/telco-7.0/storage/ndb/src/common/portlib/NdbThread.c"
(dbx)
[7 Apr 2009 5:46] Jonas Oreland
Motivation for triage: It can happen to any ndbapi/ndb_mgmd at any given time
  (since it's a race-condition), although risk is low. (I3.5)
[7 Apr 2009 13:25] Jan Wedvik
I tried implementing TransporterFacade::lock_transporter() and
TransporterFacade::unlock_transporter() as:

  void lock_transporter(NodeId node){
    lock_mutex();
  }
  void unlock_transporter(NodeId node){
    unlock_mutex();
  }
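
As a side note, lock/unlock pairs like these are often wrapped in a small RAII guard so that every return path releases the mutex. The sketch below is only an illustration of that idea (Facade and FacadeGuard are hypothetical names standing in for TransporterFacade and its lock_mutex()/unlock_mutex(); this is not part of any patch here). It would have avoided the explicit unlock_mutex() calls on the error paths in the diffs further down.

#include <mutex>

// Hypothetical stand-in for TransporterFacade's lock_mutex()/unlock_mutex().
struct Facade {
  std::mutex m_mutex;
  void lock_mutex()   { m_mutex.lock(); }
  void unlock_mutex() { m_mutex.unlock(); }
};

// RAII guard: locking in the constructor and unlocking in the destructor
// means the mutex is released on every exit from the scope, including
// early returns on error paths.
class FacadeGuard {
  Facade& m_facade;
public:
  explicit FacadeGuard(Facade& f) : m_facade(f) { m_facade.lock_mutex(); }
  ~FacadeGuard() { m_facade.unlock_mutex(); }
  FacadeGuard(const FacadeGuard&) = delete;
  FacadeGuard& operator=(const FacadeGuard&) = delete;
};

int main() {
  Facade facade;
  {
    FacadeGuard guard(facade);     // lock_mutex()
    // ... e.g. start_instance(), with early returns on failure ...
  }                                // unlock_mutex(), whatever path was taken
  return 0;
}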

In addition, I added lock/unlock calls to TransporterFacade::reset_send_buffer(), as suggested by Magnus Blåudd:
=== modified file 'storage/ndb/src/ndbapi/TransporterFacade.hpp'
--- storage/ndb/src/ndbapi/TransporterFacade.hpp    2009-02-23 09:59:06 +0000
+++ storage/ndb/src/ndbapi/TransporterFacade.hpp    2009-04-02 12:05:46 +0000
@@ -187,7 +187,9 @@ public:
   }
   void reset_send_buffer(NodeId node)
   {
+    lock_mutex();
     theTransporterRegistry->reset_send_buffer(node);
+    unlock_mutex();
   } 

These two changes prevented the race condition described in my comment of [31 Mar 13:56] from happening.

But when I run my test again, I still get SIGSEGV in TransporterRegistry::updateWritePtr() after some time. So some thread must have modified *m_send_buffers without calling TransporterFacade::reset_send_buffer() or locking the mutex by other means.

I then tried to add lock/unlock in MgmtSrvr::start_transporter() and Ndb_cluster_connection_impl::connect(). These methods call TransporterFacade::start_instance(), which calls TransporterFacade::configure(), which calls TransporterRegistry::allocate_send_buffers(), which modifies *m_send_buffers. The diffs for these changes are shown below. When running with these changes, I have so far not seen the SIGSEGV error, but I have not tested this enough.

=== modified file 'storage/ndb/src/mgmsrv/MgmtSrvr.cpp'
--- storage/ndb/src/mgmsrv/MgmtSrvr.cpp    2009-03-13 14:24:21 +0000
+++ storage/ndb/src/mgmsrv/MgmtSrvr.cpp        2009-04-06 10:47:44 +0000
@@ -455,15 +455,18 @@
     g_eventLogger->error("Could not create TransporterFacade.");
     DBUG_RETURN(false);
   }
-
+  
+  theFacade->lock_mutex();
   if (theFacade->start_instance(_ownNodeId,
                                 config->m_configValues) < 0)
   {
     g_eventLogger->error("Failed to start transporter");
+    theFacade->unlock_mutex();
     delete theFacade;
     theFacade = 0;
     DBUG_RETURN(false);
   }
+  theFacade->unlock_mutex();

=== modified file 'storage/ndb/src/ndbapi/ndb_cluster_connection.cpp'
--- storage/ndb/src/ndbapi/ndb_cluster_connection.cpp 2009-03-14 20:42:04 +0000
+++ storage/ndb/src/ndbapi/ndb_cluster_connection.cpp  2009-04-07 11:48:39 +0000
@@ -741,11 +741,14 @@
     if(props == 0)
       break;
 
+    m_transporter_facade->lock_transporter(nodeId);
     if (m_transporter_facade->start_instance(nodeId, props) < 0)
     {
+      m_transporter_facade->unlock_transporter(nodeId);
       ndb_mgm_destroy_configuration(props);
       DBUG_RETURN(-1);
     }
+    m_transporter_facade->unlock_transporter(nodeId);
 
     if (init_nodes_vector(nodeId, *props))
     {
[9 Apr 2009 12:23] Magnus Blåudd
I can't really understand why we would need to protect all of 'TransporterFacade::start_instance'; it feels deadlock prone in combination with the other patches. I also find it "impossible" for anyone to be using the send buffer at that point, but it might be the arbitrator thread. Need to check if there is anything else.

If it should be protected, I think it can be done inside the function.
[8 May 2009 10:19] Magnus Blåudd
We will reset the send buffers when the transporter's state goes to CONNECTING; that way the reset does not need to be locked at all.

Below, I have removed two calls to reset_send_buffer, since the send buffer should be empty at that time. But to catch any problems with someone writing into the buffer before it is connected, the calls should be added back with a "should_be_empty" flag so that we can catch any problem early. I just need to check whether those calls are being made from the correct place; I think they should come later in those two functions.

Index: 43974/storage/ndb/src/common/transporter/Transporter.cpp
===================================================================
--- 43974.orig/storage/ndb/src/common/transporter/Transporter.cpp	2009-04-14 18:57:50.000000000 +0200
+++ 43974/storage/ndb/src/common/transporter/Transporter.cpp	2009-04-14 19:35:37.000000000 +0200
@@ -134,8 +134,6 @@ Transporter::connect_server(NDB_SOCKET_T
   {
     DBUG_RETURN(false); // TODO assert(0);
   }
-  
-  get_callback_obj()->reset_send_buffer(remoteNodeId);
 
   my_socket_connect_address(sockfd, &m_connect_address);
 
@@ -202,8 +200,6 @@ Transporter::connect_client(NDB_SOCKET_T
   DBUG_PRINT("info",("server port: %d, isMgmConnection: %d",
                      m_s_port, isMgmConnection));
 
-  get_callback_obj()->reset_send_buffer(remoteNodeId);
-
   // Send "hello"
   DBUG_PRINT("info", ("Sending own nodeid: %d and transporter type: %d",
                       localNodeId, m_type));
@@ -285,6 +281,5 @@ Transporter::doDisconnect() {
 
   m_connected= false;
 
-  get_callback_obj()->reset_send_buffer(remoteNodeId);
   disconnectImpl();
 }
Index: 43974/storage/ndb/src/common/transporter/TransporterRegistry.cpp
===================================================================
--- 43974.orig/storage/ndb/src/common/transporter/TransporterRegistry.cpp	2009-04-14 19:35:26.000000000 +0200
+++ 43974/storage/ndb/src/common/transporter/TransporterRegistry.cpp	2009-04-16 12:50:23.000000000 +0200
@@ -1338,6 +1338,9 @@ TransporterRegistry::do_connect(NodeId n
   }
   DBUG_ENTER("TransporterRegistry::do_connect");
   DBUG_PRINT("info",("performStates[%d]=CONNECTING",node_id));
+
+  callbackObj->reset_send_buffer(node_id);
+
   curr_state= CONNECTING;
   DBUG_VOID_RETURN
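
For illustration only, here is a minimal, self-contained sketch of the "should_be_empty" idea mentioned above. This is an assumption about how such a flag could look, not the committed patch; SendBuffer and SendBufferPage below are simplified stand-ins for the NDB structures.

#include <cassert>
#include <cstdint>

// Simplified stand-ins for the NDB send-buffer structures.
struct SendBufferPage { uint32_t m_bytes = 0; };

struct SendBuffer {
  SendBufferPage* m_first_page = nullptr;
  SendBufferPage* m_last_page  = nullptr;
  uint32_t        m_used_bytes = 0;
};

// should_be_empty == true means the caller expects that nothing has been
// written since the last reset (e.g. a reset done when the transporter goes
// to CONNECTING); asserting on it catches an unexpected early writer.
void reset_send_buffer(SendBuffer& b, bool should_be_empty)
{
  assert(!should_be_empty || b.m_used_bytes == 0);

  // Simplified: the real code would return the pages to a pool.
  b.m_first_page = nullptr;
  b.m_last_page  = nullptr;
  b.m_used_bytes = 0;
}

int main() {
  SendBuffer buf;
  reset_send_buffer(buf, /*should_be_empty=*/true);  // nothing written yet: ok
  return 0;
}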
[12 May 2009 10:23] Magnus Blåudd
Patch sent for review
[15 May 2009 10:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/74176
[15 May 2009 13:42] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/74213
[15 May 2009 13:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/74218
[15 May 2009 13:49] Bugs System
Pushed into 5.1.34-ndb-7.0.6 (revid:magnus.blaudd@sun.com-20090515134810-n4cjydwoyh15yoft) (version source revid:magnus.blaudd@sun.com-20090515134810-n4cjydwoyh15yoft) (merge vers: 5.1.34-ndb-7.0.6) (pib:6)
[20 May 2009 11:16] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/74584
[20 May 2009 11:28] Bugs System
Pushed into 5.1.34-ndb-7.0.6 (revid:magnus.blaudd@sun.com-20090520111817-1jru7arp6h4wk0gp) (version source revid:magnus.blaudd@sun.com-20090520111817-1jru7arp6h4wk0gp) (merge vers: 5.1.34-ndb-7.0.6) (pib:6)
[26 May 2009 8:11] Jon Stephens
Documented bugfix in the NDB-7.0.6 changelog as follows:

        Repeated starting and stopping of data nodes could cause
        ndb_mgmd to fail. This issue was observed on Solaris/SPARC.