Bug #38609 Segfault in Logger::Log causes ndbd to hang indefinitely
Submitted: 6 Aug 2008 17:52
Modified: 8 Oct 2008 12:33
Reporter: Matthew Montgomery
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S1 (Critical)
Version: 5.1.23-ndb-6.2.15-community
OS: Linux (2.6.18-53.el5PAE)
Assigned to: Magnus Blåudd
CPU Architecture: Any

[6 Aug 2008 17:52] Matthew Montgomery
Description:
From the backtrace of a hung ndbd process:

Thread 30 (Thread -149505136 (LWP 17193)):
#0  0xffffe402 in __kernel_vsyscall ()
#1  0x0040ba1e in __lll_mutex_lock_wait () from /lib/libpthread.so.0
#2  0x00407870 in _L_mutex_lock_85 () from /lib/libpthread.so.0
#3  0x004073bd in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0x08290496 in NdbMutex_Lock ()
#5  0x082809b2 in Logger::log ()
#6  0x08280b06 in Logger::warning ()
#7  0x08263a05 in WatchDog::run ()
#8  0x08263a81 in runWatchDog ()
#9  0x08290ae0 in NdbThread_set_shm_sigmask ()
#10 0x0040543b in start_thread () from /lib/libpthread.so.0
#11 0x0038bfde in clone () from /lib/libc.so.6

Thread 1 (Thread -134302016 (LWP 17191)):
#0  0xffffe402 in __kernel_vsyscall ()
#1  0x0040ba1e in __lll_mutex_lock_wait () from /lib/libpthread.so.0
#2  0x00407870 in _L_mutex_lock_85 () from /lib/libpthread.so.0
#3  0x004073bd in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0x08290496 in NdbMutex_Lock ()
#5  0x082809b2 in Logger::log ()
#6  0x08280ac6 in Logger::info ()
#7  0x080a8150 in handler_error ()
#8  <signal handler called>
#9  0x0032ae0b in strlen () from /lib/libc.so.6
#10 0x002fd1ef in vfprintf () from /lib/libc.so.6
#11 0x0039fb31 in __vsnprintf_chk () from /lib/libc.so.6
#12 0x082a0e17 in basestring_vsnprintf ()
#13 0x0829c512 in BaseString::vsnprintf ()
#14 0x08280a0b in Logger::log ()
#15 0x08280b46 in Logger::error ()
#16 0x08226ea3 in Suma::execSUB_GCP_COMPLETE_REP ()
#17 0x0814c25a in Dbdih::execSUB_GCP_COMPLETE_REP ()
#18 0x0825e8e1 in FastScheduler::doJob ()
#19 0x0825f64e in ThreadConfig::ipControlLoop ()
#20 0x080a8784 in main ()

How to repeat:
.

Suggested fix:
.
[6 Aug 2008 18:02] Magnus Blåudd
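A segfault inside Logger::log invokes the 'handler_error' signal handler, which generates a trace file and then prints further messages through Logger::log. Because the interrupted first call to Logger::log still holds the mutex, the second call on the same thread can never acquire it, and the process deadlocks indefinitely.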
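The re-entrancy problem can be reproduced in miniature. The following is a minimal sketch, not ndbd code: SketchLogger and its fault_while_formatting flag are hypothetical, and a nested call on the same thread stands in for the signal handler interrupting the first Logger::log call. It assumes a POSIX error-checking mutex, so the second lock attempt returns EDEADLK instead of blocking the way the lock in the backtrace did.

```cpp
#include <pthread.h>
#include <cassert>
#include <cerrno>
#include <cstdio>

// Hypothetical stand-in for a logger guarded by a non-recursive mutex.
struct SketchLogger {
  pthread_mutex_t mutex;

  SketchLogger() {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    // ERRORCHECK makes a relock by the owning thread fail with EDEADLK
    // rather than hang, which keeps this sketch runnable.
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&mutex, &attr);
    assert(rc == 0);
    (void)rc;
    pthread_mutexattr_destroy(&attr);
  }

  ~SketchLogger() { pthread_mutex_destroy(&mutex); }

  // Returns 0 on success, or the pthread error code of a failed lock.
  int log(const char* msg, bool fault_while_formatting) {
    int rc = pthread_mutex_lock(&mutex);
    if (rc != 0) return rc;  // lock already held by this thread

    int nested = 0;
    if (fault_while_formatting) {
      // Simulates the crash handler running on the same thread while
      // the mutex is still held, then trying to log again.
      nested = log("handler_error: writing trace file", false);
    } else {
      std::fprintf(stderr, "%s\n", msg);
    }

    pthread_mutex_unlock(&mutex);
    return nested;
  }
};
```

Calling log("x", true) on a SketchLogger returns EDEADLK from the nested call, while log("x", false) returns 0. With a default (non-error-checking) mutex the nested call would typically block forever, which matches Thread 1 in the backtrace.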
[6 Aug 2008 18:06] Magnus Blåudd
The originating line is in 'Suma::execSUB_GCP_COMPLETE_REP':

  if(m_gcp_complete_rep_count && !c_subscriber_nodes.isclear())
  {
    CRASH_INSERTION(13033);

    NodeReceiverGroup rg(API_CLUSTERMGR, c_subscriber_nodes);
    sendSignal(rg, GSN_SUB_GCP_COMPLETE_REP, signal,
	       SubGcpCompleteRep::SignalLength, JBB);
    
    Ptr<Gcp_record> gcp;
    if(c_gcp_list.seize(gcp))
    {
      gcp.p->m_gci = gci;
      gcp.p->m_subscribers = c_subscriber_nodes;
    }
    else
    {
      char buf[100];
      c_subscriber_nodes.getText(buf);
      g_eventLogger->error("c_gcp_list.seize() failed: gci: %d nodes: %s",
                           gci, buf);
      ^^ Crash HERE because the string in "buf" is not terminated by '\0'.
         Why that happens should be investigated! Setting the buf size to 255
         does not help.
    }
  }
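One defensive way to print a buffer that may lack a terminating '\0' is to pass an explicit precision to "%s", which caps how many bytes the formatter reads. The sketch below illustrates that technique only. It is not the committed patch, and format_nodes is a hypothetical helper that does not exist in the NDB sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: builds the error text from a char buffer that is
// not guaranteed to be '\0'-terminated.
std::string format_nodes(unsigned gci, const char* buf, std::size_t buf_size) {
  assert(buf != nullptr);
  char out[256];
  // "%.*s" reads at most buf_size bytes from buf, so a missing terminator
  // cannot make the formatter run past the end of the array.
  std::snprintf(out, sizeof(out),
                "c_gcp_list.seize() failed: gci: %u nodes: %.*s",
                gci, static_cast<int>(buf_size), buf);
  return std::string(out);
}
```

This bounds the read but does not explain why the buffer ends up unterminated, so it complements rather than replaces the investigation requested above.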
[7 Aug 2008 12:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/51091

2639 Magnus Svensson	2008-08-07
      Bug#38609 Segfault in Logger::Log causes ndbd to hang indefinitely
       - Part 1, fix the cause
[2 Sep 2008 9:49] Magnus Blåudd
Pushed to MySQL Cluster 6.2, 6.3 and 6.4
[11 Sep 2008 19:48] Jon Stephens
Documented bugfix in the NDB 6.2.16 and 6.3.17 changelogs as follows:

        A segfault in Logger::Log caused ndbd to hang indefinitely.
[5 Oct 2008 16:31] Jon Stephens
Already documented; closed.
[8 Oct 2008 12:33] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html
[12 Dec 2008 23:25] Bugs System
Pushed into 6.0.6-alpha  (revid:msvensson@mysql.com-20080807115914-lm3tzcpdneakxeaj) (version source revid:jonas@mysql.com-20080813092004-7zlf6eu87i4ziwm2) (pib:5)