Bug #39404 Core in NdbEventBuffer::deleteUsedEventOperations()
Submitted: 11 Sep 2008 19:28 Modified: 26 Sep 2008 18:36
Reporter: Jonathan Miller
Status: Closed
Category:Server: ClusterRep Severity:S2 (Serious)
Version:mysql-5.1-telco-6.2 OS:Linux
Assigned to: Target Version:
Triage: D1 (Critical)

[11 Sep 2008 19:28] Jonathan Miller
Description:
I was running a set of TPC-B tests using a Cluster Replication setup.

During the 4th run, while the test was dropping the database, mysqld caught signal 11 (SIGSEGV).

Backtrace follows:

#0  0x0000003907e0b142 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000000000627121 in handle_segfault (sig=11) at mysqld.cc:2511
#2  <signal handler called>
#3  NdbEventBuffer::deleteUsedEventOperations (this=0x45d5e60)
    at NdbEventOperationImpl.cpp:1432
#4  0x000000000091d4ad in NdbEventBuffer::nextEvent (this=0x45d5e60)
    at NdbEventOperationImpl.cpp:1363
#5  0x00000000007cdda2 in ndb_binlog_thread_func (arg=<value optimized out>)
    at ha_ndbcluster_binlog.cc:5334
#6  0x0000003907e06307 in start_thread () from /lib64/libpthread.so.0
#7  0x00000039072d1ded in clone () from /lib64/libc.so.6
(gdb) f 3
#3  NdbEventBuffer::deleteUsedEventOperations (this=0x45d5e60)
    at NdbEventOperationImpl.cpp:1432

Error log shows:

/data0/cr_autotest/libexec/mysqld(my_print_stacktrace+0x39)[0x96b739]
/data0/cr_autotest/libexec/mysqld(handle_segfault+0x31d)[0x6270ed]
/lib64/libpthread.so.0[0x3907e0de80]
/data0/cr_autotest/libexec/mysqld(_ZN14NdbEventBuffer25deleteUsedEventOperationsEv+0x35)[0x919485]
/data0/cr_autotest/libexec/mysqld(_ZN14NdbEventBuffer9nextEventEv+0x1ed)[0x91d4ad]
/data0/cr_autotest/libexec/mysqld(ndb_binlog_thread_func+0xdf2)[0x7cdda2]
/lib64/libpthread.so.0[0x3907e06307]
/lib64/libc.so.6(clone+0x6d)[0x39072d1ded]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at (nil) is an invalid pointer
thd->thread_id=1
thd->killed=NOT_KILLED
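
The frames in the error log above use mangled C++ names. As a reading aid, here is a minimal sketch of decoding them with `abi::__cxa_demangle`, assuming a GCC or Clang toolchain (Itanium C++ ABI). The helper name `demangle` is made up for illustration and is not part of mysqld or the NDB API.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the demangled form of `mangled`, or the
// input unchanged if it is not a valid mangled name.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc; the caller must free it.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

Passing `_ZN14NdbEventBuffer25deleteUsedEventOperationsEv` yields `NdbEventBuffer::deleteUsedEventOperations()`, which matches frame #3 of the gdb backtrace. The `c++filt` command-line tool does the same job for a whole log.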

How to repeat:
Not sure how repeatable this is as I have only hit it once.
[24 Sep 2008 14:25] Tomas Ulin
The bug can appear during replication and can manifest itself in many ways.
[24 Sep 2008 14:26] Tomas Ulin
In a release build it will typically mean that a table stops logging.

In a debug build there will be an assertion and a core.
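
For readers unfamiliar with why the two builds differ, here is a generic sketch of how a standard `assert` behaves with and without `NDEBUG`. It is not the NDB code and does not describe the actual defect. The type `TableSubscription` and the function `process_event` are hypothetical names invented for illustration.

```cpp
#include <cassert>

// Hypothetical state for one table's event subscription (not an NDB type).
struct TableSubscription {
    bool valid;    // false once internal bookkeeping is inconsistent
    bool logging;  // whether changes to the table are still being logged
};

// Processes one event for the subscription; returns true if it was logged.
bool process_event(TableSubscription& sub) {
    // Debug build: a failed assert calls abort(), which produces a core.
    // Release build (NDEBUG defined): the assert expands to nothing.
    assert(sub.valid);
    if (!sub.valid) {
        // Only reachable when asserts are compiled out: the problem is
        // not reported and logging for this table simply stops.
        sub.logging = false;
        return false;
    }
    return true;
}
```

Under these assumptions, the same inconsistent state aborts a debug build at the `assert`, while a build compiled with `-DNDEBUG` continues and quietly stops logging that table.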
[25 Sep 2008 14:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/54518

2670 Tomas Ulin	2008-09-24
      Bug #39404  	Core in NdbEventBuffer::deleteUsedEventOperations()
[25 Sep 2008 14:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/54519

2670 Tomas Ulin	2008-09-24
      Bug #39404  	Core in NdbEventBuffer::deleteUsedEventOperations()
[25 Sep 2008 23:12] Tomas Ulin
Fixed in the next 6.2 release and forward.
[26 Sep 2008 18:36] Jon Stephens
Documented in the 5.1.28-ndb-6.2.16 and 5.1.28-ndb-6.3.18 changelogs as follows:

        In some cases, dropping a database on the master could cause table
        logging to fail on the slave, or, when using a debug build, could cause
        the slave mysqld to fail completely.
[13 Dec 2008 0:26] Bugs System
Pushed into 6.0.7-alpha  (revid:tomas.ulin@sun.com-20080924122711-7bozt1e2q10cszhp)
(version source revid:jonas@mysql.com-20080925105539-wd6gbofp5alv9j93) (pib:5)