Bug #43206 Falcon's cardinality thread crashes during update of cardinalities
Submitted: 25 Feb 2009 20:24 Modified: 26 May 2010 17:52
Reporter: Olav Sandstå Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:6.0.10-alpha OS:Any
Assigned to: Olav Sandstå CPU Architecture:Any
Tags: F_SHUTDOWN

[25 Feb 2009 20:24] Olav Sandstå
Description:
During running of the embedded test suite the following crash of Falcon's cardinality thread has occurred once when running the main.sysdate_is_now test. The call stack looks like this:

(gdb) where
#0  0x000000302420b122 in pthread_kill () from /lib64/libpthread.so.0
#1  0x00000000005b6383 in signal_handler (sig=11) at mysqltest.cc:7427
#2  <signal handler called>
#3  PreparedStatement::executeUpdate (this=0x2aaaab00fbb0)
    at PreparedStatement.cpp:83
#4  0x0000000000a0e1de in Database::updateCardinalities (this=0x2aaaaae0dbd0)
    at Database.cpp:2518
#5  0x0000000000a10108 in Database::cardinalityThreadMain (
    database=<value optimized out>) at Database.cpp:2457
#6  0x0000000000961334 in Thread::thread (this=0x2aaaab0187e8)
    at Thread.cpp:166
#7  0x0000000000961519 in Thread::thread (parameter=0x2aaaab00fbb0)
    at Thread.cpp:145
#8  0x00000030242062f7 in start_thread () from /lib64/libpthread.so.0
#9  0x00000030236d1b6d in clone () from /lib64/libc.so.6

How to repeat:
Seen once while running the embedded test suite:

  ./mysql-test-run.pl --embedded --mem

on Linux.

Suggested fix:
Falcon should not seg.fault.
[25 Feb 2009 20:28] Olav Sandstå
The crash happens in the following code:

int PreparedStatement::executeUpdate()
{
	if (!statement->node)
		throw SQLEXCEPTION (RUNTIME_ERROR, "statement is non-executable");

At this point the statement pointer is null:

(gdb) print statement
$1 = (class CompiledStatement *) 0x0

The PreparedStatement object looks like this:

(gdb) print *this
$3 = {<Nfs::Statement> = {_vptr.Statement = 0xdab2f0, 
    database = 0x2aaaaae0dbd0, parent = 0x0, connection = 0x0, 
    statement = 0x0, parameters = {_vptr.Values = 0xdaf6b0, count = 3, 
      values = 0x2aaaab068368}, values = {_vptr.Values = 0xdaf6b0, count = 3, 
      values = 0x2aaaab0683e8}, slots = 0x0, sorts = 0x0, bitmaps = 0x0, 
    valueSets = 0x0, rowSlots = 0x0, handle = 490, numberContexts = 0, 
    numberSorts = 0, numberBitmaps = 0, numberValueSets = 0, 
    numberRowSlots = 0, transaction = 0x0, resultSets = 0x0, 
    resultLists = 0x0, cursorName = 0x0, currentResultSet = 0, 
    recordsUpdated = 1, tableFilter = 0x0, next = 0x0, prior = 0x0, 
    connectionNext = 0x0, eof = false, active = false, 
    updateStatements = true, special = false, stats = {indexHits = 1, 
      indexFetches = 1, exhaustiveFetches = 0, recordsFetched = 1, 
      recordsReturned = 0, inserts = 0, updates = 1, deletions = 0, 
      replaces = 0, records = 0, recordsSorted = 0}, contexts = 0x0, 
    useCount = 1, javaCount = 0, syncObject = {<SynchronizationObject> = {
        _vptr.SynchronizationObject = 0xd82ab0}, monitorCount = 0, 
      mutex = {<SynchronizationObject> = {
          _vptr.SynchronizationObject = 0xd82ab0}, holder = 0x0, 
        description = 0xd82a37 "SyncObject::mutex", mutex = {__data = {
            __lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = -1, 
            __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
          __size = '\0' <repeats 16 times>, "����", '\0' <repeats 19 times>, 
          __align = 0}}, queue = 0x0, exclusiveThread = 0x0, readers = 0, 
      waiters = 0, lockState = 0, stalls = 0, objectId = 131010, 
      sharedCount = 0, collisionCount = 0, exclusiveCount = 1, waitCount = 0, 
      queueLength = 0, location = 0xdaa465 "Statement::close", 
      name = 0xdc196c "Statement::syncObject"}}, <No data fields>}
[25 Feb 2009 20:31] Olav Sandstå
The call stack for all threads:

(gdb) thread apply all where

Thread 10 (process 12439):
#0  0x000000302420a687 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x0000000000a7c5d6 in Synchronize::sleep (this=0x2aaaab00e0b0, 
    milliseconds=10000, callersMutex=0x2aaaaae38db0) at Synchronize.cpp:183
#2  0x0000000000952efe in SyncObject::wait (this=0x2aaaaae38da0, 
    type=<value optimized out>, thread=0x2aaaab00e0b0, 
    sync=0xffffffffffffffff, timeout=0) at SyncObject.cpp:662
#3  0x0000000000952041 in Sync::lock (this=0x7fff5d24ae80, type=Exclusive)
    at Sync.cpp:58
#4  0x00000000009feaec in Cache::shutdownThreads (this=0x2aaaaae38ab8)
    at Cache.cpp:1054
#5  0x00000000009ff128 in Cache::shutdown (this=0x2aaaaae38ab8)
    at Cache.cpp:1031
#6  0x0000000000a0a9b9 in Database::shutdown (this=0x2aaaaae0dbd0)
    at Database.cpp:1634
#7  0x0000000000a05f0e in Connection::shutdownDatabase (
    this=<value optimized out>) at Connection.cpp:1803
#8  0x00000000009473de in StorageDatabase::close (this=0x2aaaab00e1e8)
    at StorageDatabase.cpp:911
#9  0x000000000094a36a in StorageHandler::shutdownHandler (this=0x2aaaaae0d040)
    at StorageHandler.cpp:194
#10 0x0000000000943166 in StorageInterface::panic (hton=<value optimized out>, 
    flag=1562684896) at ha_falcon.cpp:286
#11 0x000000000084bf2b in ha_finalize_handlerton (plugin=0x1a9c0e88)
    at handler.cc:393
#12 0x000000000066e376 in plugin_deinitialize (plugin=0x1a9c0e88, 
    ref_check=true) at sql_plugin.cc:821
#13 0x000000000066f5be in reap_plugins () at sql_plugin.cc:909
#14 0x000000000066ff0b in plugin_shutdown () at sql_plugin.cc:1516
#15 0x00000000005cf3c7 in end_embedded_server () at ../sql/mysqld.cc:1372
#16 0x00000000005cb5c9 in mysql_server_end () at libmysql.c:202
#17 0x00000000005b8338 in cleanup_and_exit (exit_code=-1426005812)
    at mysqltest.cc:1148
#18 0x00000000005c313a in main (argc=<value optimized out>, argv=0x1)
    at mysqltest.cc:8053

Thread 9 (process 12447):
#0  0x000000302420a496 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x0000000000a7c702 in Synchronize::sleep (this=0x2aaaab00f028)
    at Synchronize.cpp:123
#2  0x0000000000a013fa in Cache::ioThread (this=0x2aaaaae38ab8)
    at Cache.cpp:987
#3  0x0000000000961334 in Thread::thread (this=0x2aaaab00f028)
    at Thread.cpp:166
#4  0x0000000000961519 in Thread::thread (parameter=0x2aaaab00f044)
    at Thread.cpp:145
#5  0x00000030242062f7 in start_thread () from /lib64/libpthread.so.0
---Type <return> to continue, or q <return> to quit---
#6  0x00000030236d1b6d in clone () from /lib64/libc.so.6

Thread 8 (process 12449):
#0  0x000000302420a687 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x0000000000a7c5d6 in Synchronize::sleep (this=0x2aaaab012808, 
    milliseconds=1000, callersMutex=0x0) at Synchronize.cpp:183
#2  0x0000000000a7c43c in Synchronize::sleep (this=0x2aaaab012824, 
    milliseconds=1121054848) at Synchronize.cpp:136
#3  0x0000000000a1019c in Database::ticker (database=<value optimized out>)
    at Database.cpp:1933
#4  0x0000000000961334 in Thread::thread (this=0x2aaaab012808)
    at Thread.cpp:166
#5  0x0000000000961519 in Thread::thread (parameter=0x2aaaab012824)
    at Thread.cpp:145
#6  0x00000030242062f7 in start_thread () from /lib64/libpthread.so.0
#7  0x00000030236d1b6d in clone () from /lib64/libc.so.6

Thread 7 (process 12451):
#0  0x000000302420a496 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x0000000000a7c702 in Synchronize::sleep (this=0x2aaaab018920)
    at Synchronize.cpp:123
#2  0x0000000000a1023c in Database::scavengerThreadMain (
    database=<value optimized out>) at Database.cpp:1960
#3  0x0000000000961334 in Thread::thread (this=0x2aaaab018920)
    at Thread.cpp:166
#4  0x0000000000961519 in Thread::thread (parameter=0x2aaaab01893c)
    at Thread.cpp:145
#5  0x00000030242062f7 in start_thread () from /lib64/libpthread.so.0
#6  0x00000030236d1b6d in clone () from /lib64/libc.so.6

Thread 6 (process 12453):
#0  0x000000302420a496 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x0000000000a7c702 in Synchronize::sleep (this=0x2aaaab019098)
    at Synchronize.cpp:123
#2  0x0000000000961402 in Thread::thread (this=0x2aaaab019098)
    at Thread.cpp:192
#3  0x0000000000961519 in Thread::thread (parameter=0x2aaaab0190b4)
    at Thread.cpp:145
#4  0x00000030242062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x00000030236d1b6d in clone () from /lib64/libc.so.6

Thread 5 (process 12454):
#0  0x000000302420a496 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x0000000000a7c702 in Synchronize::sleep (this=0x2aaaab0191d0)
    at Synchronize.cpp:123
#2  0x0000000000961402 in Thread::thread (this=0x2aaaab0191d0)
---Type <return> to continue, or q <return> to quit---
    at Thread.cpp:192
#3  0x0000000000961519 in Thread::thread (parameter=0x2aaaab0191ec)
    at Thread.cpp:145
#4  0x00000030242062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x00000030236d1b6d in clone () from /lib64/libc.so.6

Thread 4 (process 12455):
#0  0x000000302420a496 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x0000000000a7c702 in Synchronize::sleep (this=0x2aaaab019308)
    at Synchronize.cpp:123
#2  0x0000000000961402 in Thread::thread (this=0x2aaaab019308)
    at Thread.cpp:192
#3  0x0000000000961519 in Thread::thread (parameter=0x2aaaab019324)
    at Thread.cpp:145
#4  0x00000030242062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x00000030236d1b6d in clone () from /lib64/libc.so.6

Thread 3 (process 12456):
#0  0x000000302420a496 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x0000000000a7c702 in Synchronize::sleep (this=0x2aaaab019440)
    at Synchronize.cpp:123
#2  0x0000000000961402 in Thread::thread (this=0x2aaaab019440)
    at Thread.cpp:192
#3  0x0000000000961519 in Thread::thread (parameter=0x2aaaab01945c)
    at Thread.cpp:145
#4  0x00000030242062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x00000030236d1b6d in clone () from /lib64/libc.so.6

Thread 2 (process 12457):
#0  0x000000302420a496 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x0000000000a7c702 in Synchronize::sleep (this=0x2aaaab019578)
    at Synchronize.cpp:123
#2  0x0000000000961402 in Thread::thread (this=0x2aaaab019578)
    at Thread.cpp:192
#3  0x0000000000961519 in Thread::thread (parameter=0x2aaaab019594)
    at Thread.cpp:145
#4  0x00000030242062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x00000030236d1b6d in clone () from /lib64/libc.so.6

Thread 1 (process 12450):
#0  0x000000302420b122 in pthread_kill () from /lib64/libpthread.so.0
#1  0x00000000005b6383 in signal_handler (sig=11) at mysqltest.cc:7427
#2  <signal handler called>
#3  PreparedStatement::executeUpdate (this=0x2aaaab00fbb0)
    at PreparedStatement.cpp:83
#4  0x0000000000a0e1de in Database::updateCardinalities (this=0x2aaaaae0dbd0)
    at Database.cpp:2518
---Type <return> to continue, or q <return> to quit---
#5  0x0000000000a10108 in Database::cardinalityThreadMain (
    database=<value optimized out>) at Database.cpp:2457
#6  0x0000000000961334 in Thread::thread (this=0x2aaaab0187e8)
    at Thread.cpp:166
#7  0x0000000000961519 in Thread::thread (parameter=0x2aaaab00fbb0)
    at Thread.cpp:145
#8  0x00000030242062f7 in start_thread () from /lib64/libpthread.so.0
#9  0x00000030236d1b6d in clone () from /lib64/libc.so.6
(gdb)
[25 Feb 2009 20:44] Olav Sandstå
One initial comment:

The crash in the update cardinality thread occurs during shutdown of Falcon. 
At this point Falcon has closed the serial log and is shutting down the page cache (see thread 10 in the call stacks). I think the cardinality thread should have been shut down earlier in the shutdown process.
[25 Feb 2009 20:58] MySQL Verification Team
I couldn't repeat with below revno. Could you please try latest source tree or provide the exactly server tree you are testing. Thanks in advance.

miguel@hegel:~/bzr/mysql-6.0-build$ bzr revno
2725
miguel@hegel:~/bzr/mysql-6.0-build$ cat /etc/issue
Ubuntu 8.10 \n \l
[25 Feb 2009 21:12] Olav Sandstå
Miguel, thanks for trying to reproduce this bug. I should have added that it is likely not easy to reproduce it. This bug has only shown up once after having run the embedded test suite more or less continually on three linux servers for almost a week.

The source I used was taken from the mysql-6.0-falcon-tree with the following revision number:

olav@nanna29:~/<2>repo/falcon-bug40633-linux2> bzr revno
3034

OS version is:

Red Hat Enterprise Linux Server release 5.2 (Tikanga)

Should I change the state of the bug to analyzing or verified? (I have only seen it once)
[25 Feb 2009 23:22] Olav Sandstå
A second crash of the cardinality thread occurred in another run of the embedded test suite. Although the call stack is slightly different this crash too happened during shutdown of Falcon:

#0  0x000000302420b122 in pthread_kill () from /lib64/libpthread.so.0
#1  0x00000000005b6383 in signal_handler (sig=6) at mysqltest.cc:7427
#2  <signal handler called>
#3  0x000000302420dd4d in raise () from /lib64/libpthread.so.0
#4  0x0000000000a1d652 in Error::error (string=<value optimized out>) at Error.cpp:94
#5  0x0000000000a755c5 in Nfs::Statement::getContext (this=0x2aaaab00fbb0, contextId=0)
    at Statement.cpp:508
#6  0x0000000000b0ba6f in FsbInversion::fetch (this=0x2aaaab022240, statement=0x2aaaab00fbb0)
    at FsbInversion.cpp:62
#7  0x0000000000b0bd2e in FsbSieve::fetch (this=0x2aaaab069140, statement=0x2aaaab00fbb0)
    at FsbSieve.cpp:56
#8  0x0000000000b23aa5 in NUpdate::evalStatement (this=0x2aaaab0708f0, statement=0x2aaaab00fbb0)
    at NUpdate.cpp:135
#9  0x0000000000a76c1c in Nfs::Statement::start (this=0x2aaaab00fbb0, node=0x2aaaab0708f0)
    at Statement.cpp:492
#10 0x0000000000a39036 in PreparedStatement::executeUpdate (this=0x2aaaab00fbb0)
    at PreparedStatement.cpp:86
#11 0x0000000000a0e68e in Database::updateCardinalities (this=0x2aaaaae0dbd0) at Database.cpp:2518
#12 0x0000000000a105b8 in Database::cardinalityThreadMain (database=<value optimized out>)
    at Database.cpp:2457
#13 0x0000000000961734 in Thread::thread (this=0x2aaaab0187e8) at Thread.cpp:166
#14 0x0000000000961919 in Thread::thread (parameter=0x5ec1) at Thread.cpp:145
#15 0x00000030242062f7 in start_thread () from /lib64/libpthread.so.0
#16 0x00000030236d1b6d in clone () from /lib64/libc.so.6

I am setting this to verified but setting fairly low values on how serious this is given that it only happens during shutdown.
[26 Feb 2009 12:36] Olav Sandstå
The dump of the statement object from the second crash:

(gdb) print this
$5 = (Nfs::Statement * const) 0x2aaaab00fbb0
(gdb) print *this
$6 = {_vptr.Statement = 0xdab8f0, database = 0x2aaaaae0dbd0, parent = 0x0,
  connection = 0x0, statement = 0x0, parameters = {_vptr.Values = 0xdafcb0,
    count = 3, values = 0x2aaaab06d0c8}, values = {_vptr.Values = 0xdafcb0,
    count = 3, values = 0x2aaaab06d148}, slots = 0x0, sorts = 0x0,
  bitmaps = 0x0, valueSets = 0x0, rowSlots = 0x0, handle = 490,
  numberContexts = 0, numberSorts = 0, numberBitmaps = 0, numberValueSets = 0,
  numberRowSlots = 0, transaction = 0x0, resultSets = 0x0, resultLists = 0x0,
  cursorName = 0x0, currentResultSet = 0, recordsUpdated = 1,
  tableFilter = 0x0, next = 0x0, prior = 0x0, connectionNext = 0x0,
  eof = false, active = false, updateStatements = true, special = false,
  stats = {indexHits = 1, indexFetches = 1, exhaustiveFetches = 0,
    recordsFetched = 1, recordsReturned = 0, inserts = 0, updates = 1,
    deletions = 0, replaces = 0, records = 0, recordsSorted = 0},
  contexts = 0x0, useCount = 1, javaCount = 0,
  syncObject = {<SynchronizationObject> = {
      _vptr.SynchronizationObject = 0xd830b0}, monitorCount = 0,
    mutex = {<SynchronizationObject> = {
        _vptr.SynchronizationObject = 0xd830b0}, holder = 0x0,
      description = 0xd83037 "SyncObject::mutex", mutex = {__data = {
          __lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = -1,
          __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
        __size = '\0' <repeats 16 times>, "ÿÿÿÿ", '\0' <repeats 19 times>,
        __align = 0}}, queue = 0x0, exclusiveThread = 0x0, readers = 0,
    waiters = 0, lockState = 0, stalls = 0, objectId = 131010,
    sharedCount = 0, collisionCount = 0, exclusiveCount = 1, waitCount = 0,
    queueLength = 0, location = 0xdaaa65 "Statement::close",
    name = 0xdc1f6c "Statement::syncObject"}}
[26 Feb 2009 23:35] Kevin Lewis
Kevin Lewis wrote:  Olav,  Is it related to thread interaction?   

Olav wrote;  I think it is related to thread interaction. Or more specifically, my guess is that we shut down "too quickly" to allow the Update cardinality thread to complete what it is doing so we end up closing resources (eg the prepared statement it uses) while it is still running. So in that kind it is a bit similar to the issue with the Cache I (hopefully) just have fixed.

Feel free to assign it to me but I do not think this is a high priority bug (the only thing that votes for fixing it now is that I seems to have a test configuration setup now that is able to reproduce it if I let it run long enough). I really feel that I should spend my time on fixing issues that can help in the evaluation than "shutdown issues" in embedded test that do not even use Falcon :-)