Bug #36436 Falcon crash in Cache::fetchPage line 286
Submitted: 30 Apr 2008 17:21 Modified: 20 Oct 2008 13:10
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S1 (Critical)
Version:6.0.5 OS:Any
Assigned to: Philip Stoev CPU Architecture:Any
Tags: F_PAGE IO

[30 Apr 2008 17:21] Philip Stoev
Description:
When running a test involving DML and server restarts, Falcon crashed as follows. The stack trace implies that a "wrong page type" was to be printed, however none is found in the error log.

#0  0x00110402 in __kernel_vsyscall ()
#1  0x00bdc617 in pthread_kill () from /lib/libpthread.so.0
#2  0x08419397 in write_core (sig=6) at stacktrace.c:305
#3  0x0828a03f in handle_segfault (sig=6) at mysqld.cc:2624
#4  <signal handler called>
#5  0x00110402 in __kernel_vsyscall ()
#6  0x00bdf891 in raise () from /lib/libpthread.so.0
#7  0x084e0416 in Error::debugBreak () at Error.cpp:92
#8  0x084e046e in Error::error (string=0x88655d4 "page %d/%d wrong page type, expected %d got %d\n") at Error.cpp:69
#9  0x08568cf6 in Cache::fetchPage (this=0xb7016c58, dbb=0xb701fe78, pageNumber=45, pageType=PAGE_record_locator, lockType=Exclusive) at Cache.cpp:286
#10 0x084d51b7 in Dbb::fetchPage (this=0xb701fe78, pageNumber=45, pageType=PAGE_record_locator, lockType=Exclusive) at Dbb.cpp:219
#11 0x08537791 in Section::deleteSectionLevel (dbb=0xb701fe78, pageNumber=38, transId=0) at Section.cpp:973
#12 0x08537a16 in Section::deleteSection (dbb=0xb701fe78, sectionId=18, transId=0) at Section.cpp:948
#13 0x0852c92e in SRLDropTable::commit (this=0xb380df48) at SRLDropTable.cpp:102
#14 0x08548b1b in SerialLogTransaction::commit (this=0xb71bce70) at SerialLogTransaction.cpp:91
#15 0x08548e43 in SerialLogTransaction::doAction (this=0xb71bce70) at SerialLogTransaction.cpp:157
#16 0x085721c7 in Gopher::gopherThread (this=0xb7018430) at Gopher.cpp:69
#17 0x085722f6 in Gopher::gopherThread (arg=0xb7018430) at Gopher.cpp:37
#18 0x084a59d3 in Thread::thread (this=0xb7053ee0) at Thread.cpp:161
#19 0x084a5e75 in Thread::thread (parameter=0xb7053ee0) at Thread.cpp:140
#20 0x00bd750b in start_thread () from /lib/libpthread.so.0
#21 0x00b18b2e in clone () from /lib/libc.so.6

How to repeat:
Simplified test case will hopefully follow shortly.
[26 May 2008 11:59] Philip Stoev
Test case for bug 36436

Attachment: bug36436.test (application/octet-stream, text), 16.33 KiB.

[26 May 2008 12:05] Philip Stoev
Please find attached a test case for this bug. Unfortunately, the bug appears to be dependent on some subtile timing, so you may need to run the test case more than once. Further attempts to simplify the test case caused the bug to disappear. 6.0-falcon-team does not appear to be affected. To reproduce:

cd mysql-test
rm -rf var
./mysql-test-run.pl --user=root --skip-ndb --mysqld=--skip-innodb --mysqld=--default-storage-engine=falcon --start-and-exit
./mysql-test-run.pl --socket=var/tmp/master.sock --extern --user=root bug36436
../bin/mysqladmin --no-defaults --user=root --socket=var/tmp/master.sock shutdown
./mysql-test-run.pl --user=root --skip-ndb --mysql=--skip-innodb --start-dirty

The last statement will report that the server can not start. Core fille will be in var/master-data.

Basically this means we start a server, run some queries, and then shutdown the server cleanly. When we start the server again, crash happens. Backtrace is:

#0  0x00110416 in __kernel_vsyscall ()
#1  0x00581c78 in pthread_kill () from /lib/libpthread.so.0
#2  0x083c9d6f in write_core (sig=6) at stacktrace.c:305
#3  0x08275349 in handle_segfault (sig=6) at mysqld.cc:2624
#4  <signal handler called>
#5  0x00110416 in __kernel_vsyscall ()
#6  0x00584fe0 in raise () from /lib/libpthread.so.0
#7  0x0847d03e in Error::debugBreak () at Error.cpp:92
#8  0x0847d096 in Error::error (string=0x87b4368 "page %d/%d wrong page type, expected %d got %d\n") at Error.cpp:69
#9  0x08504ce6 in Cache::fetchPage (this=0xb70aab80, dbb=0xb70b30f0, pageNumber=13, pageType=PAGE_data, lockType=Exclusive) at Cache.cpp:286
#10 0x08471e7f in Dbb::fetchPage (this=0xb70b30f0, pageNumber=13, pageType=PAGE_data, lockType=Exclusive) at Dbb.cpp:219
#11 0x084d346a in Section::storeRecord (this=0xb70b75f8, recordLocatorPage=0xb7098000, indexPageNumber=8, index=0xb7098050, stream=0xbf89f534, transId=92,
    earlyWrite=false) at Section.cpp:674
#12 0x084d59cb in Section::updateRecord (this=0xb70b75f8, recordNumber=7, stream=0xbf89f534, transId=92, earlyWrite=false) at Section.cpp:586
#13 0x08474abd in Dbb::updateRecord (this=0xb70b30f0, sectionId=0, recordId=7, stream=0xbf89f534, transId=92, earlyWrite=false) at Dbb.cpp:348
#14 0x084cd780 in SRLUpdateRecords::redo (this=0xbf89fd94) at SRLUpdateRecords.cpp:298
#15 0x084dce4f in SerialLog::recover (this=0xb73f9ce8) at SerialLog.cpp:358
#16 0x0846d545 in Database::openDatabase (this=0xb72ab618, filename=0xbf8a00c0 "/build/6.0.5/mysql-test/var/master-data/falcon_master.fts")
    at Database.cpp:743
#17 0x08463429 in Connection::getDatabase (this=0xb70aa440, dbName=0xb70aa1fc "FALCON_MASTER",
    dbFileName=0xbf8a00c0 "/build/6.0.5/mysql-test/var/master-data/falcon_master.fts", threads=0xb70aa248) at Connection.cpp:1653
#18 0x08463660 in Connection::openDatabase (this=0xb70aa440, dbName=0xb70aa1fc "FALCON_MASTER", filename=0xb70aa224 "falcon_master.fts",
    account=0x879efd0 "mysql", password=0x879efd0 "mysql", address=0x0, parent=0xb70aa248) at Connection.cpp:931
#19 0x084273c1 in StorageDatabase::getOpenConnection (this=0xb70aa110) at StorageDatabase.cpp:133
#20 0x08429e83 in StorageHandler::initialize (this=0xb72ab020) at StorageHandler.cpp:976
#21 0x08416892 in StorageInterface::falcon_init (p=0x9564c80) at ha_falcon.cpp:204
#22 0x0836dcb3 in ha_initialize_handlerton (plugin=0x9541c88) at handler.cc:428
#23 0x083f79af in plugin_initialize (plugin=0x9541c88) at sql_plugin.cc:1011
#24 0x083fc799 in plugin_init (argc=0x8959860, argv=0x95255d0, flags=0) at sql_plugin.cc:1217
#25 0x08278472 in init_server_components () at mysqld.cc:3962
#26 0x0827b94f in main (argc=31, argv=0xbf8a0764) at mysqld.cc:4394

Please let me know if you require any further information, binaries, tablespaces, etc.
[4 Jun 2008 16:13] Philip Stoev
This bug was observed in 6.0.5 . It is not observed in the current 6.0-falcon-team. Nevertheless, it is a recovery problem which merits further research.
[4 Jun 2008 16:20] Philip Stoev
Philip to check with self-compiled 6.0.5, build-compiled 6.0-falcon-team.
[9 Jun 2008 17:42] Philip Stoev
Could not reproduce with self-compiled debug binary from the 6.0.5 source.
[9 Jun 2008 18:20] Philip Stoev
Could not reproduce with self-compiled non-debug binary from the 6.0.5 source.
[9 Jun 2008 18:43] Philip Stoev
Reproducible on the debug 6.0.5 binary from the build team. Not reproducible on the non-debug 6.0.5 binary from the build team.

The message is "page %d/%d wrong page type, expected %d got %d" in  Cache::fetchPage, with "expected" being PAGE_data and "got" being 11 = PAGE_free.
[9 Jun 2008 19:24] Philip Stoev
Not reproducible with 6.0-falcon-team built with the ./configure line from the build team.  Will test the 6.0.5 rebuild when it arrives.
[14 Jun 2008 8:17] Philip Stoev
Not reproducible on the 6.0.5 rebuild.