Bug #41513 Falcon deadlock Bdb::addRef and Cache::fetchPage
Submitted: 16 Dec 2008 15:16 Modified: 20 Dec 2013 8:18
Reporter: Qi Gao
Status: Won't fix
Category: MySQL Server: Falcon storage engine Severity: S3 (Non-critical)
Version: 6.0.8 OS: Any
Assigned to: CPU Architecture:Any

[16 Dec 2008 15:16] Qi Gao
Description:
Hello,

When running the lock unit test with the Falcon engine, the server appears to enter a deadlock in which each of two threads requests a lock held by the other.

The backtraces of the threads involved, at the points where they acquire and request the locks, are pasted below:

Thread 7, acquire Bdb lock (exclusive):
0x851CB0C: Bdb::addRef(LockType) (BDB.cpp:111)
0x84917CC: Cache::fetchPage(Dbb*, int, PageType, LockType) (Cache.cpp:240)
0x84A81DE: Dbb::fetchPage(int, PageType, LockType) (Dbb.cpp:221)
0x84E7BF6: Section::deleteSectionLevel(Dbb*, int, unsigned) (Section.cpp:953)
0x84EA00D: Section::deleteSection(Dbb*, int, unsigned) (Section.cpp:948)
0x850517E: SRLDropTable::commit() (SRLDropTable.cpp:99)
0x84F5C4E: SerialLogTransaction::commit() (SerialLogTransaction.cpp:92)
0x84F5CE4: SerialLogTransaction::doAction() (SerialLogTransaction.cpp:158)
0x8520ECC: Gopher::gopherThread() (Gopher.cpp:71)
0x8520FEC: Gopher::gopherThread(void*) (Gopher.cpp:37)
0x848003B: Thread::thread() (Thread.cpp:167)
0x84801BB: Thread::thread(void*) (Thread.cpp:146)

Thread 15, acquire Cache lock (exclusive):
0x8470F05: Sync::lock(LockType) (Sync.cpp:58)
0x8491787: Cache::fetchPage(Dbb*, int, PageType, LockType) (Cache.cpp:227)
0x84A81DE: Dbb::fetchPage(int, PageType, LockType) (Dbb.cpp:221)
0x84BCC76: IndexRootPage::findRoot(Dbb*, int, int, LockType, unsigned) (IndexRootPage.cpp:334)
0x84BDF35: IndexRootPage::findLeaf(Dbb*, int, int, IndexKey*, LockType, unsigned) (IndexRootPage.cpp:225)
0x84BF810: IndexRootPage::scanIndex(Dbb*, int, int, IndexKey*, IndexKey*, int, unsigned, Bitmap*) (IndexRootPage.cpp:383)
0x84BB143: Index::scanIndex(IndexKey*, IndexKey*, int, Transaction*, Bitmap*) (Index.cpp:472)
0x84C96CA: NBitmap::evalInversion(Nfs::Statement*) (NBitmap.cpp:80)
0x84B49DF: FsbInversion::open(Nfs::Statement*) (FsbInversion.cpp:54)
0x84B4BCA: FsbSieve::open(Nfs::Statement*) (FsbSieve.cpp:49)
0x84D0811: NSelect::evalStatement(Nfs::Statement*) (NSelect.cpp:339)
0x8510083: Nfs::Statement::start(NNode*) (Statement.cpp:488)
0x84D5CA6: PreparedStatement::executeQuery() (PreparedStatement.cpp:99)
0x846D8AE: StorageTableShare::lookupPathName() (StorageTableShare.cpp:781)
0x846DBB2: StorageTableShare::tableExists() (StorageTableShare.cpp:766)
0x846B8A7: StorageHandler::createTable(char const*, char const*, bool) (StorageHandler.cpp:665)
0x8461F04: StorageInterface::create(char const*, TABLE*, st_ha_create_information*) (ha_falcon.cpp:799)
0x834D856: handler::ha_create(char const*, TABLE*, st_ha_create_information*) (handler.cc:3283)
0x834DE04: ha_create_table(THD*, char const*, char const*, char const*, st_ha_create_information*, bool) (handler.cc:3494)

Thread 7, request for Cache lock (shared):
0x8470F05: Sync::lock(LockType) (Sync.cpp:58)
0x8491787: Cache::fetchPage(Dbb*, int, PageType, LockType) (Cache.cpp:227)
0x84A81DE: Dbb::fetchPage(int, PageType, LockType) (Dbb.cpp:221)
0x84D4005: PageInventoryPage::freePage(Dbb*, int, unsigned) (PageInventoryPage.cpp:142)
0x84A77D6: Dbb::freePage(Bdb*, unsigned) (Dbb.cpp:609)
0x84E7DCD: Section::deleteSectionLevel(Dbb*, int, unsigned) (Section.cpp:998)
0x84EA00D: Section::deleteSection(Dbb*, int, unsigned) (Section.cpp:948)
0x850517E: SRLDropTable::commit() (SRLDropTable.cpp:99)
0x84F5C4E: SerialLogTransaction::commit() (SerialLogTransaction.cpp:92)
0x84F5CE4: SerialLogTransaction::doAction() (SerialLogTransaction.cpp:158)
0x8520ECC: Gopher::gopherThread() (Gopher.cpp:71)
0x8520FEC: Gopher::gopherThread(void*) (Gopher.cpp:37)
0x848003B: Thread::thread() (Thread.cpp:167)
0x84801BB: Thread::thread(void*) (Thread.cpp:146)

Thread 15, request for Bdb lock (exclusive):
0x851CB0C: Bdb::addRef(LockType) (BDB.cpp:111)
0x8490BF4: Cache::trialFetch(Dbb*, int, LockType) (Cache.cpp:739)
0x84A86DB: Dbb::trialFetch(int, PageType, LockType) (Dbb.cpp:226)
0x84D4201: PageInventoryPage::validateInventory(Dbb*, Validation*) (PageInventoryPage.cpp:256)
0x84A757E: Dbb::validate(int) (Dbb.cpp:677)
0x847D8D1: TableSpaceManager::validate(int) (TableSpaceManager.cpp:325)
0x84A0EA1: Database::validate(int) (Database.cpp:1692)
0x8499EC8: Connection::validate(int) (Connection.cpp:892)
0x846603C: StorageConnection::validate(int) (StorageConnection.cpp:451)
0x845E60D: StorageInterface::repair(THD*, st_ha_check_opt*) (ha_falcon.cpp:597)
0x834CCF5: handler::ha_repair(THD*, st_ha_check_opt*) (handler.cc:3068)
0x8370507: mysql_admin_table(THD*, TABLE_LIST*, st_ha_check_opt*, char const*, thr_lock_type, bool, bool, unsigned, int (*)(THD*, TABLE_LIST*, st_ha_check_opt*), int (handler::*)(THD*, st_ha_check_opt*), int (*)(THD*, TABLE_LIST*)) (sql_table.cc:4388)
0x8371695: mysql_repair_table(THD*, TABLE_LIST*, st_ha_check_opt*) (sql_table.cc:4625)
0x824FAC1: mysql_execute_command(THD*) (sql_parse.cc:2815)
0x82558C6: mysql_parse(THD*, char const*, unsigned, char const**) (sql_parse.cc:5634)
0x8256872: dispatch_command(enum_server_command, THD*, char*, unsigned) (sql_parse.cc:1009)
0x8257ACF: do_command(THD*) (sql_parse.cc:689)
0x8247319: handle_one_connection (sql_connect.cc:1156)
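
To make the cycle in the four traces explicit, here is a minimal sketch of a wait-for check over the reported state. It is not Falcon code: LockState, findWaitCycle and reportedState are hypothetical names, and it assumes both threads are contending for the same Bdb, which the traces suggest but do not prove.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of the reported state, not Falcon's own data structures.
struct LockState {
    std::map<std::string, int> holder;   // lock name -> id of the thread holding it
    std::map<int, std::string> waiting;  // thread id -> lock name it is blocked on
};

// Follow the wait-for chain from `start`. Returns the threads on the cycle if
// the chain leads back to `start`, otherwise an empty vector.
std::vector<int> findWaitCycle(const LockState& s, int start) {
    std::vector<int> path;
    int current = start;
    for (std::size_t step = 0; step <= s.waiting.size(); ++step) {
        path.push_back(current);
        auto w = s.waiting.find(current);
        if (w == s.waiting.end()) return {};   // this thread is not blocked
        auto h = s.holder.find(w->second);
        if (h == s.holder.end()) return {};    // the lock is free, so no edge
        current = h->second;
        if (current == start) return path;     // back at the start: deadlock
    }
    return {};  // the chain loops, but not through `start`
}

// The state described by the backtraces above (assumed to involve one Bdb).
LockState reportedState() {
    LockState s;
    s.holder["Bdb"] = 7;      // Thread 7 acquired the Bdb lock (exclusive)
    s.holder["Cache"] = 15;   // Thread 15 acquired the Cache lock (exclusive)
    s.waiting[7] = "Cache";   // Thread 7 requests the Cache lock (shared)
    s.waiting[15] = "Bdb";    // Thread 15 requests the Bdb lock (exclusive)
    return s;
}
```

Calling findWaitCycle(reportedState(), 7) walks 7 -> Cache -> 15 -> Bdb -> 7 and reports both threads, which is the lock-order inversion described above.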

How to repeat:
It does not happen frequently. I'll try to construct a case that triggers it more often.
[16 Dec 2008 15:52] Sveta Smirnova
Thank you for the report.

Which lock unit test do you run?
[16 Dec 2008 16:32] Qi Gao
Thanks for taking a look. 

I edited the lock.test file in mysql-test/t (adding a line to set the storage engine to Falcon) and ran it with the mysql-test-run script.
[16 Dec 2008 16:44] Sveta Smirnova
Thank you for the feedback.

I cannot repeat the problem using current development sources on 32-bit Linux. Please indicate your operating system and the MySQL package you use (file name).
[16 Dec 2008 16:57] Qi Gao
I'm using mysql-6.0.8-alpha.tar.gz on 32-bit Linux (Ubuntu 8.04 with a 2.6.24 kernel). I built from source with the configure options --with-debug --with-pthread --with-plugins=falcon,partition

Reproducing the deadlock depends on timing. Maybe you can start from the backtraces and see whether adding a sleep or yield between the two lock acquisitions helps reproduce it. Thanks a lot!
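
As an illustration of the sleep idea, here is a standalone sketch that does not touch Falcon itself. Two std::timed_mutex objects stand in for the Cache and Bdb locks, and a delay is injected between taking the first lock and requesting the second. TwoLocks, holdThenRequest and inversionObserved are hypothetical names, and timed lock attempts are used so the sketch reports the inversion instead of hanging.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the two locks; not Falcon's Sync or Bdb classes.
struct TwoLocks {
    std::timed_mutex cacheLock;
    std::timed_mutex bdbLock;
};

// Take `first`, sleep for `injectedDelay`, then try `second` for up to `timeout`.
// Returns true if `second` could not be taken while `first` was held. `first`
// stays held until both workers have finished, like threads that never back off.
bool holdThenRequest(std::timed_mutex& first, std::timed_mutex& second,
                     std::chrono::milliseconds injectedDelay,
                     std::chrono::milliseconds timeout,
                     std::atomic<int>& finished) {
    bool blocked = false;
    if (first.try_lock_for(timeout)) {
        std::this_thread::sleep_for(injectedDelay);  // the injected sleep
        if (second.try_lock_for(timeout)) {
            second.unlock();
        } else {
            blocked = true;                          // would have waited forever
        }
        ++finished;
        while (finished.load() < 2) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        first.unlock();
    } else {
        ++finished;  // never got the first lock, so nothing to hold
    }
    return blocked;
}

// Runs the two lock orders from the report concurrently and returns true when
// each worker was blocked on the lock the other one held.
bool inversionObserved(std::chrono::milliseconds injectedDelay,
                       std::chrono::milliseconds timeout) {
    TwoLocks locks;
    std::atomic<int> finished{0};
    bool gopherBlocked = false;
    bool clientBlocked = false;
    std::thread gopher([&] {  // order of Thread 7: Bdb first, then Cache
        gopherBlocked = holdThenRequest(locks.bdbLock, locks.cacheLock,
                                        injectedDelay, timeout, finished);
    });
    std::thread client([&] {  // order of Thread 15: Cache first, then Bdb
        clientBlocked = holdThenRequest(locks.cacheLock, locks.bdbLock,
                                        injectedDelay, timeout, finished);
    });
    gopher.join();
    client.join();
    return gopherBlocked && clientBlocked;
}
```

With a delay of a few hundred milliseconds both workers normally hold their first lock before either asks for the second, so the inversion shows up on nearly every run. With no delay the outcome depends on scheduling, which matches the observation that the deadlock is timing dependent.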
[23 Dec 2008 18:07] Qi Gao
I've reproduced this with valgrind and created a core dump. I've uploaded a file named bug_data_41513.tar.gz containing the mysqld I compiled and the core file. (The thread ids do not exactly match the ones in the backtraces I reported earlier.) The core file was generated on a 32-bit Ubuntu OS. Hope it helps.
[20 Dec 2013 8:18] Erlend Dahl
This project has been abandoned.