Bug #38748 | Deadlock in Falcon when running simple SELECT queries | ||
---|---|---|---|
Submitted: | 12 Aug 2008 11:26 | Modified: | 9 Jan 2009 13:58 |
Reporter: | Olav Sandstå | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S3 (Non-critical) |
Version: | 6.0.6 | OS: | Solaris (Solaris 10 on x64) |
Assigned to: | Olav Sandstå | CPU Architecture: | Any |
Tags: | F_TRANSACTION |
[12 Aug 2008 11:26]
Olav Sandstå
[12 Aug 2008 12:20]
Olav Sandstå
Using pstack and attaching with a debugger shows that: 1 thread (the scavenger thread) is block with the following call stack: current thread: t@11 [1] __lwp_park(0x0, 0x0), at 0xfec23dcb [2] cond_sleep_queue(0x90dc090, 0x90dc0a0, 0x0), at 0xfec1e4fc [3] cond_wait_queue(0x90dc090, 0x90dc0a0, 0x0, 0x0), at 0xfec1e60e [4] _cond_wait(0x90dc090, 0x90dc0a0), at 0xfec1eb07 [5] cond_wait(0x90dc090, 0x90dc0a0), at 0xfec1eb49 [6] _pthread_cond_wait(0x90dc090, 0x90dc0a0), at 0xfec1eb82 =>[7] Synchronize::sleep(this = 0x90dc080), line 123 in "Synchronize.cpp" [8] SyncObject::wait(this = 0x90cb24c, type = Exclusive, thread = 0x90dc080, sync = 0xfe1d6d98, timeout = 0), line 661 in "SyncObject.cpp" [9] SyncObject::lock(this = 0x90cb24c, sync = 0xfe1d6d98, type = Exclusive, timeout = 0), line 438 in "SyncObject.cpp" [10] Sync::lock(this = 0xfe1d6d98, type = Exclusive), line 58 in "Sync.cpp" [11] Transaction::commit(this = 0x913a0c0), line 297 in "Transaction.cpp" [12] Connection::commit(this = 0x90c51c0), line 272 in "Connection.cpp" [13] Database::commitSystemTransaction(this = 0x8ec41f0), line 1255 in "Database.cpp" [14] Database::updateCardinalities(this = 0x8ec41f0), line 2394 in "Database.cpp" [15] Database::scavenge(this = 0x8ec41f0), line 1702 in "Database.cpp" [16] Scavenger::scavenge(this = 0x90cac68), line 58 in "Scavenger.cpp" [17] Scavenger::execute(this = 0x90cac68, scheduler = 0x90cabd0), line 68 in "Scavenger.cpp" [18] Scheduler::schedule(this = 0x90cabd0), line 137 in "Scheduler.cpp" [19] Scheduler::schedule(lpParameter = 0x90cabd0), line 159 in "Scheduler.cpp" [20] Thread::thread(this = 0x90dc080), line 167 in "Thread.cpp" [21] Thread::thread(parameter = 0x90dc080), line 146 in "Thread.cpp" [22] _thr_setup(0xfe9d4a00), at 0xfec23a81 [23] _lwp_start(0x0, 0x0, 0x0, 0x90dc0a0, 0xfe9d4a00, 0xfec4d000), at 0xfec23d70 29 threads are block with the following call stack: [1] __lwp_park(0x0, 0x0), at 0xfec23dcb [2] cond_sleep_queue(0x9188818, 0x9188828, 0x0), at 0xfec1e4fc [3] cond_wait_queue(0x9188818, 0x9188828, 0x0, 0x0), at 0xfec1e60e [4] _cond_wait(0x9188818, 0x9188828), at 0xfec1eb07 [5] cond_wait(0x9188818, 0x9188828), at 0xfec1eb49 [6] _pthread_cond_wait(0x9188818, 0x9188828), at 0xfec1eb82 =>[7] Synchronize::sleep(this = 0x9188808), line 123 in "Synchronize.cpp" [8] SyncObject::wait(this = 0x90cb2b8, type = Shared, thread = 0x9188808, sync = 0xfd87ba3c, timeout = 0), line 661 in "SyncObject.cpp" [9] SyncObject::lock(this = 0x90cb2b8, sync = 0xfd87ba3c, type = Shared, timeout = 0), line 438 in "SyncObject.cpp" [10] Sync::lock(this = 0xfd87ba3c, type = Shared), line 58 in "Sync.cpp" [11] TransactionManager::findTransaction(this = 0x90cb1c8, transactionId = 114122U), line 386 in "TransactionManager.cpp" [12] Transaction::releaseDependencies(this = 0x9117ce8), line 775 in "Transaction.cpp" [13] Transaction::commitNoUpdates(this = 0x9117ce8), line 348 in "Transaction.cpp" [14] Transaction::commit(this = 0x9117ce8), line 246 in "Transaction.cpp" [15] Connection::commit(this = 0x90e1bb8), line 272 in "Connection.cpp" [16] StorageConnection::commit(this = 0x90ca370), line 146 in "StorageConnection.cpp" [17] StorageHandler::commit(this = 0x8ec3bd8, mySqlThread = 0x2e99f338), line 268 in "StorageHandler.cpp" [18] StorageInterface::commit(hton = 0x92dea78, thd = 0x2e99f338, all = true), line 1157 in "ha_falcon.cpp" [19] ha_commit_one_phase(thd = 0x2e99f338, all = true), line 1173 in "handler.cc" [20] ha_commit_trans(thd = 0x2e99f338, all = true), line 1142 in "handler.cc" [21] end_trans(thd = 0x2e99f338, completion = COMMIT), line 668 in "sql_parse.cc" [22] mysql_execute_command(thd = 0x2e99f338), line 3933 in "sql_parse.cc" [23] mysql_parse(thd = 0x2e99f338, inBuf = 0x2ebc83f0 "commit", length = 6U, found_semicolon = 0xfd87dd4c), line 5841 in "sql_parse.cc" [24] dispatch_command(command = COM_QUERY, thd = 0x2e99f338, packet = 0x2e9ad5a9 "commit", packet_length = 6U), line 1120 in "sql_parse.cc" [25] do_command(thd = 0x2e99f338), line 807 in "sql_parse.cc" [26] handle_one_connection(arg = 0x2e99f338), line 1153 in "sql_connect.cc" [27] _thr_setup(0xfe9da200), at 0xfec23a81 [28] _lwp_start(0x0, 0x0, 0x0, 0x9188828, 0xfe9da200, 0xfec4d000), at 0xfec23d70 On interesting observation: I run this test with 30 client threads. 29 of them are blocked but it seems that the last one actually is running and is able to make progress (strange...)
[5 Sep 2008 16:38]
Kevin Lewis
Olav, TransactionManager::findTransaction is trying to lock committedTransactions.syncObject. But higher up the stack, in commitNoUpdates(), it already has activeTransactions.syncObject locked. The other thread is in Transaction::commit. There was a time when Transaction::commit locked these two in the opposite order. But now that Transaction::commit locks activeTransactions.syncObject first, and then committedTransactions.syncObject, I do not think this deadlock will occur any more. The follow-on question is whether this code is needed in Transaction::releaseDependencies; if (transaction->transactionId != state->transactionId) { Transaction *transaction = database->transactionManager->findTransaction(state->transactionId); ASSERT(transaction == NULL); } It calls provides a second chance to find the transaction when transaction recycling hits some unspecified race condition. I think you should leave it as long as the deadlock does not happen.
[8 Sep 2008 13:04]
Olav Sandstå
Kevin, thanks for your comments about the deadlock. I think you are right about that this deadlock will no longer occur. When I reported this issue I got it in once out of every three or four runs but for the last few weeks I have not seen it. One comment to your comment about this code in Transaction::releaseDependencies; if (transaction->transactionId != state->transactionId) { Transaction *transaction = database->transactionManager->findTransaction(state->transactionId); ASSERT(transaction == NULL); } I do not agree that this provides a second chance for finding the transaction. Maybe it finds the transaction but given that it stores the transaction pointer in in a very local variable that is not used for anything than potentially asserting it does not provide much. Given that it no longer causes an assert I have no problem leaving the code in.
[8 Sep 2008 15:09]
Kevin Lewis
Woops, you are right. That is a local variable and good for nothing more than asserting. Now I do not see the need for this code.
[10 Oct 2008 12:59]
Olav Sandstå
Although the deadlock no longer happens we have decided to remove this debug code since it requires setting a lock and potentially might participate in future deadlocks.
[10 Oct 2008 18:43]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/56054 2862 Olav Sandstaa 2008-10-10 Fix for Bug#38748 Deadlock in Falcon when running simple SELECT queries Remove debug code that locks the transaction manager. This code have participated in deadlocks.
[14 Oct 2008 15:49]
Kevin Lewis
The latest patch is OK to push. It which removes a call to findTransaction() and asserts that something was found is not needed. if the dependent transaction that we have a pointer to was actually reused and has a different transactionId, we still have a pointer to it which must be set to NULL and that transaction would still have a dependency counter that would need to be decremented. There is no chance that a different transaction would be tracking this transactionID, so findTransaction si not needed. The whole if section can be deleted as it is in the patch.
[28 Oct 2008 8:10]
Bugs System
Pushed into 6.0.8-alpha (revid:olav@sun.com-20081010184244-t25j7c3r1r7uy8lk) (version source revid:olav@sun.com-20081015201310-1eprw26cyb9lrq8v) (pib:5)
[9 Jan 2009 13:58]
MC Brown
A note has been added to the 6.0.8 changelog: Running multiple SELECT operations on the same Falcon table could lead to an assertion within the Transaction::initialize. The same operation could also lead to a deadlock situation on the specified table.