Bug #39321 | Falcon deadlock between Table::retireRecords and Database::retireRecords | ||
---|---|---|---|
Submitted: | 8 Sep 2008 15:58 | Modified: | 9 Jan 2009 14:13 |
Reporter: | Philip Stoev | Status: | Closed |
Category: | MySQL Server: Falcon storage engine | Severity: | S1 (Critical) |
Version: | 6.0-falcon-team | OS: | Any |
Assigned to: | Kevin Lewis | CPU Architecture: | Any |
[8 Sep 2008 15:58]
Philip Stoev
[8 Sep 2008 16:00]
Philip Stoev
Stacks for bug 39321
Attachment: bug39321.stacks.txt (text/plain), 50.50 KiB.
[8 Sep 2008 19:02]
Kevin Lewis
This deadlock can happen when a Truncate command runs out of memory and has to call Database::forceRecordScavenge(). Any other thread that calls it at the same time can deadlock with the truncating thread, because the truncating thread locks Table::syncObject before Database::syncScavenge, whereas most other threads get Database::syncScavenge before Table::syncObject.

Thread 13
Database::truncateTable(4) (Table::syncObject) -> ... Table::allocRecord -> Database::forceRecordScavenge -> Database::retireRecords (Database::syncScavenge)

Thread 9
... Record::allocRecordData -> Database::forceRecordScavenge -> Database::retireRecords (Database::syncScavenge)
Table::retireRecords (Table::syncObject)

I think the solution is for Database::truncateTable to also lock Database::syncScavenge before it gets started. It is already locking these:

Database::truncateTable(1) Database::syncSysDDL
Database::truncateTable(2) Database::syncTables
Database::truncateTable(3) SerialLog::syncSections
Database::truncateTable(4) Table::syncObject
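The proposed ordering can be illustrated with a small model. This is only a sketch, not Falcon code: `ModelDatabase`, `tableSyncObject`, `retired`, and `runModel` are hypothetical names, and `std::recursive_mutex` stands in for Falcon's own sync objects on the assumption that a thread may re-acquire a lock it already holds. Both code paths take the scavenge lock before the table lock, so the two threads cannot block each other in opposite orders.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-ins for Database::syncScavenge and Table::syncObject.
// Recursive mutexes are an assumption of this sketch so that a nested
// scavenge can re-acquire locks already held by the same thread.
struct ModelDatabase {
    std::recursive_mutex syncScavenge;
    std::recursive_mutex tableSyncObject;
    int retired = 0;  // only modified while both locks are held

    // Scavenger order: syncScavenge first, then the table lock.
    void retireRecords() {
        std::lock_guard<std::recursive_mutex> scavenge(syncScavenge);
        std::lock_guard<std::recursive_mutex> table(tableSyncObject);
        ++retired;
    }

    // Proposed truncate order: take syncScavenge before the table lock,
    // so a forced scavenge from inside truncate keeps the same order.
    void truncateTable() {
        std::lock_guard<std::recursive_mutex> scavenge(syncScavenge);
        std::lock_guard<std::recursive_mutex> table(tableSyncObject);
        retireRecords();  // models an out-of-memory forced scavenge
    }
};

// Runs a truncating thread and a scavenging thread concurrently and
// returns the number of retireRecords passes that completed.
int runModel() {
    ModelDatabase db;
    std::thread truncater([&db] {
        for (int i = 0; i < 1000; ++i) db.truncateTable();
    });
    std::thread scavenger([&db] {
        for (int i = 0; i < 1000; ++i) db.retireRecords();
    });
    truncater.join();
    scavenger.join();
    return db.retired;
}
```

Calling `runModel()` from a `main` function completes with both threads joined. Swapping the two `lock_guard` lines in `truncateTable` would reproduce the inverted order described above and may hang.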
[11 Sep 2008 4:00]
Kevin Lewis
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/53743

2824 Kevin Lewis 2008-09-10
Bug#39321 Add an exclusive lock on Database::syncScavenge in Database::truncateTable before the lock of Table::syncObject just in case the truncateTable process has to call Database::forceRecordScavenge. syncScavenge must be locked before Table::syncObject because the scavenger does it that way. According to the Deadlock Detector, syncScavenge must also be locked before Database::syncTables.
[12 Sep 2008 16:57]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/53990

2819 Vladislav Vaintroub 2008-09-12
Bug#39321 - messages in recovery about exceptions from ReadFile. Ignore ERROR_HANDLE_EOF coming from ReadFile(). It is end of file, and the read should just return 0 as it does in the Posix case.
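A rough sketch of the behavior this commit message describes is shown here. It is not the committed patch: `readSome` and `FileHandle` are hypothetical names, and the Windows branch simply treats a `ReadFile()` failure with `ERROR_HANDLE_EOF` as a zero-byte read so that both platforms report end of file the same way.

```cpp
#ifdef _WIN32
#include <windows.h>
typedef HANDLE FileHandle;   // hypothetical alias for this sketch
#else
#include <unistd.h>
typedef int FileHandle;      // hypothetical alias for this sketch
#endif

// Hypothetical helper: returns the number of bytes read, 0 at end of
// file, or -1 on any other error.
long readSome(FileHandle file, void* buffer, unsigned int length) {
#ifdef _WIN32
    DWORD got = 0;
    if (!ReadFile(file, buffer, length, &got, NULL)) {
        // End of file is not an error: report it as a zero-byte read.
        if (GetLastError() == ERROR_HANDLE_EOF)
            return 0;
        return -1;
    }
    return static_cast<long>(got);
#else
    // POSIX read() already returns 0 at end of file.
    return static_cast<long>(read(file, buffer, length));
#endif
}
```

A caller can then use a single check, a return value of 0, to detect end of file on either platform.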
[30 Sep 2008 18:17]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/54804

2843 Kevin Lewis 2008-09-30
Bug#39321 Add an exclusive lock on Database::syncScavenge in Database::truncateTable before the lock of Table::syncObject just in case the truncateTable process has to call Database::forceRecordScavenge. syncScavenge must be locked before Table::syncObject because the scavenger does it that way. According to the Deadlock Predictor (SyncHandler.cpp), syncScavenge must also be locked before Database::syncTables.
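The idea of predicting deadlocks from lock order can be sketched as follows. This is not the code in SyncHandler.cpp: `LockOrderPredictor` and `record` are hypothetical names, and the sketch only flags a direct inversion between two locks, not longer cycles. Each call records the order in which one code path acquires its locks and reports any pair that an earlier path acquired in the opposite order.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical lock-order checker: remembers every "A before B" pair it
// has seen and reports pairs that were previously seen as "B before A".
class LockOrderPredictor {
public:
    typedef std::pair<std::string, std::string> LockPair;

    // `order` lists lock names in the order one code path acquires them.
    // Returns the pairs from this path that invert an earlier path.
    std::vector<LockPair> record(const std::vector<std::string>& order) {
        std::vector<LockPair> conflicts;
        for (std::size_t i = 0; i < order.size(); ++i) {
            for (std::size_t j = i + 1; j < order.size(); ++j) {
                if (order[i] == order[j])
                    continue;  // same lock taken again, not an ordering
                if (seen.count(std::make_pair(order[j], order[i])))
                    conflicts.push_back(std::make_pair(order[i], order[j]));
                seen.emplace(order[i], order[j]);
            }
        }
        return conflicts;
    }

private:
    std::set<LockPair> seen;
};
```

For example, after recording the scavenger's order (`Database::syncScavenge`, then `Table::syncObject`), recording a path that takes `Table::syncObject` and then `Database::syncScavenge` returns that pair as a conflict, while a path that takes `Database::syncScavenge`, `Database::syncTables`, and `Table::syncObject` in that order returns none.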
[30 Sep 2008 18:19]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/54805

2843 Kevin Lewis 2008-09-30
Bug#39321 Add an exclusive lock on Database::syncScavenge in Database::truncateTable before the lock of Table::syncObject just in case the truncateTable process has to call Database::forceRecordScavenge. syncScavenge must be locked before Table::syncObject because the scavenger does it that way. According to the Deadlock Predictor (SyncHandler.cpp), syncScavenge must also be locked before Database::syncTables.
[9 Jan 2009 14:13]
MC Brown
A note has been added to the 6.0.8 changelog:

When running TRUNCATE on a table where other threads are also trying to access the same Falcon table, a deadlock could occur between the two executing threads.