Bug #33634 | Falcon queries starved for a long time | ||
---|---|---|---|
Submitted: | 2 Jan 2008 19:48 | Modified: | 2 May 2008 1:44 |
Reporter: | Philip Stoev | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S2 (Serious) |
Version: | 6.0.4-p2 | OS: | Any |
Assigned to: | Christopher Powers | CPU Architecture: | Any |
[2 Jan 2008 19:48]
Philip Stoev
[2 Jan 2008 19:52]
Philip Stoev
Processlist for bug #33634
Attachment: bug33634-processlist.txt (text/plain), 5.22 KiB.
[9 Jan 2008 11:33]
Philip Stoev
falcon_debug_mask log for bug 33634
Attachment: bug33634.err.gz (application/x-gzip, text), 32.29 KiB.
[9 Jan 2008 11:34]
Philip Stoev
I attached a falcon_debug_mask=65535 log from a test run which results in a similar processlist as described in this bug.
[9 Jan 2008 11:34]
Philip Stoev
The errror log contains references to stalled threads and such. I have removed the numerous "commit transaction" messages in order to make the file size manageable.
[9 Jan 2008 12:00]
Philip Stoev
Information_schema dump for bug 33634
Attachment: bug33634-i_s.dump.gz (application/x-gzip, text), 15.31 KiB.
[9 Jan 2008 12:11]
Philip Stoev
Thread stacks for bug 33634
Attachment: bug33634-threads.txt (text/plain), 119.98 KiB.
[9 Jan 2008 12:11]
Philip Stoev
Setting bug to verified -- this bug is reproducable and complete enviroment information has been created and preserved.
[9 Jan 2008 15:38]
Philip Stoev
Instructions for running the systems test are available at https://inside.mysql.com/wiki/SystemQATestPlan a clone of the mysql-test-extra-6.0 tree is required. In order to tailor the tests for Falcon, please edit the iuds2.tst script and replace all references to innodb and myisam with falcon. Then please edit start_server() in the run_systest script to run mysqld with falcon settings: --mysqld="--falcon_record_memory_max=2GB" --mysqld="--max_connections=500" --mysqld="--falcon-lock-wait-timeout=1" --mysqld=--skip-innodb --mysqld=--log-output=TABLE --mysqld=--falcon_debug_mask=65535 The behavoir described in this bug has been observed with both iuds2 and iuds6 tests, however iuds2 is the simpler one. While those tests can possibly be run against a Windows server, the host executing the queries needs to be Linux, so a Linux-only execution environment is recommended. In the process list, the unit of time is seconds, meaning that the problematic queries in quesion have been left behind for almost two hours. I do not think a different value to falcon_wait_lock_timeout would help, since those queries have clearly passed any timeout threshold.
[25 Jan 2008 0:04]
Christopher Powers
Recreated on xeno.mysql.com. Full bt for all threads.
Attachment: bug33634_gdb1.txt.gz (application/x-gzip, text), 48.21 KiB.
[25 Jan 2008 1:44]
Christopher Powers
Same backtrace with deadlocked threads annotated.
Attachment: bug33634_backtrace.txt.gz (application/x-gzip, text), 48.33 KiB.
[25 Jan 2008 1:50]
Christopher Powers
The backtrace indicates that most threads are stuck in the server in open_tables() or close_open_tables(). However, there is a deadlock in Falcon between threads 1293 and 1302, which are closing temporary tables. Thread 1293 has syncTables(exclusive) from Database::dropTable, and is waiting for syncStatements(exclusive) in Database::invalidateCompiledStatements. Thread 1302 has syncStatements(shared) from Database::getCompiledStatement, and is waiting for syncTables(shared). Thread 15, the scavenger thread, is also waiting on a shared lock for syncTables.
[26 Jan 2008 22:53]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/41284 ChangeSet@1.2779, 2008-01-26 16:52:25-06:00, cpowers@xeno.mysql.com +4 -0 Bug#33634, Falcon queries starved for long time. - Moved invalidateCompiledStatements out of syncTables exclusive lock in dropTable to prevent deadlock. - Added assertions to check for corrupt savepoint and table share linked lists.
[30 Jan 2008 5:34]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/41400 ChangeSet@1.2786, 2008-01-29 23:33:38-06:00, cpowers@xeno.mysql.com +1 -0 Bug#33634, Falcon queries starved for a long time - Repositioned the shared lock in StorageHandler::getStorageConnection.
[10 Feb 2008 19:40]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/41995 ChangeSet@1.2810, 2008-02-10 13:39:01-06:00, cpowers@xeno.mysql.com +6 -0 Bug#33634, Falcon queries are starved for a long time Until Falcon supports multiple system transactions, concurrent access to the single system transaction must be serialized at some points, primarily within create and delete table. - Added sync object to protect transaction savepoint lists. - Serialized system transaction access with an exclusive lock on syncSysConnection in Statement::createTable and Table::dropTable. - Enabled oldesActiveSavepointId in Record::scavenge(). Note that care must be taken to ensure that Database::syncTables is acquired before Database::syncSysConnection.
[11 Feb 2008 2:16]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/42014 ChangeSet@1.2811, 2008-02-10 20:15:36-06:00, cpowers@xeno.mysql.com +1 -0 Bug#33634, Falcon queries starved for a long time Disabled assert in SRLUpdateRecords::chill.
[11 Feb 2008 20:46]
Kevin Lewis
Patch is in mysql-6.0-falcon-team and mysql-6.0-release version 6.0.4
[12 Mar 2008 23:03]
Bugs System
Pushed into 6.0.4-alpha
[2 May 2008 1:44]
Paul DuBois
Noted in 6.0.4 changelog. Contention between concurrent Falcon transactions could cause some queries to be starved for a long time.