Bug #33634 Falcon queries starved for a long time
Submitted: 2 Jan 2008 19:48 Modified: 2 May 2008 1:44
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:6.0.4-p2 OS:Any
Assigned to: Christopher Powers CPU Architecture:Any

[2 Jan 2008 19:48] Philip Stoev
Description:
Running the systems test showed interesting readings in SHOW PROCESSLIST.

Some queries appear to take way too long to execute.

I will upload more information as it becomes available.

How to repeat:
Run systems test. More information to follow.
[2 Jan 2008 19:52] Philip Stoev
Processlist for bug #33634

Attachment: bug33634-processlist.txt (text/plain), 5.22 KiB.

[9 Jan 2008 11:33] Philip Stoev
falcon_debug_mask log for bug 33634

Attachment: bug33634.err.gz (application/x-gzip, text), 32.29 KiB.

[9 Jan 2008 11:34] Philip Stoev
I attached a falcon_debug_mask=65535 log from a test run which results in a similar processlist as described in this bug.
[9 Jan 2008 11:34] Philip Stoev
The errror log contains references to stalled threads and such. I have removed the numerous "commit transaction" messages in order to make the file size manageable.
[9 Jan 2008 12:00] Philip Stoev
Information_schema dump for bug 33634

Attachment: bug33634-i_s.dump.gz (application/x-gzip, text), 15.31 KiB.

[9 Jan 2008 12:11] Philip Stoev
Thread stacks for bug 33634

Attachment: bug33634-threads.txt (text/plain), 119.98 KiB.

[9 Jan 2008 12:11] Philip Stoev
Setting bug to verified -- this bug is reproducable and complete enviroment information has been created and preserved.
[9 Jan 2008 15:38] Philip Stoev
Instructions for running the systems test are available at

https://inside.mysql.com/wiki/SystemQATestPlan

a clone of the mysql-test-extra-6.0 tree is required. In order to tailor the tests for Falcon, please edit the iuds2.tst script and replace all references to innodb and myisam with falcon. Then please edit start_server() in the run_systest script to run mysqld with falcon settings:

--mysqld="--falcon_record_memory_max=2GB"
--mysqld="--max_connections=500"
--mysqld="--falcon-lock-wait-timeout=1"
--mysqld=--skip-innodb
--mysqld=--log-output=TABLE
--mysqld=--falcon_debug_mask=65535

The behavoir described in this bug has been observed with both iuds2 and iuds6 tests, however iuds2 is the simpler one. While those tests can possibly be run against a Windows server, the host executing the queries needs to be Linux, so a Linux-only execution environment is recommended.

In the process list, the unit of time is seconds, meaning that the problematic queries in quesion have been left behind for almost two hours. I do not think a different value to falcon_wait_lock_timeout would help, since those queries have clearly passed any timeout threshold.
[25 Jan 2008 0:04] Christopher Powers
Recreated on xeno.mysql.com. Full bt for all threads.

Attachment: bug33634_gdb1.txt.gz (application/x-gzip, text), 48.21 KiB.

[25 Jan 2008 1:44] Christopher Powers
Same backtrace with deadlocked threads annotated.

Attachment: bug33634_backtrace.txt.gz (application/x-gzip, text), 48.33 KiB.

[25 Jan 2008 1:50] Christopher Powers
The backtrace indicates that most threads are stuck in the server in open_tables() or close_open_tables(). However, there is a deadlock in Falcon between threads 1293 and 1302, which are closing temporary tables.

Thread 1293 has syncTables(exclusive) from Database::dropTable, and is waiting for syncStatements(exclusive) in Database::invalidateCompiledStatements.

Thread 1302 has syncStatements(shared) from Database::getCompiledStatement, and is waiting for syncTables(shared).

Thread 15, the scavenger thread, is also waiting on a shared lock for syncTables.
[26 Jan 2008 22:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/41284

ChangeSet@1.2779, 2008-01-26 16:52:25-06:00, cpowers@xeno.mysql.com +4 -0
  Bug#33634, Falcon queries starved for long time.
  - Moved invalidateCompiledStatements out of syncTables exclusive lock in dropTable to prevent deadlock.
  - Added assertions to check for corrupt savepoint and table share linked lists.
[30 Jan 2008 5:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/41400

ChangeSet@1.2786, 2008-01-29 23:33:38-06:00, cpowers@xeno.mysql.com +1 -0
  Bug#33634, Falcon queries starved for a long time
  - Repositioned the shared lock in StorageHandler::getStorageConnection.
[10 Feb 2008 19:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/41995

ChangeSet@1.2810, 2008-02-10 13:39:01-06:00, cpowers@xeno.mysql.com +6 -0
  Bug#33634, Falcon queries are starved for a long time
  
  Until Falcon supports multiple system transactions, concurrent access to the single system
  transaction must be serialized at some points, primarily within create and delete table.  
  - Added sync object to protect transaction savepoint lists.
  - Serialized system transaction access with an exclusive lock on syncSysConnection
    in Statement::createTable and Table::dropTable.
  - Enabled oldesActiveSavepointId in Record::scavenge().
  
  Note that care must be taken to ensure that Database::syncTables is acquired before Database::syncSysConnection.
[11 Feb 2008 2:16] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/42014

ChangeSet@1.2811, 2008-02-10 20:15:36-06:00, cpowers@xeno.mysql.com +1 -0
  Bug#33634, Falcon queries starved for a long time
  Disabled assert in SRLUpdateRecords::chill.
[11 Feb 2008 20:46] Kevin Lewis
Patch is in mysql-6.0-falcon-team and mysql-6.0-release version 6.0.4
[12 Mar 2008 23:03] Bugs System
Pushed into 6.0.4-alpha
[2 May 2008 1:44] Paul DuBois
Noted in 6.0.4 changelog.

Contention between concurrent Falcon transactions could cause some
queries to be starved for a long time.