Bug #34567 Falcon deadlock between ALTERs of temporary and non-temporary tables
Submitted: 14 Feb 2008 20:59 Modified: 9 Oct 2008 11:39
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S3 (Non-critical)
Version:6.0-falcon-team OS:Any
Assigned to: Christopher Powers CPU Architecture:Any
Triage: D2 (Serious) / R3 (Medium) / E3 (Medium)

[14 Feb 2008 20:59] Philip Stoev
Description:
Falcon deadlocks like this:

Stalled threads
  Thread 0xa9346838 (-1447453808) sleep=0, grant=0, locks=1, who 0, parent=(nil)
    Pending StorageDatabase::renameTable state 0 (1) syncObject 0xb737be30
  Thread 0xa93489a0 (-1449333872) sleep=1, grant=0, locks=1, who 0, parent=(nil)
    Pending Database::findTable state 0 (2) syncObject 0xb737bd28
Stalled synchronization objects:
  SyncObject b737be30: state -1, monitor 0, waiters 1
    Exclusive thread a93489a0 (-1449333872), type 2; Database::findTable
    Waiting thread a9346838 (-1447453808), type 1; StorageDatabase::renameTable
  SyncObject b737bd28: state -1, monitor 0, waiters 1
    Exclusive thread a9346838 (-1447453808), type 1; StorageDatabase::renameTable
    Waiting thread a93489a0 (-1449333872), type 2; Database::findTable

between those two threads:

#0  0x00110402 in __kernel_vsyscall ()
#1  0x00bdb5d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x0852bbb5 in Synchronize::sleep (this=0xa93489a0) at Synchronize.cpp:119
#3  0x08494c5e in SyncObject::wait (this=0xb737bd28, type=Shared, thread=0xa93489a0, sync=0xa99cbcb0, timeout=0) at SyncObject.cpp:406
#4  0x08494feb in SyncObject::lock (this=0xb737bd28, sync=0xa99cbcb0, type=Shared, timeout=0) at SyncObject.cpp:259
#5  0x08493fae in Sync::lock (this=0xa99cbcb0, type=Shared) at Sync.cpp:58
#6  0x084bfe7e in Database::findTable (this=0xb737b628, schema=0xb74ca998 "SYSTEM", name=0xb74caae0 "FORMATS") at Database.cpp:981
#7  0x084b8dc1 in Connection::findTable (this=0xb717bbb0, name=0xb74caae0 "FORMATS") at Connection.cpp:1160
#8  0x084b55ba in CompiledStatement::validate (this=0xb71ba1f0, connection=0xb717bbb0) at CompiledStatement.cpp:795
#9  0x084bfd36 in Database::getCompiledStatement (this=0xb737b628, connection=0xb717bbb0, sqlString=0x8839c98 "delete from Formats where tableId=?")
    at Database.cpp:1131
#10 0x084f18cd in PreparedStatement::setSqlString (this=0xb72711b8, sqlString=0x8839c98 "delete from Formats where tableId=?") at PreparedStatement.cpp:58
#11 0x084c1982 in Database::prepareStatement (this=0xb737b628, connection=0xb717bbb0, sqlStr=0x8839c98 "delete from Formats where tableId=?")
    at Database.cpp:1095
#12 0x084b99e4 in Connection::prepareStatement (this=0xb717bbb0, sqlString=0x8839c98 "delete from Formats where tableId=?") at Connection.cpp:231
#13 0x084bd1d6 in Database::prepareStatement (this=0xb737b628, sqlStr=0x8839c98 "delete from Formats where tableId=?") at Database.cpp:1058
#14 0x0849c1fb in Table::drop (this=0xa92e9968, transaction=0x0) at Table.cpp:1510
#15 0x084bf155 in Database::dropTable (this=0xb737b628, table=0xa92e9968, transaction=0x0) at Database.cpp:1415
#16 0x08527cc5 in Nfs::Statement::dropTable (this=0xb718c5a8, syntax=0xa92f6df0) at Statement.cpp:1135
#17 0x0852a248 in Nfs::Statement::executeDDL (this=0xb718c5a8) at Statement.cpp:823
#18 0x0852a571 in Nfs::Statement::execute (this=0xb718c5a8, sqlString=0xa99cc17c "drop table TMP.\"#SQL624B_221_1\"", isQuery=false) at Statement.cpp:2696
#19 0x0852a606 in Nfs::Statement::execute (this=0xb718c5a8, sqlString=0xa99cc17c "drop table TMP.\"#SQL624B_221_1\"") at Statement.cpp:161
#20 0x0848ce0f in StorageDatabase::deleteTable (this=0xb717a128, storageConnection=0xb7352ab0, tableShare=0xa92e8b10) at StorageDatabase.cpp:488
#21 0x0849197c in StorageTableShare::deleteTable (this=0xa92e8b10, storageConnection=0xb7352ab0) at StorageTableShare.cpp:138
#22 0x08490826 in StorageTable::deleteTable (this=0xb7522068) at StorageTable.cpp:81
#23 0x08486ae2 in StorageInterface::delete_table (this=0x96bcde0, tableName=0x9988308 "/build/mysql-6.0-falcon-team/mysql-test/var/tmp/#sql624b_221_1")
    at ha_falcon.cpp:949
#24 0x082d366f in rm_temporary_table (base=0x8e30030, path=0x9988308 "/build/mysql-6.0-falcon-team/mysql-test/var/tmp/#sql624b_221_1", frm_only=false)
    at sql_base.cc:5372
#25 0x082d37cf in close_temporary (table=0x9987a00, free_share=true, delete_table=true) at sql_base.cc:1926
#26 0x082d398f in close_temporary_table (thd=0xa9a098f0, table=0x9987a00, free_share=true, delete_table=true) at sql_base.cc:1898
#27 0x083aa144 in mysql_alter_table (thd=0xa9a098f0, new_db=0x96bcad8 "test", new_name=0x96bc8c0 "t1", create_info=0xa99cd890, table_list=0x96bc8e8,
    alter_info=0xa99cdd50, order_num=0, order=0x0, ignore=false) at sql_table.cc:6734
#28 0x0829f442 in mysql_execute_command (thd=0xa9a098f0) at sql_parse.cc:2393
#29 0x0829ff5c in mysql_parse (thd=0xa9a098f0, inBuf=0x96bc838 "ALTER TABLE t1 ENGINE = innodb", length=30, found_semicolon=0xa99ce2e4) at sql_parse.cc:5410
#30 0x082a0ed9 in dispatch_command (command=COM_QUERY, thd=0xa9a098f0, packet=0xa9a63551 "ALTER TABLE t1 ENGINE = innodb", packet_length=30)
    at sql_parse.cc:921
#31 0x082a1f73 in do_command (thd=0xa9a098f0) at sql_parse.cc:697
#32 0x08291b1a in handle_one_connection (arg=0xa9a098f0) at sql_connect.cc:1146
#33 0x00bd750b in start_thread () from /lib/libpthread.so.0
#34 0x00b18b2e in clone () from /lib/libc.so.6

and

#0  0x00110402 in __kernel_vsyscall ()
#1  0x00bdb5d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x0852bbb5 in Synchronize::sleep (this=0xa9346838) at Synchronize.cpp:119
#3  0x08494c5e in SyncObject::wait (this=0xb737be30, type=Exclusive, thread=0xa9346838, sync=0xa9b96ad4, timeout=0) at SyncObject.cpp:406
#4  0x08494feb in SyncObject::lock (this=0xb737be30, sync=0xa9b96ad4, type=Exclusive, timeout=0) at SyncObject.cpp:259
#5  0x08493fae in Sync::lock (this=0xa9b96ad4, type=Exclusive) at Sync.cpp:58
#6  0x0848b9a4 in StorageDatabase::renameTable (this=0xb717a128, storageConnection=0xb73518b0, table=0xb722aa08, tableName=0xa9349694 "#SQL2-624B-21C",
    schemaName=0xa934a7b4 "TEST") at StorageDatabase.cpp:690
#7  0x0849286d in StorageTableShare::renameTable (this=0xa92e5628, storageConnection=0xb73518b0, newName=0xa9b971db "./test/#sql2-624b-21c")
    at StorageTableShare.cpp:223
#8  0x0848634f in StorageInterface::rename_table (this=0x948e9a0, from=0xa9b973db "./test/inter1", to=0xa9b971db "./test/#sql2-624b-21c")
    at ha_falcon.cpp:1461
#9  0x0839e5d3 in mysql_rename_table (base=0x8e30030, old_db=0x953ff50 "test", old_name=0x953fd38 "inter1", new_db=0x953ff50 "test",
    new_name=0xa9b984cc "#sql2-624b-21c", flags=2) at sql_table.cc:3706
#10 0x083aa318 in mysql_alter_table (thd=0xa9ae8e20, new_db=0x953ff50 "test", new_name=0x953fd38 "inter1", create_info=0xa9b98890, table_list=0x953fd60,
    alter_info=0xa9b98d50, order_num=0, order=0x0, ignore=false) at sql_table.cc:6802
#11 0x0829f442 in mysql_execute_command (thd=0xa9ae8e20) at sql_parse.cc:2393
#12 0x0829ff5c in mysql_parse (thd=0xa9ae8e20, inBuf=0x953fca0 "ALTER TABLE inter1 ADD KEY k1 (t1_uuid)", length=39, found_semicolon=0xa9b992e4)
    at sql_parse.cc:5410
#13 0x082a0ed9 in dispatch_command (command=COM_QUERY, thd=0xa9ae8e20, packet=0xa9a36b21 "ALTER TABLE inter1 ADD KEY k1 (t1_uuid)", packet_length=39)
    at sql_parse.cc:921
#14 0x082a1f73 in do_command (thd=0xa9ae8e20) at sql_parse.cc:697
#15 0x08291b1a in handle_one_connection (arg=0xa9ae8e20) at sql_connect.cc:1146
#16 0x00bd750b in start_thread () from /lib/libpthread.so.0
#17 0x00b18b2e in clone () from /lib/libc.so.6

How to repeat:
Test case will follow shortly.
[14 Feb 2008 21:07] Philip Stoev
Test case for bug 34567

Attachment: bug34567.zip (application/x-zip-compressed, text), 727 bytes.

[14 Feb 2008 21:09] Philip Stoev
Please run the attached test case as follows:

1. Place the .txt file from the archive into mysql-test and the .test files into mysql-test/t.

2. Run:

perl ./mysql-test-run.pl --stress --stress-test-file=bug34567_run.txt --stress-test-duration=120 --stress-threads=5 --skip-ndb --mysqld=--falcon_debug_mask=65535 --mysqld=--log-output=file

And you will observe a hang within 10 seoconds. By default, the server logs to a table -- if the deadlock happens when logging is set to "TABLE", no new mysql connections are accepted and the entire server becomes unavailable. That is why this example uses --log-output=file so that you can still log into the server.
[15 Feb 2008 5:47] Kevin Lewis
After initial analysis, Chris suspects that this is due to recent changes in how syncSysConnection is locked.  This was for Bug 33634. Assigning to Chris.
[20 Feb 2008 21:17] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/42711

ChangeSet@1.2819, 2008-02-20 15:16:05-06:00, cpowers@xeno.mysql.com +5 -0
  Bug#34567 - Falcon deadlock between ALTERs of temporary and non-temporary tables
  
  Added Database::syncDDL to serialize DDL operations.
[3 Mar 2008 18:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/43331

ChangeSet@1.2586, 2008-03-03 12:40:17-06:00, cpowers@xeno.mysql.com +24 -0
  Bug#34567, "Falcon deadlock between ALTERs of temporary tables
  
  System QA stress tests caused deadlocks during DDL operations.
  Moved DDL locks out of low-level classes.
  Replaced Database::syncSysConnection with Database::syncSysDDL.
  Exclusively lock all DDL via Statement::executeDDL.
[4 Mar 2008 3:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/43349

ChangeSet@1.2587, 2008-03-03 21:17:42-06:00, cpowers@xeno.mysql.com +1 -0
  Bug#34567, "Falcon deadlock between ALTERS of temporary tables"
  
  Added debugging code to Statement::start.
[12 Mar 2008 23:02] Bugs System
Pushed into 6.0.4-alpha
[13 Mar 2008 15:58] Hakan Küçükyılmaz
Fix is in 6.0.5-alpha.
[9 Oct 2008 11:39] Jon Stephens
Documented in the 6.0.5 changelog as follows:

        Concurrent ALTER TABLE operations on temporary and non-temporary Falcon
        tables caused the server to hang.