Bug #43213 falcon_limit test reports deadlock when locking indexes
Submitted: 26 Feb 9:45 Modified: 26 Feb 9:49
Reporter: Olav Sandstaa
Status: Verified
Category:Server: Falcon Severity:S2 (Serious)
Version:6.0.10-alpha OS:Any
Assigned to: Christopher Powers Target Version:6.0-beta
Tags: test failure, pb2, F_ONLINE ALTER
Triage: Triaged: D1 (Critical)

[26 Feb 9:45] Olav Sandstaa
Description:
falcon_limit test has reported the following error a few times:

  10 stalled queries detected, declaring deadlock at DSN dbi:mysql:host=127.0.0.1:port=19306:user=root:database=test.

The stack dumps from the core files show many threads waiting (mostly in server
code), and I see only two user threads that are currently running Falcon code. Both of
these have a call stack like this:

Thread 11 (process 9107):
#0  0x00479402 in __kernel_vsyscall ()
#1  0x0089e256 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x0863a001 in Synchronize::sleep (this=0xb70f3da0) at Synchronize.cpp:123
#3  0x0856d0d4 in SyncObject::wait (this=0xb710ce00, type=Exclusive,   
thread=0xb70f3da0, sync=0x0, timeout=0) at SyncObject.cpp:682
#4  0x0856d5f3 in SyncObject::lock (this=0xb710ce00, sync=0x0, type=Exclusive,   
timeout=0) at SyncObject.cpp:449
#5  0x08565dbd in StorageTableShare::lockIndexes (this=0xb71069c0,    exclusiveLock=true)
at StorageTableShare.cpp:182
#6  0x085508ca in StorageInterface::index_init (this=0xae9b7f8, idx=4,    sorted=true) at
ha_falcon.cpp:1506
#7  0x081e0a8f in handler::ha_index_init (this=0xae9b7f8, idx=4, sorted=true)
   at handler.h:1546
#8  0x0833e5f2 in join_read_first (tab=0xabd15b8) at sql_select.cc:17056
#9  0x083411f9 in sub_select (join=0xac4e268, join_tab=0xabd15b8,   
end_of_records=false) at sql_select.cc:16222
#10 0x0834d82b in do_select (join=0xac4e268, fields=0xaba8920, table=0x0,   
procedure=0x0) at sql_select.cc:15786
#11 0x0836825a in JOIN::exec (this=0xac4e268) at sql_select.cc:2881
#12 0x08362e03 in mysql_select (thd=0xaba7558, rref_pointer_array=0xaba8990,   
tables=0xaba2b40, wild_num=1, fields=@0xaba8920, conds=0xaba3a50,    og_num=1,
order=0xaba3c28, group=0x0, having=0x0, proc_param=0x0,    select_options=2148289024,
result=0xaba3cc0, unit=0xaba85f4,    select_lex=0xaba888c) at sql_select.cc:3062
#13 0x0836856a in handle_select (thd=0xaba7558, lex=0xaba8598,    result=0xaba3cc0,
setup_tables_done_option=0) at sql_select.cc:314
#14 0x082cd2a1 in execute_sqlcom_select (thd=0xaba7558, all_tables=0xaba2b40)
   at sql_parse.cc:4757
#15 0x082ce2e4 in mysql_execute_command (thd=0xaba7558) at sql_parse.cc:2063
#16 0x082d6ab7 in mysql_parse (thd=0xaba7558,    inBuf=0xaba2910 "SELECT * FROM E AS X
LEFT JOIN E AS Y ON ( X . `date_key` = Y . `int_key` ) WHERE X . `int_key` < ' w ' ORDER
BY X . `datetime_key` LIMIT 8", length=139, found_semicolon=0x97859e80) at
sql_parse.cc:5752
#17 0x082d7a5b in dispatch_command (command=COM_QUERY, thd=0xaba7558,    packet=0xab96a51
"", packet_length=139) at sql_parse.cc:1009
#18 0x082d8d26 in do_command (thd=0xaba7558) at sql_parse.cc:691
#19 0x082c5e97 in handle_one_connection (arg=0xaba7558) at sql_connect.cc:1146
#20 0x0089a45b in start_thread () from /lib/libpthread.so.0
#21 0x007f1c4e in clone () from /lib/libc.so.6

Both these threads are trying to get an exclusive lock on the
StorageTableShare::syncIndexMap sync object in StorageTableShare::lockIndexes.

How to repeat:
This has happened a few times when running the falcon_limit test.

Suggested fix:
Falcon should not end up in a deadlock situation like this.

[17 Mar 3:52] Kevin Lewis
I tagged this as ONLINE ALTER because the two threads are waiting in Falcon at:
void StorageTableShare::lockIndexes(bool exclusiveLock)
{
	syncIndexMap->lock(NULL, (exclusiveLock) ? Exclusive : Shared);
}

This code was added as part of the Online Alter changes.
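
To illustrate the kind of wait seen in the stack trace, here is a minimal sketch. It is not Falcon code and not a diagnosis of this bug. It assumes the usual reader/writer semantics for the sync object and assumes that timeout=0 in the trace means "wait forever". TableShareSketch, countStalledWaiters and the 200 ms timeout are hypothetical names and values chosen for the example, and std::shared_timed_mutex stands in for SyncObject. The sketch shows that two threads asking for an exclusive lock on the same object only stay blocked if some other holder (here a shared lock that is never released) still owns it, and that a bounded wait turns the hang into a countable failure.

```cpp
#include <atomic>
#include <chrono>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for StorageTableShare; only the index-map lock is modeled.
class TableShareSketch {
public:
    // Like lockIndexes(), but gives up after `timeout` instead of waiting forever.
    bool lockIndexes(bool exclusiveLock, std::chrono::milliseconds timeout) {
        return exclusiveLock ? syncIndexMap.try_lock_for(timeout)
                             : syncIndexMap.try_lock_shared_for(timeout);
    }

    void unlockIndexes(bool exclusiveLock) {
        if (exclusiveLock)
            syncIndexMap.unlock();
        else
            syncIndexMap.unlock_shared();
    }

private:
    std::shared_timed_mutex syncIndexMap;  // stands in for the SyncObject
};

// Starts `waiters` threads that each request the exclusive lock.
// If `leakSharedLock` is true, the calling thread keeps a shared lock for the
// whole time, so no exclusive request can succeed.
// Returns the number of exclusive requests that timed out, or -1 on setup failure.
int countStalledWaiters(bool leakSharedLock, int waiters) {
    TableShareSketch share;
    const auto timeout = std::chrono::milliseconds(200);

    // The calling thread plays the role of an earlier shared-lock holder.
    if (!share.lockIndexes(false, timeout))
        return -1;
    if (!leakSharedLock)
        share.unlockIndexes(false);

    std::atomic<int> stalled{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < waiters; ++i) {
        threads.emplace_back([&share, &stalled, timeout] {
            if (share.lockIndexes(true, timeout))
                share.unlockIndexes(true);  // got the lock, release it again
            else
                ++stalled;                  // would have hung with an unbounded wait
        });
    }
    for (auto& t : threads)
        t.join();

    // Release the "leaked" shared lock from the thread that took it.
    if (leakSharedLock)
        share.unlockIndexes(false);
    return stalled.load();
}
```

Calling countStalledWaiters(true, 2) models the reported situation: both exclusive requests time out because the shared lock is never released. With an unbounded wait, as in the trace, the same two threads would simply never return from lockIndexes.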