MySQL Bugs: #37251: Livelock between UPDATE and DELETE threads

Bug #37251	Livelock between UPDATE and DELETE threads
Submitted:	6 Jun 2008 19:26	Modified:	8 Jan 2009 10:34
Reporter:	Philip Stoev	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Falcon storage engine	Severity:	S2 (Serious)
Version:	6.0-falcon-team	OS:	Any
Assigned to:	Vladislav Vaintroub	CPU Architecture:	Any

Description:
When running a concurrent scenario involving transactions, each executing a small number of DELETE and UPDATE operations on a small number of records, the following situation was observed:

* SHOW PROCESSLIST reports all queries are in a state "init".
* The livelock appears to spontaneouly resolve when falcon_lock_wait_timeout expires(?);
* Other queries, including ones on the same table, can proceed;
* top reports that 2 CPU cores are 100% occupied;
* dumping the stalled() threads via gdb reports:

Stalled threads
  Thread 0xb70a8988 (-1450382448) sleep=1, grant=0, locks=2, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa7d33174
  Thread 0xa94cb560 (-1446794352) sleep=1, grant=0, locks=2, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa7e85254
  Thread 0xa9375e20 (-1445790832) sleep=1, grant=0, locks=2, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa80e47ec
  Thread 0xa94c0430 (-1448686704) sleep=1, grant=0, locks=2, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa7e33434
  Thread 0xa94c2f60 (-1448084592) sleep=1, grant=0, locks=2, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa7e6afc4
  Thread 0xa94c2228 (-1448285296) sleep=1, grant=0, locks=2, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa7ea19f4
  Thread 0xa94c8228 (-1446593648) sleep=1, grant=0, locks=2, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa7d5be5c
  Thread 0xa94cb2b0 (-1448486000) sleep=1, grant=0, locks=1, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa7ea23bc
  Thread 0xa94ccdd0 (-1446392944) sleep=1, grant=0, locks=2, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa7ea19f4
  Thread 0xa94c0168 (-1446192240) sleep=1, grant=0, locks=2, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa7ea23bc
  Thread 0xa94c0020 (-1450783856) sleep=1, grant=0, locks=2, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa7ea19f4
  Thread 0xa94ca520 (-1445991536) sleep=1, grant=0, locks=2, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa7ea19f4
  Thread 0xa94c3ef0 (-1448887408) sleep=1, grant=0, locks=2, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa7e33434
  Thread 0xa94c3728 (-1450181744) sleep=1, grant=0, locks=2, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0xa7d33174
Stalled synchronization objects:
  SyncObject a7d33174: state -1, monitor 0, waiters 2
    Exclusive thread a94cb560 (-1446794352), type 2; Transaction::waitForTransaction
    Waiting thread a94c3728 (-1450181744), type 2; Transaction::waitForTransaction
    Waiting thread b70a8988 (-1450382448), type 2; Transaction::waitForTransaction
  SyncObject a7e85254: state -1, monitor 0, waiters 1
    Exclusive thread a9375e20 (-1445790832), type 2; Transaction::waitForTransaction
    Waiting thread a94cb560 (-1446794352), type 2; Transaction::waitForTransaction
  SyncObject a80e47ec: state -1, monitor 0, waiters 1
    Exclusive thread a94c0430 (-1448686704), type 2; Transaction::waitForTransaction
    Waiting thread a9375e20 (-1445790832), type 2; Transaction::waitForTransaction
  SyncObject a7e33434: state -1, monitor 0, waiters 2
    Exclusive thread a94c2f60 (-1448084592), type 2; Transaction::waitForTransaction
    Waiting thread a94c0430 (-1448686704), type 2; Transaction::waitForTransaction
    Waiting thread a94c3ef0 (-1448887408), type 2; Transaction::waitForTransaction
  SyncObject a7e6afc4: state -1, monitor 0, waiters 1
    Exclusive thread a94c2228 (-1448285296), type 2; Transaction::waitForTransaction
    Waiting thread a94c2f60 (-1448084592), type 2; Transaction::waitForTransaction
  SyncObject a7ea19f4: state -1, monitor 0, waiters 4
    Exclusive thread a94c8228 (-1446593648), type 2; Transaction::waitForTransaction
    Waiting thread a94c0020 (-1450783856), type 2; Transaction::waitForTransaction
    Waiting thread a94c2228 (-1448285296), type 2; Transaction::waitForTransaction
    Waiting thread a94ca520 (-1445991536), type 2; Transaction::waitForTransaction
    Waiting thread a94ccdd0 (-1446392944), type 2; Transaction::waitForTransaction
  SyncObject a7d5be5c: state -1, monitor 0, waiters 1
    Exclusive thread a94cb2b0 (-1448486000), type 2; Transaction::waitForTransaction
    Waiting thread a94c8228 (-1446593648), type 2; Transaction::waitForTransaction
  SyncObject a7ea23bc: state -1, monitor 0, waiters 2
    Exclusive thread a94ccdd0 (-1446392944), type 2; Transaction::waitForTransaction
    Waiting thread a94c0168 (-1446192240), type 2; Transaction::waitForTransaction
    Waiting thread a94cb2b0 (-1448486000), type 2; Transaction::waitForTransaction

* gdb reports that two threads are looping in Transaction::getRelativeState, in this loop:

870                     for (Transaction *trans = transaction->waitingFor; trans; trans = trans->waitingFor)
871                             if (trans == this)
872                                     return Deadlock;

However it is impossible to examine trans or step through the loop. Gdb refuses to move on from line 871.

How to repeat:
A core file is available if needed.

* The backtraces are:

#0  0x08419d25 in Transaction::getRelativeState (this=0xa7da4b78, record=0xa903a0f8, flags=<value optimized out>) at Transaction.cpp:870
#1  0x0840e95e in Table::fetchForUpdate (this=0xb74049a8, transaction=0xa7da4b78, source=0xa903a0f8, usingIndex=true) at Table.cpp:3474
#2  0x083fbdf6 in StorageDatabase::nextIndexed (this=0xb709a118, storageTable=0xb62057b8, recordBitmap=0xa7e0a860, recordNumber=7042, lockForUpdate=true)
    at StorageDatabase.cpp:426
#3  0x08400cd4 in StorageTable::nextIndexed (this=0x1, recordNumber=7041, lockForUpdate=<value optimized out>) at StorageTable.cpp:182
#4  0x083f6616 in StorageInterface::index_next (this=0xa834da8, buf=0xa834f70 "Щ") at ha_falcon.cpp:1581
#5  0x083f37bf in StorageInterface::index_next_same (this=0xa834da8, buf=0xa834f70 "Щ", key=0xa5c16a8 "", key_len=5) at ha_falcon.cpp:1634
#6  0x08349d18 in handler::read_range_next (this=0xa834da8) at handler.cc:4943
#7  0x083463f4 in handler::multi_range_read_next (this=0xa834da8, range_info=0xa9839abc) at handler.cc:4241
#8  0x08329cc9 in QUICK_RANGE_SELECT::get_next (this=0xa5b76d0) at opt_range.cc:8508
#9  0x083424bd in rr_quick (info=0xa9839d44) at records.cc:298
#10 0x08344872 in filesort (thd=0xa9b3c9f8, table=0xa647820, sortorder=0xa5b41e8, s_length=1, select=0xa5b4108, max_rows=18446744073709551615,
    sort_positions=true, examined_rows=0xa983a190) at filesort.cc:520
#11 0x082eed83 in mysql_delete (thd=0xa9b3c9f8, table_list=0xa5b3bc8, conds=0xa5b3ec8, order=0xa9b3de68, limit=2, options=<value optimized out>,
    reset_auto_increment=false) at sql_delete.cc:222
#12 0x08267b8a in mysql_execute_command (thd=0xa9b3c9f8) at sql_parse.cc:3082
#13 0x0826d087 in mysql_parse (thd=0xa9b3c9f8, inBuf=0xa5b3ae0 "DELETE FROM B WHERE int_key = 52 ORDER BY RAND ( ) LIMIT 2", length=58,
    found_semicolon=0xa983b31c) at sql_parse.cc:5782
#14 0x0826db3f in dispatch_command (command=COM_QUERY, thd=0xa9b3c9f8, packet=0xa9b3e3b9 "DELETE FROM B WHERE int_key = 52 ORDER BY RAND ( ) LIMIT 2",
    packet_length=58) at sql_parse.cc:1059
#15 0x0826e930 in do_command (thd=0xa9b3c9f8) at sql_parse.cc:732
#16 0x0825f8bd in handle_one_connection (arg=0xa9b3c9f8) at sql_connect.cc:1134
#17 0x0057d32f in start_thread () from /lib/libpthread.so.0
#18 0x0049a27e in clone () from /lib/libc.so.6

#0  0x08419d25 in Transaction::getRelativeState (this=0xa820f448, record=0xa9039eb0, flags=<value optimized out>) at Transaction.cpp:870
#1  0x0840e95e in Table::fetchForUpdate (this=0xb7499268, transaction=0xa820f448, source=0xa9039eb0, usingIndex=false) at Table.cpp:3474
#2  0x083fc16b in StorageDatabase::nextRow (this=0xb709a118, storageTable=0xb61d8bb8, recordNumber=154, lockForUpdate=true) at StorageDatabase.cpp:284
#3  0x08400d0e in StorageTable::next (this=0xb61d8bb8, recordNumber=137, lockForUpdate=<value optimized out>) at StorageTable.cpp:169
#4  0x083f6aa7 in StorageInterface::rnd_next (this=0xa842458, buf=0xa842620 "Ч") at ha_falcon.cpp:578
#5  0x08344795 in filesort (thd=0xa9b592b8, table=0xa589160, sortorder=0xa86e6e8, s_length=1, select=0xa86e608, max_rows=2, sort_positions=true,
    examined_rows=0xa989c188) at filesort.cc:540
#6  0x082e97f1 in mysql_update (thd=0xa9b592b8, table_list=0xa86d190, fields=@0xa9b5a6d0, values=@0xa9b5a8fc, conds=0xa86e278, order_num=1, order=0xa86e3f0,
    limit=2, handle_duplicates=DUP_ERROR, ignore=false) at sql_update.cc:417
#7  0x0826790a in mysql_execute_command (thd=0xa9b592b8) at sql_parse.cc:2844
#8  0x0826d087 in mysql_parse (thd=0xa9b592b8,
    inBuf=0xa86cfb8 "UPDATE C AS X SET int_nokey = ( SELECT pk FROM E AS X WHERE X . int_key = 16 LIMIT 1) , filler = REPEAT ( ' X ' , 81 * 512 ) WHERE X . int_nokey = 42 ORDER BY RAND ( ) LIMIT 2", length=175, found_semicolon=0xa989d31c) at sql_parse.cc:5782
#9  0x0826db3f in dispatch_command (command=COM_QUERY, thd=0xa9b592b8,
    packet=0xa9b5b329 "UPDATE C AS X SET int_nokey = ( SELECT pk FROM E AS X WHERE X . int_key = 16 LIMIT 1) , filler = REPEAT ( ' X ' , 81 * 512 ) WHERE X . int_nokey = 42 ORDER BY RAND ( ) LIMIT 2", packet_length=175) at sql_parse.cc:1059
#10 0x0826e930 in do_command (thd=0xa9b592b8) at sql_parse.cc:732
#11 0x0825f8bd in handle_one_connection (arg=0xa9b592b8) at sql_connect.cc:1134
#12 0x0057d32f in start_thread () from /lib/libpthread.so.0
#13 0x0049a27e in clone () from /lib/libc.so.6

The description suggest a cycle in the Transaction->waitingFor list

Vlad, I think you may be correct here.  This reminds me of the syncUnique you added to Table::insertIndexes().  The check, then add needs to be done serially.  So maybe we should add a syncDeadlock before the loop looking for a deadlock and release it after 
   waitingFor = transaction;

I think the major issue is not that this deadlock is resolved with a timeout rather than with the deadlock detector.

What bothers me is that there are two cores occupied at 100% doing nothing useful for 50 seconds. This a DoS and is bad for the environment.

Setting to Verified since it appears that the main cause is understood. I can still package a repeatable test case if needed -- just let me know.

I think this is another case, like the unique index insert that is best
handled by creating the dependency first, then checking.

Ann, not sure. Certainly that would not be the simplest thing possible.
Yes, there are algorithms for determining cycle in a single linked list ( like "Tortoise and hare" is described here http://en.wikipedia.org/wiki/Cycle_detection ), but dont know about the variation of the algorithm that will work with the list growing of shrinking dynamically while we're checking. 
I'm also afraid that in absense of synchronization, the list traversal might not be OK. I think of the case where current thread sets this->waitingFor to NULL, while concurrent thread got the old value this->waitingFor and goes on with traversal. Are all entries in this list still valid? May there be transactions there that no longer exist, pointers to freed memory and such?

Vlad, I think by moving the search after the link, then one of the threads will find the deadlock.  The problem you described so well happens because threads look for a link to themselves before adding themselves to the linkage.  The problem only occurs when both threads look then link.  Now if they both link then look, one of them will find the deadlock.  What do you think of the following change?

-	for (Transaction *trans = transaction->waitingFor; trans; trans = trans->waitingFor)
-		if (trans == this)
-			return true;

	// OK, add a reference to the transaction to keep the object around,
	// then wait for it to go away

	waitingFor = transaction;
	transaction->addRef();

+	// But if waiting would cause a deadlock, don't try it
+
+	for (Transaction *trans = transaction->waitingFor; trans; trans = trans->waitingFor)
+		if (trans == this)
+			{
+			waitingFor = NULL;
+			transaction->release();
+			return true;
+			}

Kevin , I'm *still* not 100% convinced.
Say one transaction has determined a deadlock like described above, sets its waitingFor to NULL. 
A concurrent thread doing the same loop will not necessarily know about that, because it has read the old value of  this->waitingFor some nanoseconds ago and continues with traversal. 

This way it is possible that concurrent thread will access transaction->waitingFor member , after transaction->release() happens.
What I'm most afraid of, concurrent thread might do it even when transaction variable goes competely out of scope and its memory is released.

Would this be possible or not (if not why not)?

It would not be possible for one of the transactions to be completely freed because there is a shared lock on transactionManager->activeTransactions.syncObject.  

It is possible, however, that instead of neither thread noticing the deadlock and waiting on each other for 50 seconds, that both of them will notice, if the timing is just right.  But this is a small price to pay for not adding a new sync object that is locked exclusively every time one transaction waits on another.

I just had a situation where there is only a single client, and it is blocked (not CPU spinning) in waitForTransaction().

Here is the output from SyncObject::stalled():

Stalled threads
  Thread 0x77e95e8 (1264421184) sleep=1, grant=0, locks=1, who 0, parent=(nil)
    Pending Transaction::waitForTransaction state 0 (2) syncObject 0x2aaaaefac488
Stalled synchronization objects:
  SyncObject 2aaaaefac488: state 0, monitor 0, waiters 0

This situation is of a lesser concern because it will eventually time out and the CPU is not being occupied.

Kevin,
>It would not be possible for one of the transactions to be completely freed because there is a shared lock on transactionManager->activeTransactions.syncObject.

Agreed about Transaction::waitForTransaction(TransId).
Unfortunately, there is no such lock at the moment within Transaction::getRelativeState(), that contains identical deadock detection loop.

Previous comment by Philip suggests that we're waiting for a transaction that does no more exist. I asked Philip specifically about the callstack, he says it happens in getRelativeState.

We probably also should introduce activeTransaction lock there or is there anything else that guarantees only active transaction are present in the waitFor list?

Adding related Bug#37049. 
The callstack attached to it contains getRelativeState() and it has, as it seems invalid transaction object in it.

Vlad, very good find! I think that Transaction::getRelativeState() must also be protected aainst disapearing transactions while it is looking though them.  Sending you by email a possible change to Transaction::getRelativeState and Transaction::waitForTransaction.  Please create a patch with it if you like it.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/47732

2699 Vladislav Vaintroub	2008-06-11
      Bug#37251 - Livelock between UPDATE and DELETE threads
      
      The problem was a bug in deadlock detection code.
      2 threads with mutual dependencies doing deadlock at exactly 
      same time both decide there is no deadlock, add themselves 
      to waitFor list, thus building a cycle in it. Both threads
      will wait for each other until lock wait timeout happens
      (a deadlock). Meanwhile, any other thread that traverses 
      waitFor list may enter the cycle described above and will 
      loop until deadlock is resolved.
      
      Solution here is to modify deadlock detection to add dependency 
      first, than check, then remove dependency if cycle is found. 
      
      Care is taken to prevent compiler optimizations and hopefully
      the combination of volatile specifier and COMPARE_EXHANGE will
      generate a memory barrier to prevent reading elements while 
      they are modified. 
      
      Note, that instead of this fancy solution, plain old SyncObject 
      could be used.  The loop is short, the lock is cheap, 
      but I can't estimate contention for the moment. If current lock-free 
      implementation  turns out to be buggy, this alternative can be 
      considered.
      
      No testcase provided, the situation described in bug report
      can not be reliably reproduced.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/47917

2704 Vladislav Vaintroub	2008-06-16
      Bug#37251 - Livelock between UPDATE and DELETE threads
      The problem was a bug in deadlock detection code.
      2 threads with mutual dependencies doing deadlock at exactly
      same time both decide there is no deadlock, add themselves
      to waitFor list, thus building a cycle in it. Both threads
      will wait for each other until lock wait timeout happens
      (a deadlock). Meanwhile, any other thread that traverses
      waitFor list may enter the cycle described above and will
      loop until deadlock is resolved.
      
      Solution here is to modify deadlock detection to add dependency
      first, than check, then remove dependency if cycle is found.
      Care is taken to prevent compiler optimizations and hopefully
      the combination of volatile specifier and COMPARE_EXHANGE will
      generate a memory barrier to prevent reading elements while
      they are modified.
      
      Note, that instead of this fancy solution, plain old SyncObject
      could be used.  The loop is short, the lock is cheap,
      but I can't estimate contention for the moment. If current lock-free
      implementation  turns out to be buggy, this alternative can be
      considered.
      
      No testcase provided, the situation described in bug report
      can not be reliably reproduced.

This implementation of the fix looks much better.

This fix is in version 6.0.6

A note has been added to the 6.0.6 changelog: 

When running a concurrent scenario involving transactions, each executing a small number of DELETE and UPDATE operations on a small number of records on FALCON tables, a deadlock could occur