Bug #43554 Intermittent lock-wait-timeout in falcon_deadlock test
Submitted: 11 Mar 2009 9:16 Modified: 26 May 2010 17:47
Reporter: Kevin Lewis Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S3 (Non-critical)
Version:6.0-falcon-team OS:Any
Assigned to: Kevin Lewis CPU Architecture:Any
Tags: F_SYNCHRONIZATION

[11 Mar 2009 9:16] Kevin Lewis
Description:
falcon_team.falcon_deadlock.test has continued to fail intermittently even after Bug#34182 was fixed. A very similar script to the one in falcon_deadlock was added in falcon.falcon_bug_34182.test, which does not fail.

The similar test in falcon_deadlock now fails differently than it did when Bug#34182 was opened.  Most often, this test gets a lock wait timeout.  It seems to be using the default of 50 seconds.  

How to repeat:
Run falcon_deadlock.test multiple times

Suggested fix:
Find out why the wait occurs.  Adjust the test if neccessary.
[11 Mar 2009 9:42] Kevin Lewis
Verified in Pushbuild
[12 Jun 2009 14:57] Kevin Lewis
The other bug fix for Bug#34182 only partially solved the problem.  It was ussumed at that point that if a record is locked recursively within the same transaction, that each lock would be at different savepoints.  But that is not the case in falcon_deadlock.

The testcase for 34182 is like this;

DROP TABLE IF EXISTS t1;
DROP TABLE IF EXISTS t2;
CREATE TABLE t1 (id INTEGER primary key, x INTEGER) ENGINE falcon;
CREATE TABLE t2 (b integer, a integer) ENGINE falcon;
INSERT INTO t1 VALUES (1, 1), (222, 222);
INSERT INTO t2 VALUES (1, 10), (2, 20), (3, 30);
SELECT * FROM t1;
SELECT * FROM t2;
BEGIN;
UPDATE t2 SET a = 100 WHERE b = (SELECT x FROM t1 WHERE id = b FOR UPDATE);

The final UPDATE with a subselect is the focus of these bugs.
Since T1, column id is a primary key, it could be locked directly FOR UPDATE after doing the WHERE from the index.  

Here is a chart of the records;
RecordId   id     x     b     a 
  T1-0      1     1    
  T1-1    222   222
  T2-0                  1    10
  T2-1                  2    20
  T2-2                  3    30

The following list shows the record version changes that occur to the base record as pointer to by the record tree. 

Insert T1-0 Trans=29 Old=(trans=0,SP=0) New=Inserted(trans=29,SP=0)
Insert T1-1 Trans=29 Old=(trans=0,SP=0) New=Inserted(29,0)
Insert T2-0 Trans=31 Old=(trans=0,SP=0) New=Inserted(31,0)
Insert T2-1 Trans=31 Old=(trans=0,SP=0) New=Inserted(31,0)
Insert T2-2 Trans=31 Old=(trans=0,SP=0) New=Inserted(31,0)

Lock   T2-0 TransId=35 Old=Inserted(31,0) New=Locked(35,1)
Lock   T1-0 TransId=35 Old=Inserted(29,0) New=Locked(35,1)
Lock   T2-1 TransId=35 Old=Inserted(31,0) New=Locked(35,1)
Unlock T2-1 TransId=35 Old=Locked(35,1) New=Inserted(31,0)
Lock   T2-2 TransId=35 Old=Inserted(31,0) New=Locked(35,1)
Unlock T2-2 TransId=35 Old=Locked(35,1) New=Inserted(31,0)
 
T1-0 correctly remains locked because it was selected FOR UPDATE.

But in the test for falcon_deadlock, T1 column id is not a primary key.  You can try this by repeating the SQL above except with this instead;
CREATE TABLE t1 (id INTEGER, x INTEGER) ENGINE falcon;

In order to determine which records in T1 match the criteria and therefore should remain locked, the server must fetch each record with a lock and then unlock the ones that that do not match.  It does this for each record in T2.  The following list shows the record version changes that occur to the base record as pointer to by the record tree. 

Insert T1-0 TransId=85 Old=(0,0) New=Inserted(85,0)
Insert T1-1 TransId=85 Old=(0,0) New=Inserted(85,0)
Insert T2-0 TransId=87 Old=(0,0) New=Inserted(87,0)
Insert T2-1 TransId=87 Old=(0,0) New=Inserted(87,0)
Insert T2-2 TransId=87 Old=(0,0) New=Inserted(87,0)
Lock   T2-0 TransId=91 Old=Inserted(87,0) New=Locked(91,1)
Lock   T1-0 TransId=91 Old=Inserted(85,0) New=Locked(91,1)
Lock   T1-1 TransId=91 Old=Inserted(85,0) New=Locked(91,1)
Unlock T1-1 TransId=91 Old=Locked(91,1) New=Inserted(85,0)
Lock   T2-1 TransId=91 Old=Inserted(87,0) New=Locked(91,1)
Unlock T1-0 TransId=91 Old=Locked(91,1) New=Inserted(85,0)
Lock   T1-1 TransId=91 Old=Inserted(85,0) New=Locked(91,1)
Unlock T1-1 TransId=91 Old=Locked(91,1) New=Inserted(85,0)
Unlock T2-1 TransId=91 Old=Locked(91,1) New=Inserted(87,0)
Lock   T2-2 TransId=91 Old=Inserted(87,0) New=Locked(91,1)
Lock   T1-0 TransId=91 Old=Inserted(85,0) New=Locked(91,1)
Unlock T1-0 TransId=91 Old=Locked(91,1) New=Inserted(85,0)
Lock   T1-1 TransId=91 Old=Inserted(85,0) New=Locked(91,1)
Unlock T1-1 TransId=91 Old=Locked(91,1) New=Inserted(85,0)
Unlock T2-2 TransId=91 Old=Locked(91,1) New=Inserted(87,0)

In the first pass through T1, while trying to match for T2-0, it locks T1-0 and does not unlock it since it matched the criteria.  But when it passed through T1 a second time, the server is using the same savepoint to lock and unlock these records a second time.  Falcon uses the existing lock at savepoint 1 for the second lock and then when b=2 does not match id=1, it unlocks T1-0 even though it was supposed to remain locked for update.  It was locked at savepoint 1 and is unlcoked at savepoint 1.

I am looking into how to avoid this.
[12 Jun 2009 18:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76217

2729 Kevin Lewis	2009-06-12
      Bug#43554 - This is an unorthodox, special case fix for a bug that originates from different assumptions about how a storage enginge keeps track of locks.  Falcon will reuse a record lock if it is asked for again within the same transaction.  MySQL server assumes that the storage engine can lock a record twice and unlock it once, leaving it locked for the future of that transaction.  Falcon will not unlock a record if it was locked at an earlier savepoint.  But the server, in the test case for this bug, locks the record twice at the same savepoint.  To make up for this bad assumption in the server, this patch will lower the savepoint ID by 1 if a lock request comes in a second time at the same savepoint.  This will hold a lock slightly longer within the transaction, which if there is a condition where it is not intended, is not as bad as giving up an intended lock.
[18 Jun 2009 16:37] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76585

2733 Kevin Lewis	2009-06-18
      Bug#43554 - Move falcon_deadlock test to suite/falcon