Bug #1866 Attempting to insert into a gap you hold a row lock on can deadlock
Submitted: 17 Nov 2003 20:22 Modified: 18 Nov 2003 5:14
Reporter: Will Bryant Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:4.0.16 OS:Linux (Linux)
Assigned to: CPU Architecture:Any

[17 Nov 2003 20:22] Will Bryant
Description:
While using InnoDB tables and the repeatable read isolation level, the following sequence of events:

1. One connection does a non-insert update operation (UPDATE, SELECT ... FOR UPDATE, or DELETE) but there is no matching row, giving this connection a lock on that gap
2. A second connection does a non-insert update operation on the same gap (blocks, waiting for the gap lock)
3. The first connection then tries to do an INSERT into that gap

is incorrectly detected as a deadlock (and the first connection's transaction is rolled back).

In other words, if one connection attempts to insert into a gap that it already holds a lock on, a deadlock condition is encountered if another connection has started waiting on that gap lock in the meantime (eg. by attempting to update that index position).

This is not correct behaviour; the connection already holds the lock on that index position, so it doesn't need to re-acquire it and so there is in fact no deadlock.

How to repeat:
To set up, run:
CREATE TABLE test (pri_id INT NOT NULL PRIMARY KEY, other_data INT NOT NULL) TYPE=InnoDB;

Use two connections, one for the statements on the left, the other for the statements on the right, and use "SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; SET AUTOCOMMIT=0;" on both.

UPDATE test 
   SET other_data = 42
 WHERE (pri_id = 1);
--- should indicate that 0 rows were updated
--- we now have the lock on the gap

                                          UPDATE test
                                             SET other_data = 23
                                           WHERE (pri_id = 1);
                                          --- blocks, as it should, 
                                          --- waiting for the lock

INSERT INTO test
   SET pri_id = 1,
       other_data = 42;
--- the deadlock detector detects a
--- deadlock and rolls back the transaction

                                          --- this statement then executes
                                          --- (and updates 0 rows)

Interestingly it is necessary for the first statement on both connections to be something other than an INSERT (SELECT ... FOR UPDATE does the same thing, but everything works fine if you try and do INSERTs).  So it is specifically to do with the locking of _gaps_ for update.

Suggested fix:
The first transaction should continue successfully, the second transaction should continue blocking until the first commits or rolls back, and the second should then continue (in this example, updating one row if the first committed, or zero rows if it rolled back).

I'm not sure whether this is a problem with the deadlock detector or with the locking logic; but either way, this is not a real deadlock condition - the first connection already holds the lock, so it doesn't need to re-acquire it and so there is in fact no deadlock.

In more general terms, while waiting on a lock should be FIFO, if you already hold the lock in question you certainly shouldn't wait at all.

Further evidence for this argument is that, as above, if you do an INSERT and then an UPDATE on that inserted row, it won't deadlock - the second connection keeps waiting correctly - so it is a problem specifically with it being a gap lock.
[18 Nov 2003 5:14] Heikki Tuuri
Hi!

The problem is that an index cursor in InnoDB never reverses if it needs to wait for a lock.

Suppose we have index records

"AA" "AC",

T1 has a next-key lock on "AC", and T2 is waiting for a next-key lock on "AC". 

Can T1 insert "AB" to the gap between "AA" and "AC"? No, because the index cursor of T1 has already passed that position and is waiting on "AC". If T1 is doing a SELECT which should see "AB", then allowing T2 to insert would cause nonserializable execution.

Since purge can remove index records, it can happen that 2 transactions have a GRANTED lock on the same gap. That is why I have chosen in InnoDB that a lock on a gap only has an inhibitive action: it can prevent an insert by another trx, but it does not give a transaction a right to insert.

We could optimize this by letting T1 to reverse its cursor and temporarily release the waiting lock request on "AC", if T2 wants to insert to the gap. But it is not easy to make a correct algorithm.

Best regards,

Heikki
[18 Nov 2003 5:16] Heikki Tuuri
Correct the typo:

Can T1 insert "AB" to the gap between "AA" and "AC"? No, because the index
cursor of T2...

Regards,

Heikki