MySQL Bugs: #65890: Deadlock that is not a deadlock with transaction and lock tables

Bug #65890	Deadlock that is not a deadlock with transaction and lock tables
Submitted:	13 Jul 2012 9:17	Modified:	12 Dec 2015 23:23
Reporter:	Konstantin Malov	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Documentation	Severity:	S2 (Serious)
Version:	5.5.X	OS:	Linux (CentOS 6.X)
Assigned to:	Paul DuBois	CPU Architecture:	Any
Tags:	deadlock lock alphabetical

Description:
Here is a report from Maxim Arhipov about misterious deadlocks:

Concurrent transactional queries and table locks cause some strange deadlocks.

Example:
/* Session 1 */ SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; /* ok */
/* Session 1 */ START TRANSACTION; /* ok */
/* Session 1 */ SELECT * FROM B WHERE F=1 FOR UPDATE; /* ok */
/* Session 2 */ LOCK TABLES A WRITE, B WRITE; /* Waits lock of first session */
/* Session 1 */ SELECT * FROM A WHERE F=1; /* Throws "ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction" */

It is strange deadlock. It should not be. Session 2 must wait completion of transaction of session 1 before lock acquiring.
In section "LATEST DETECTED DEADLOCK" of "show engine innodb status" we do not see this deadlock. It is not logged properly. Also in this case Percona status variable innodb_deadlocks doesn't increment. And of course we see "normal" deadlocks. 
Deadlock depends on alphabetical order of table names. If we swap B and A tables in SELECT queries of session 1 then deadlock error is not thrown. 

We saw that strange thing on several 5.5.X versions (5.5.16 and 5.5.24) from Percona and on 5.5.21 from Remi repo.

Summary:
1. Deadlock that should not be.
2. Deadlock is not logged properly.
3. Deadlock depends on table names.

How to repeat:
/* Creating empty tables */
DROP TABLE IF EXISTS A; DROP TABLE IF EXISTS B;
CREATE TABLE A (F INT(11)) ENGINE=InnoDB;
CREATE TABLE B (F INT(11)) ENGINE=InnoDB;

/* Deadlock does not exist */
/* Session 1 */ SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; START TRANSACTION; SELECT * FROM A WHERE F=1 FOR UPDATE;
/* Session 2 */ LOCK TABLES A WRITE, B WRITE;
/* Session 1 */ SELECT * FROM B WHERE F=1;

/* Cleaning */
/* Session 1 */ ROLLBACK;
/* Session 2 */ UNLOCK TABLES;

/* Deadlock exists */
/* Session 1 */ SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; START TRANSACTION; SELECT * FROM B WHERE F=1 FOR UPDATE;
/* Session 2 */ LOCK TABLES A WRITE, B WRITE;
/* Session 1 */ SELECT * FROM A WHERE F=1;

MySQL 5.5.25a :

mysql> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; START TRANSACTION; SELECT * FROM B WHERE F=1 FOR UPDATE;
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Empty set (0.00 sec)

mysql> SELECT * FROM A WHERE F=1;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

Thanks for this bug report. It is verified as described.

Hello!

Thanks you for your report and analysis!

What you observe is expected behavior of metadata locking subsystem and not
a bug. Let me explain what exactly happens and why it happens.

Each transaction or statement need to acquire metadata lock on a table before
using it.

In your case transaction in session 1 needs to acquire weak lock on table A
and then weak lock on table B. At the same time LOCK TABLES WRITE statement
in session 2 needs to acquire strong lock on both table A and B. This strong
lock has to conflict with weak locks from transaction in session 1 as on
tables locked with LOCK TABLE WRITE it should be possible to perform DDL
such as ALTER TABLE or even DROP TABLE without deadlocks or additional
waiting.

Now the deadlock happens because:
- first transaction in session 1 acquires lock on table B,
- LOCK TABLES WRITE in session 2 starts by acquiring lock on A
and then starts waiting for lock on B (i.e. for session 1)
- transaction in session 1 tries to acquire lock on A, starts
waiting for session 2, thus creating deadlock.
- deadlock resolved by aborting wait in session 1 and reporting
deadlock error.

Now several additional notes:

- We can't avoid this deadlock by acquiring all locks for DML transaction
and LOCK TABLES in advance in the same order, since for transactions
which execute statements ad-hoc we can't predict which tables their are
going to use.
- We can't avoid this deadlock by ensuring that LOCK TABLES WRITE
acquires lock on all tables mentioned in it only when all
transactions using these tables are committed. This would mean
that stream of concurrent transactions could easily starve
LOCK TABLES WRITE.
- We are choosing DML transaction and not LOCK TABLES WRITE as a
victim of deadlock resolving in this case since a) this is
backward compatible behavior (AFAIK LOCK TABLES were not
normally aborted with deadlock errors prior to 5.5 and DML
transaction were) b) in general case LOCK TABLES WRITE should
be more rare operation in the context where transactions are
used, so probably it has a special role. (We could have chosen
LOCK TABLES WRITE as a victim and instead of giving out deadlock
error tried to re-acquire all locks. But such approach can easily
lead to starvation of LOCK TABLES WRITE statements again),
- LOCK TABLES WRITE acquires locks on tables in the alphabetic
order. This is done to reduce risk of deadlocks between two
concurrent LOCK TABLES WRITE statements (and between LOCK TABLES
WRITE and DDL as well).
- Since this is metadata-locking subsystem deadlock and not a normal
InnoDB deadlock it is not show in SHOW ENGINE INNODB STATUS and is
not logged in the log.

Still your report has two very valid points:

- Our documentation could have been more clear and elaborate on the subject.
For example, http://dev.mysql.com/doc/refman/5.5/en/metadata-locking.html
could have mentioned that LOCK TABLES WRITE acquires metadata locks
similar to those acquired by DDL statements.

Therefore, I'm converting this bug-report to request to improve documentation.

- Writing information about such deadlocks to the log is in my opinion a valid feature request. Adding some statistical counter for it is a good idea as well.

I have created a separate feature request for this, see:
http://bugs.mysql.com/bug.php?id=69338

Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly.

http://dev.mysql.com/doc/refman/5.5/en/metadata-locking.html mentions effect of LOCK TABLES ... WRITE now.