Bug #54436 Deadlock on concurrent FLUSH WITH READ LOCK, ALTER TABLE, ANALYZE TABLE
Submitted: 11 Jun 2010 16:55 Modified: 17 Mar 2011 19:04
Reporter: Matthias Leich Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server: Locking Severity:S3 (Non-critical)
Version:5.1.38-bzr, mysql-trunk-runtime OS:Any
Assigned to: CPU Architecture:Any
Tags: deadlock

[11 Jun 2010 16:55] Matthias Leich
Description:
RQG grammar deadlock.yy
-----------------------
query:
        ALTER TABLE t1 ENGINE = InnoDB |
        ANALYZE TABLE t1               |
        FLUSH TABLES WITH READ LOCK ; SELECT SLEEP( 0.01 ) ; UNLOCK TABLES;

thread1_init:
        CREATE TABLE t1 (col1 BIGINT);

The RQG zz grammar for the table generation is not required.
So you can assign whatever you want.

Result on mysql-trunk-runtime revno: 3052 2010-06-11
(5.5.5-m3-debug-log)
----------------------------------------------------
...
nearly no activity on the server
...
SHOW PROCESSLIST output (a bit shortened) is very stable except that
the Time goes up
Id Command Time  State                          Info
1   Sleep  57  NULL
5   Query  56  Waiting for release of readlock  ANALYZE TABLE t1
6   Query  56  Flushing tables                  FLUSH TABLES WITH READ LOCK
7   Query  54  Waiting for release of readlock  ALTER TABLE t1 ENGINE = InnoDB
8   Query  54  Flushing tables                  FLUSH TABLES WITH READ LOCK
9   Query  56  Waiting for release of readlock  ALTER TABLE t1 ENGINE = InnoDB
10  Query  56  Waiting for release of readlock  ALTER TABLE t1 ENGINE = InnoDB
11  Query  56  Flushing tables                  FLUSH TABLES WITH READ LOCK
12  Query  54  Flushing tables                  FLUSH TABLES WITH READ LOCK
13  Query  56  Waiting for release of readlock  ALTER TABLE t1 ENGINE = InnoDB
14  Query  56  Waiting for release of readlock  ALTER TABLE t1 ENGINE = InnoDB
15  Query  56  Flushing tables                  FLUSH TABLES WITH READ LOCK
16  Query  54  Flushing tables                  FLUSH TABLES WITH READ LOCK
17  Query  55  Waiting for release of readlock  ALTER TABLE t1 ENGINE = InnoDB
18  Query  55  Waiting for release of readlock  ALTER TABLE t1 ENGINE = InnoDB
19  Query  55  Flushing tables                  FLUSH TABLES WITH READ LOCK
20  Query  51  Flushing tables                  FLUSH TABLES WITH READ LOCK
36  Query   0  NULL                             SHOW PROCESSLIST

Note:
  The deadlock disappears if I remove either
  - ALTER TABLE t1 ENGINE = InnoDB    or
  - ANALYZE TABLE t1                  or
  - FLUSH TABLES WITH READ LOCK ; SELECT SLEEP( 0.01 ) ; UNLOCK TABLES;

How to repeat:
Command from my Linux box
-------------------------
export MTR_BUILD_THREAD=255
mkdir /dev/shm/var_"$MTR_BUILD_THREAD"
nice -20 perl runall.pl \
--debug \
--duration=200 \
--queries=3000 \
--reporter=Deadlock,Backtrace \
--threads=16 \
--basedir=/work2/repo/mysql-trunk-runtime \
--mysqld=--lock-wait-timeout=2 \
--mysqld=--log-output=table \
--mysqld=--secure-file-priv=/tmp/ \
--mysqld=--innodb-lock-wait-timeout=1 \
--mysqld=--loose-innodb-lock-wait-timeout=1 \
--vardir=/dev/shm/var_"$MTR_BUILD_THREAD" \
--gendata=conf/runtime/WL5004_data.zz \
--grammar=./deadlock.yy \
--seed=54 | tee deadlock.prt 2>&1
[11 Jun 2010 17:10] Matthias Leich
RQG output with backtrace

Attachment: deadlock.prt (application/octet-stream, text), 73.60 KiB.

[11 Jun 2010 20:50] Dmitry Lenev
I was able to repeat this bug on 5.1-based tree mysql-5.1-bugteam.
[1 Sep 2010 11:47] Jon Olav Hauglid
Not able to repeat this bug in 5.5 (5.5-runtime tree).
Still able to repeat it in 5.1.
[9 Mar 2011 16:38] J Jorgenson
This problem is easily replicated under 5.1.49sp1 with a simple test:

  CREATE TABLE x (id INT, primary key(id));
  CREATE TABLE y (fname VCHAR(25), primary key(fname));

Connection A:
  LOCK TABLE x WRITE;
  FLUSH TABLE;

Connection B:
  LOCK TABLE y WRITE;
  FLUSH TABLE;  
   <-- CONNECTION hangs with "Flushing Tables" until 
       connection A issues 'UNLOCK TABLES' or disconnects.

Because of this, our various processes acting upon unrelated
tables is forced to execute sequentially!  

I'd like to vote this bug up to a more impactful severity because it affects all production activities in an system with more than 1 active process.
[9 Mar 2011 18:00] Davi Arnaut
J Jorgenson,

What you describe is not a bug. FLUSH TABLE, which is a synonym for FLUSH TABLES, attempts to close (flush) all open table instances. Giving that holding a lock on a table instance prevents it from being flushed, FLUSH TABLES won't succeed until other sessions release their table locks.
[9 Mar 2011 21:57] J Jorgenson
Sorry. I had missed putting in the specific table being flushed: 
Connection A)
 LOCK TABLE x WRITE;
 FLUSH TABLE x;

Connection B)
 LOCK TABLE y WRITE;
 FLUSH TABLE y;
[9 Mar 2011 22:05] J Jorgenson
Here is another scenario where unrelated connections block each other:

Connection A)
  LOCK TABLE x WRITE;
  ANALYZE TABLE x;

Connection B)
  LOCK TABLE y WRITE;
  FLUSH TABLE y;  <-- Connection hangs, waiting for 'A'.

Connection A)
  UNLOCK TABLES;  <-- Connection 'B' resumes processing.
[9 Mar 2011 22:52] Davi Arnaut
Indeed, these are bugs in the FLUSH TABLE implementation in 5.1. These and other issues have been addressed in MySQL 5.5, please consider upgrading.
[17 Mar 2011 19:04] Omer Barnir
Issue is fixed in 5.5 and will not be back ported to 5.1
[17 Mar 2011 21:09] J Jorgenson
For reference by others whom experience this same issue:

We have subsequently found a patch 75044 that can be applied to 5.1  (we tested using 5.1.49sp1) which can be found at: http://lists.mysql.com/commits/75044 and was pushed into the 5.5.3 release as mentioned in bug:  http://bugs.mysql.com/bug.php?id=43685 .  

This patch is very small and is specific to resolving the 'Flush Table x' locking issue implied by a global flush. 

The performance gains of this patch were very significant, due to the complete removal of locks by table specific flushes, implied by waiting for unrelated tables to be unlocked.  Our systems, which heavily rely upon singular table flushing, went from processing running in a serialized fashion to heavily parallel high-performance environment.

-- JJ --