Bug #46673 Deadlock between FLUSH TABLES WITH READ LOCK and DML
Submitted: 12 Aug 2009 12:39 Modified: 7 Mar 2010 1:55
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Locking Severity:S2 (Serious)
Version:5.4, azalea OS:Any
Assigned to: Dmitry Lenev CPU Architecture:Any
Tags: mdl, regression

[12 Aug 2009 12:39] Philip Stoev
Description:
If two threads execute the following workload:
Thread1 : START TRANSACTION + DML ;
Thread2 : FLUSH TABLES WITH READ LOCK + UNLOCK TABLES;

the server will deadlock almost immediately with the two threads in the following backtraces:

# 15:36:13 #1  0x0000000000a15df1 in safe_cond_wait (cond=0x1059420, mp=0x1058740, file=0xba896d "sql_base.cc", line=1056) at thr_mutex.c:423
# 15:36:13 #2  0x00000000006d99cf in close_cached_tables (thd=0x7f06f0ccd1b8, tables=0x0, have_lock=false, wait_for_refresh=true) at sql_base.cc:1056
# 15:36:13 #3  0x00000000006771f3 in reload_acl_and_cache (thd=0x7f06f0ccd1b8, options=16388, tables=0x0, write_to_binlog=0x7f06edf742af) at sql_parse.cc:6890
# 15:36:13 #4  0x00000000006828fb in mysql_execute_command (thd=0x7f06f0ccd1b8) at sql_parse.cc:4050
# 15:36:13 #5  0x0000000000685591 in mysql_parse (thd=0x7f06f0ccd1b8, inBuf=0x2b0e230 "FLUSH TABLES WITH READ LOCK", length=27, found_semicolon=0x7f06edf74f00)

and

# 15:36:13 #1  0x0000000000a15df1 in safe_cond_wait (cond=0x10594a0, mp=0x10589c0, file=0xb43037 "lock.cc", line=1307) at thr_mutex.c:423
# 15:36:13 #2  0x000000000065fe83 in wait_if_global_read_lock (thd=0x7f06f0cbe808, abort_on_refresh=false, is_not_commit=true) at lock.cc:1307
# 15:36:13 #3  0x0000000000680042 in mysql_execute_command (thd=0x7f06f0cbe808) at sql_parse.cc:3338
# 15:36:13 #4  0x0000000000685591 in mysql_parse (thd=0x7f06f0cbe808, inBuf=0x29fa560 "DELETE FROM `C` LIMIT 1", length=23, found_semicolon=0x7f06edf33f00)
# 15:36:13     at sql_parse.cc:5942

5.1 is not affected. A transactional storage engine is not required however autocommit should be off.

How to repeat:
RQG grammar:

thread1:
        START TRANSACTION | DELETE FROM _table LIMIT 1 ;

thread2:
        FLUSH TABLES WITH READ LOCK ; UNLOCK TABLES ;

RQG command line:

$ perl runall.pl \
  --basedir=/path/to/azalea \
  --grammar=/path/to/grammar \
  --threads=2 \
  --reporter=Deadlock,Backtrace
[13 Aug 2009 5:48] Sveta Smirnova
Thank you for the report.

Verified as described. Version 5.1 is not affected.
[19 Aug 2009 9:03] Dmitry Lenev
Here is the simple script for mysqltest which allows to reproduce this bug:

create table t1 (i int);

connect (connection_aux,localhost,root,,test,,);
connection connection_aux;

begin;
insert into t1 values (1);

connection default;
--send flush tables with read lock;

connection connection_aux;

--sleep 1

--echo # Below statement hangs...
delete from t1 where i = 1;
[28 Aug 2009 11:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/81847

2803 Dmitry Lenev	2009-08-28
      Fix for bug #46673 "Deadlock between FLUSH TABLES WITH READ LOCK and DML".
      
      Deadlocks occured when one concurrently executed transactions with
      several statements modifying data and FLUSH TABLES WITH READ LOCK
      statement or SET READ_ONLY=1 statement.
      
      These deadlocks were introduced by the patch for WL 4284: "Transactional
      DDL locking"/Bug 989: "If DROP TABLE while there's an active transaction,
      wrong binlog order" which has changed FLUSH TABLES WITH READ LOCK/SET
      READ_ONLY=1 to wait for pending transactions.
      What happened was that FLUSH TABLES WITH READ LOCK blocked all further 
      statements changing tables by setting global_read_lock global variable
      and has started waiting for all pending transactions to complete. 
      Then one of those transactions tried to executed DML, detected that
      global_read_lock non-zero and tried to wait until global read lock will 
      be released (i.e. global_read_lock becomes 0), indeed, this led to a
      deadlock.
      
      Proper solution for this problem should probably involve full integration 
      of global read lock with metadata locking subsystem (which will allow to 
      implement waiting for pending transactions without blocking DML in them).
      But since it requires significant changes another, short-term solution 
      for the problem is implemented in this patch.
       
      Basically, this patch restores behavior of FLUSH TABLES WITH READ LOCK/
      SET READ_ONLY=1 before the patch for WL 4284/bug 989. By ensuring that
      extra references to TABLE_SHARE are not stored for active metadata locks 
      it changes these statements not to wait for pending transactions. 
      As result deadlock is eliminated.
      Note that this does not change the fact that active FLUSH TABLES WITH
      READ LOCK lock or SET READ_ONLY=1 prevent modifications to tables as
      they also block transaction commits.
     @ mysql-test/r/flush_block_commit.result
        Adjusted test case after change in FLUSH TABLES WITH READ LOCK behavior
        - it is no longer blocked by a pending transaction.
     @ mysql-test/r/mdl_sync.result
        Added test for bug #46673 "Deadlock between FLUSH TABLES WITH READ LOCK
        and DML".
     @ mysql-test/r/read_only_innodb.result
        Adjusted test case after change in SET READ_ONLY behavior - it is no
        longer blocked by a pending transaction.
     @ mysql-test/t/flush_block_commit.test
        Adjusted test case after change in FLUSH TABLES WITH READ LOCK behavior
        - it is no longer blocked by a pending transaction.
     @ mysql-test/t/mdl_sync.test
        Added test for bug #46673 "Deadlock between FLUSH TABLES WITH READ LOCK
        and DML".
     @ mysql-test/t/read_only_innodb.test
        Adjusted test case after change in SET READ_ONLY behavior - it is no
        longer blocked by a pending transaction.
     @ sql/sql_base.cc
        Disable caching of pointers to TABLE_SHARE objects in MDL subsystem.
        This means that transactions holding metadata lock on the table will
        no longer have extra reference to the TABLE_SHARE (due to this lock)
        and will no longer block concurrent FLUSH TABLES/FLUSH TABLES WITH
        READ LOCK. Note that this does not change the fact that FLUSH TABLES
        WITH READ LOCK prevents concurrent transactions from modifying data
        as it also blocks all commits.
[1 Sep 2009 15:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/82160

2805 Dmitry Lenev	2009-09-01
      Fix for bug #46673 "Deadlock between FLUSH TABLES WITH READ LOCK and DML".
      
      Deadlocks occured when one concurrently executed transactions with
      several statements modifying data and FLUSH TABLES WITH READ LOCK
      statement or SET READ_ONLY=1 statement.
      
      These deadlocks were introduced by the patch for WL 4284: "Transactional
      DDL locking"/Bug 989: "If DROP TABLE while there's an active transaction,
      wrong binlog order" which has changed FLUSH TABLES WITH READ LOCK/SET
      READ_ONLY=1 to wait for pending transactions.
      What happened was that FLUSH TABLES WITH READ LOCK blocked all further 
      statements changing tables by setting global_read_lock global variable
      and has started waiting for all pending transactions to complete. 
      Then one of those transactions tried to executed DML, detected that
      global_read_lock non-zero and tried to wait until global read lock will 
      be released (i.e. global_read_lock becomes 0), indeed, this led to a
      deadlock.
      
      Proper solution for this problem should probably involve full integration 
      of global read lock with metadata locking subsystem (which will allow to 
      implement waiting for pending transactions without blocking DML in them).
      But since it requires significant changes another, short-term solution 
      for the problem is implemented in this patch.
       
      Basically, this patch restores behavior of FLUSH TABLES WITH READ LOCK/
      SET READ_ONLY=1 before the patch for WL 4284/bug 989. By ensuring that
      extra references to TABLE_SHARE are not stored for active metadata locks 
      it changes these statements not to wait for pending transactions. 
      As result deadlock is eliminated.
      Note that this does not change the fact that active FLUSH TABLES WITH
      READ LOCK lock or SET READ_ONLY=1 prevent modifications to tables as
      they also block transaction commits.
     @ mysql-test/r/flush_block_commit.result
        Adjusted test case after change in FLUSH TABLES WITH READ LOCK behavior
        - it is no longer blocked by a pending transaction.
     @ mysql-test/r/mdl_sync.result
        Added test for bug #46673 "Deadlock between FLUSH TABLES WITH READ LOCK
        and DML".
     @ mysql-test/r/read_only_innodb.result
        Adjusted test case after change in SET READ_ONLY behavior - it is no
        longer blocked by a pending transaction.
     @ mysql-test/t/flush_block_commit.test
        Adjusted test case after change in FLUSH TABLES WITH READ LOCK behavior
        - it is no longer blocked by a pending transaction.
     @ mysql-test/t/mdl_sync.test
        Added test for bug #46673 "Deadlock between FLUSH TABLES WITH READ LOCK
        and DML".
     @ mysql-test/t/read_only_innodb.test
        Adjusted test case after change in SET READ_ONLY behavior - it is no
        longer blocked by a pending transaction.
     @ sql/sql_base.cc
        Disable caching of pointers to TABLE_SHARE objects in MDL subsystem.
        This means that transactions holding metadata lock on the table will
        no longer have extra reference to the TABLE_SHARE (due to this lock)
        and will no longer block concurrent FLUSH TABLES/FLUSH TABLES WITH
        READ LOCK. Note that this does not change the fact that FLUSH TABLES
        WITH READ LOCK prevents concurrent transactions from modifying data
        as it also blocks all commits.
[1 Sep 2009 15:58] Dmitry Lenev
Fix for this bug was queued into mysql-next-bugfixing tree.
[10 Sep 2009 14:07] Jon Olav Hauglid
Marked bug#47024 as a duplicate of this bug.
[15 Sep 2009 13:52] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090915134838-5nj3ycjfsqc2vr2f) (version source revid:dlenev@mysql.com-20090901155705-j0oajtoa9d21pxqg) (merge vers: 5.4.4-alpha) (pib:11)
[16 Sep 2009 1:01] Paul DuBois
Noted in 5.4.4 changelog.

Deadlock could occur when executing transactions containing
data-modifying statements at the same time as FLUSH TABLES WITH READ
LOCK or SET read_only=1 statements.
[9 Dec 2009 22:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/93400

3015 Konstantin Osipov	2009-12-09
      Backport of:
      ------------------------------------------------------------
      revno: 2617.68.10
      committer: Dmitry Lenev <dlenev@mysql.com>
      branch nick: mysql-next-bg46673
      timestamp: Tue 2009-09-01 19:57:05 +0400
      message:
        Fix for bug #46673 "Deadlock between FLUSH TABLES WITH READ LOCK and DML".
      
        Deadlocks occured when one concurrently executed transactions with
        several statements modifying data and FLUSH TABLES WITH READ LOCK
        statement or SET READ_ONLY=1 statement.
      
        These deadlocks were introduced by the patch for WL 4284: "Transactional
        DDL locking"/Bug 989: "If DROP TABLE while there's an active transaction,
        wrong binlog order" which has changed FLUSH TABLES WITH READ LOCK/SET
        READ_ONLY=1 to wait for pending transactions.
        What happened was that FLUSH TABLES WITH READ LOCK blocked all further
        statements changing tables by setting global_read_lock global variable
        and has started waiting for all pending transactions to complete.
        Then one of those transactions tried to executed DML, detected that
        global_read_lock non-zero and tried to wait until global read lock will
        be released (i.e. global_read_lock becomes 0), indeed, this led to a
        deadlock.
      
        Proper solution for this problem should probably involve full integration
        of global read lock with metadata locking subsystem (which will allow to
        implement waiting for pending transactions without blocking DML in them).
        But since it requires significant changes another, short-term solution
        for the problem is implemented in this patch.
      
        Basically, this patch restores behavior of FLUSH TABLES WITH READ LOCK/
        SET READ_ONLY=1 before the patch for WL 4284/bug 989. By ensuring that
        extra references to TABLE_SHARE are not stored for active metadata locks
        it changes these statements not to wait for pending transactions.
        As result deadlock is eliminated.
        Note that this does not change the fact that active FLUSH TABLES WITH
        READ LOCK lock or SET READ_ONLY=1 prevent modifications to tables as
        they also block transaction commits.
     @ mysql-test/r/flush_block_commit.result
        Adjusted test case after change in FLUSH TABLES WITH READ LOCK behavior
        - it is no longer blocked by a pending transaction.
     @ mysql-test/r/mdl_sync.result
        Added test for bug #46673 "Deadlock between FLUSH TABLES WITH READ LOCK
        and DML".
     @ mysql-test/r/read_only_innodb.result
        Adjusted test case after change in SET READ_ONLY behavior - it is no
        longer blocked by a pending transaction.
     @ mysql-test/t/flush_block_commit.test
        Adjusted test case after change in FLUSH TABLES WITH READ LOCK behavior
        - it is no longer blocked by a pending transaction.
     @ mysql-test/t/mdl_sync.test
        Added test for bug #46673 "Deadlock between FLUSH TABLES WITH READ LOCK
        and DML".
     @ mysql-test/t/read_only_innodb.test
        Adjusted test case after change in SET READ_ONLY behavior - it is no
        longer blocked by a pending transaction.
     @ sql/sql_base.cc
        Disable caching of pointers to TABLE_SHARE objects in MDL subsystem.
        This means that transactions holding metadata lock on the table will
        no longer have extra reference to the TABLE_SHARE (due to this lock)
        and will no longer block concurrent FLUSH TABLES/FLUSH TABLES WITH
        READ LOCK. Note that this does not change the fact that FLUSH TABLES
        WITH READ LOCK prevents concurrent transactions from modifying data
        as it also blocks all commits.
[16 Feb 2010 16:49] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100216101445-2ofzkh48aq2e0e8o) (version source revid:kostja@sun.com-20091211154405-c9yhiewr9o5d20rq) (merge vers: 6.0.14-alpha) (pib:16)
[16 Feb 2010 16:59] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100216101208-33qkfwdr0tep3pf2) (version source revid:kostja@sun.com-20091209155634-6mypyruinnqn3sq9) (pib:16)
[16 Feb 2010 18:40] Dmitry Lenev
Closing this bug as it is not repeatable in any publicly available trees with versions < 6.0.
[6 Mar 2010 11:07] Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@mysql.com-20100216221947-luyhph0txl2c5tc8) (merge vers: 5.5.99-m3) (pib:16)
[7 Mar 2010 1:55] Paul DuBois
No changelog entry needed.