Bug #50913 Deadlock between open_and_lock_tables_derived and MDL
Submitted: 4 Feb 2010 13:30 Modified: 7 Mar 2010 1:04
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Locking Severity:S2 (Serious)
Version:mysql-next-4284 OS:Any
Assigned to: Dmitry Lenev CPU Architecture:Any

[4 Feb 2010 13:30] Philip Stoev
Description:
When executing the regular DDL/MDL RQG workload (without transactions and with autocommit=on in order to prevent legitimate hangs due to open transactions), mysqld deadlocked with all threads in MDL and one thread as follows:

3  0x00000000009cfd51 in wait_for_lock (wait=0x276bea0, data=0x2768528, in_wait_list=0 '\0') at thr_lock.c:448
#4  0x00000000009d0932 in thr_lock (data=0x2768528, owner=0x277d8a8, lock_type=TL_WRITE) at thr_lock.c:785
#5  0x00000000009d1450 in thr_multi_lock (data=0x26a9288, count=1, owner=0x277d8a8) at thr_lock.c:1054
#6  0x0000000000628752 in mysql_lock_tables (thd=0x277cb08, tables=0x27794b0, count=1, flags=0, need_reopen=0x7fd4bd471feb) at lock.cc:320
#7  0x000000000069514e in lock_tables (thd=0x277cb08, tables=0x2778908, count=1, flags=0, need_reopen=0x7fd4bd471feb) at sql_base.cc:5435
#8  0x000000000069cc02 in open_and_lock_tables_derived (thd=0x277cb08, tables=0x2778908, derived=true, flags=0, prelocking_strategy=0x7fd4bd472030)
    at sql_base.cc:5117
#9  0x000000000064cc2e in open_and_lock_tables_derived (thd=0x277cb08, tables=0x2778908, derived=true, flags=0) at mysql_priv.h:1561
#10 0x000000000064cc69 in open_and_lock_tables (thd=0x277cb08, tables=0x2778908) at mysql_priv.h:1571
#11 0x0000000000645202 in mysql_execute_command (thd=0x277cb08) at sql_parse.cc:3101
#12 0x0000000000649bc2 in mysql_parse (thd=0x277cb08, inBuf=0x2778258 "DELETE   A FROM testdb_N . t1_base1_N  AS A WHERE `pk` BETWEEN 4 AND 4 + 1",
    length=74, found_semicolon=0x7fd4bd473ee0) at sql_parse.cc:5586
#13 0x000000000064a7db in dispatch_command (command=COM_QUERY, thd=0x277cb08,
    packet=0x277f779 "DELETE   A FROM testdb_N . t1_base1_N  AS A WHERE `pk` BETWEEN 4 AND 4 + 1", packet_length=74) at sql_parse.cc:1020
#14 0x000000000064bc3a in do_command (thd=0x277cb08) at sql_parse.cc:706
---Type <return> to continue, or q <return> to quit---
#15 0x000000000063a6f0 in do_handle_one_connection (thd_arg=0x277cb08) at sql_connect.cc:1172
#16 0x000000000063a7af in handle_one_connection (arg=0x277cb08) at sql_connect.cc:1112
#17 0x000000315b0073da in start_thread () from /lib64/libpthread.so.0
#18 0x000000315a4e627d in clone () from /lib64/libc.so.6

How to repeat:
Core and binary will be uploaded shortly.
[4 Feb 2010 13:30] Philip Stoev
c:\docs\mysql\bugs\bug50913.threads

Attachment: bug50913.threads.txt (text/plain), 32.80 KiB.

[4 Feb 2010 13:36] Philip Stoev
Core and binary:

http://mysql-systemqa.s3.amazonaws.com/var-bug50913.zip

Source:

revision-id: jon.hauglid@sun.com-20100204090036-bu3x1qwj4wjg3xd6
date: 2010-02-04 10:00:36 +0100
build-date: 2010-02-04 15:26:48 +0200
revno: 3073
branch-nick: mysql-next-4284
[8 Feb 2010 20:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/99636

3086 Dmitry Lenev	2010-02-08
      Fix for bug #50913 "Deadlock between open_and_lock_tables_derived
      and MDL".
      
      Concurrent execution of a multi-DELETE statement and ALTER
      TABLE statement which affected one of the tables used in
      the multi-DELETE sometimes led to deadlock.
      Similar deadlocks might have occured when one performed
      INSERT/UPDATE/DELETE on a view and concurrently executed
      ALTER TABLE for the view's underlying table, or when one
      concurrently executed TRUNCATE TABLE for InnoDB table and
      ALTER TABLE for the same table.
      
      These deadlocks were caused by a discrepancy between types of
      metadata and thr_lock.cc locks acquired by those statements.
      
      What happened was that multi-DELETE/TRUNCATE/DML-through-the-
      view statement in the first connection acquired SR lock on a
      table, then ALTER TABLE would come in in the second connection
      and acquire SNW metadata lock and TL_WRITE_ALLOW_READ
      thr_lock.c lock and then would start waiting for the first
      connection during lock upgrade. After that the statement in
      the first connection would try to acquire TL_WRITE lock on
      table and would start waiting for the second connection,
      creating a deadlock.
      
      This patch solves this problem by ensuring that we acquire
      SW metadata lock in all cases in which we acquiring write
      thr_lock.c lock. This guarantees that deadlocks like the
      one described above won't occur since all lock conflicts
      in such situation are resolved within MDL subsystem.
      
      This patch also adds assert which should guarantee that
      such situations won't arise in future.
     @ mysql-test/r/lock_multi.result
        Added main test for bug #50913 "Deadlock between
        open_and_lock_tables_derived and MDL".
     @ mysql-test/r/mdl_sync.result
        Added additional coverage for bug #50913 "Deadlock
        between open_and_lock_tables_derived and MDL".
     @ mysql-test/t/lock_multi.test
        Added main test for bug #50913 "Deadlock between
        open_and_lock_tables_derived and MDL".
     @ mysql-test/t/mdl_sync.test
        Added additional coverage for bug #50913 "Deadlock
        between open_and_lock_tables_derived and MDL".
     @ sql/lock.cc
        Added assert that enforces that when we are locking
        a non-temporary table we have an appropriate type of
        metadata lock on this table.
     @ sql/mysql_priv.h
        Added separate flag for open_tables() to be able specify that
        SH metadata locks on table to be open should be acquired.
        We can no longer use MYSQL_LOCK_IGNORE_FLUSH flag for this
        as in addition to use in I_S implementation it is also used
        for opening system tables. Since in the latter case we also
        acquire thr_lock.c locks using SH metadata lock in it instead
        of SR or SW locks may lead to deadlock.
     @ sql/sql_base.cc
        When opening tables don't interpret MYSQL_LOCK_IGNORE_FLUSH
        flag as request to acquire SH metadata locks. This flag is
        also used for opening system tables for which we also take
        thr_lock.c locks and thus proper metadata lock to take in
        this case is SR or SW lock (otherwise deadlocks can occur).
        In cases when SH lock is really required (e.g. when tables
        are open by I_S implementation) we rely on that newly
        introduced MYSQL_OPEN_FORCE_SHARED_HIGH_PRIO_MDL flag is
        used.
     @ sql/sql_delete.cc
        mysql_truncate_by_delete():
          Adjust type of metadata lock to be requested after changing
          type of thr_lock.c lock for table list element from one
          which was set in parser to TL_WRITE.
          This removes discrepancy between types of these locks which
          allowed deadlocks to creep in.
     @ sql/sql_handler.cc
        When closing table which was open by HANDLER statement clear
        TABLE::open_by_handler flag. This allows to use this flag as
        a reliable indication that TABLE instance was open by HANDLER
        statement in assert which was added to mysql_lock_tables().
     @ sql/sql_parse.cc
        multi_delete_set_locks_and_link_aux_tables():
          Adjust type of metadata lock to be requested after changing
          type of thr_lock.c lock for table list element from one
          which was set in parser to TL_WRITE.
          This removes discrepancy between types of these locks which
          allowed deadlocks to creep in.
     @ sql/sql_show.cc
        Use newly introduced MYSQL_OPEN_FORCE_SHARED_HIGH_PRIO_MDL
        flag in order to acquire SH metadata locks when opening tables
        in I_S implementation.
     @ sql/sql_update.cc
        Added comment explaining why in multi-update after deciding
        that we need weaker thr_lock.c lock on a table we don't
        downgrade metadata lock on it.
     @ sql/sql_view.cc
        When merging view into main statement adjust type of metadata
        lock to be requested after changing type of thr_lock.c lock
        for table. This removes discrepancy between types of these
        locks which allowed deadlocks to creep in.
[9 Feb 2010 9:59] Dmitry Lenev
Fix for this bug was pushed into mysql-next-4284 tree. Since this bug was not repeatable outside of this tree and it is not publicly available there is nothing to document. So I am simply closing this bug report.
[16 Feb 2010 16:46] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100216101445-2ofzkh48aq2e0e8o) (version source revid:jon.hauglid@sun.com-20100211140522-unpky24gmq8fkhhj) (merge vers: 6.0.14-alpha) (pib:16)
[16 Feb 2010 16:55] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100216101208-33qkfwdr0tep3pf2) (version source revid:dlenev@mysql.com-20100208201955-ma7wws6qhymcovox) (pib:16)
[6 Mar 2010 10:56] Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@mysql.com-20100216221947-luyhph0txl2c5tc8) (merge vers: 5.5.99-m3) (pib:16)
[7 Mar 2010 1:04] Paul DuBois
No changelog entry needed.