| Bug #50913 | Deadlock between open_and_lock_tables_derived and MDL | ||
|---|---|---|---|
| Submitted: | 4 Feb 2010 13:30 | Modified: | 7 Mar 2010 1:04 |
| Reporter: | Philip Stoev | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Locking | Severity: | S2 (Serious) |
| Version: | mysql-next-4284 | OS: | Any |
| Assigned to: | Dmitry Lenev | CPU Architecture: | Any |
[4 Feb 2010 13:30]
Philip Stoev
c:\docs\mysql\bugs\bug50913.threads
Attachment: bug50913.threads.txt (text/plain), 32.80 KiB.
[4 Feb 2010 13:36]
Philip Stoev
Core and binary: http://mysql-systemqa.s3.amazonaws.com/var-bug50913.zip Source: revision-id: jon.hauglid@sun.com-20100204090036-bu3x1qwj4wjg3xd6 date: 2010-02-04 10:00:36 +0100 build-date: 2010-02-04 15:26:48 +0200 revno: 3073 branch-nick: mysql-next-4284
[8 Feb 2010 20:20]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/99636 3086 Dmitry Lenev 2010-02-08 Fix for bug #50913 "Deadlock between open_and_lock_tables_derived and MDL". Concurrent execution of a multi-DELETE statement and ALTER TABLE statement which affected one of the tables used in the multi-DELETE sometimes led to deadlock. Similar deadlocks might have occured when one performed INSERT/UPDATE/DELETE on a view and concurrently executed ALTER TABLE for the view's underlying table, or when one concurrently executed TRUNCATE TABLE for InnoDB table and ALTER TABLE for the same table. These deadlocks were caused by a discrepancy between types of metadata and thr_lock.cc locks acquired by those statements. What happened was that multi-DELETE/TRUNCATE/DML-through-the- view statement in the first connection acquired SR lock on a table, then ALTER TABLE would come in in the second connection and acquire SNW metadata lock and TL_WRITE_ALLOW_READ thr_lock.c lock and then would start waiting for the first connection during lock upgrade. After that the statement in the first connection would try to acquire TL_WRITE lock on table and would start waiting for the second connection, creating a deadlock. This patch solves this problem by ensuring that we acquire SW metadata lock in all cases in which we acquiring write thr_lock.c lock. This guarantees that deadlocks like the one described above won't occur since all lock conflicts in such situation are resolved within MDL subsystem. This patch also adds assert which should guarantee that such situations won't arise in future. @ mysql-test/r/lock_multi.result Added main test for bug #50913 "Deadlock between open_and_lock_tables_derived and MDL". @ mysql-test/r/mdl_sync.result Added additional coverage for bug #50913 "Deadlock between open_and_lock_tables_derived and MDL". @ mysql-test/t/lock_multi.test Added main test for bug #50913 "Deadlock between open_and_lock_tables_derived and MDL". @ mysql-test/t/mdl_sync.test Added additional coverage for bug #50913 "Deadlock between open_and_lock_tables_derived and MDL". @ sql/lock.cc Added assert that enforces that when we are locking a non-temporary table we have an appropriate type of metadata lock on this table. @ sql/mysql_priv.h Added separate flag for open_tables() to be able specify that SH metadata locks on table to be open should be acquired. We can no longer use MYSQL_LOCK_IGNORE_FLUSH flag for this as in addition to use in I_S implementation it is also used for opening system tables. Since in the latter case we also acquire thr_lock.c locks using SH metadata lock in it instead of SR or SW locks may lead to deadlock. @ sql/sql_base.cc When opening tables don't interpret MYSQL_LOCK_IGNORE_FLUSH flag as request to acquire SH metadata locks. This flag is also used for opening system tables for which we also take thr_lock.c locks and thus proper metadata lock to take in this case is SR or SW lock (otherwise deadlocks can occur). In cases when SH lock is really required (e.g. when tables are open by I_S implementation) we rely on that newly introduced MYSQL_OPEN_FORCE_SHARED_HIGH_PRIO_MDL flag is used. @ sql/sql_delete.cc mysql_truncate_by_delete(): Adjust type of metadata lock to be requested after changing type of thr_lock.c lock for table list element from one which was set in parser to TL_WRITE. This removes discrepancy between types of these locks which allowed deadlocks to creep in. @ sql/sql_handler.cc When closing table which was open by HANDLER statement clear TABLE::open_by_handler flag. This allows to use this flag as a reliable indication that TABLE instance was open by HANDLER statement in assert which was added to mysql_lock_tables(). @ sql/sql_parse.cc multi_delete_set_locks_and_link_aux_tables(): Adjust type of metadata lock to be requested after changing type of thr_lock.c lock for table list element from one which was set in parser to TL_WRITE. This removes discrepancy between types of these locks which allowed deadlocks to creep in. @ sql/sql_show.cc Use newly introduced MYSQL_OPEN_FORCE_SHARED_HIGH_PRIO_MDL flag in order to acquire SH metadata locks when opening tables in I_S implementation. @ sql/sql_update.cc Added comment explaining why in multi-update after deciding that we need weaker thr_lock.c lock on a table we don't downgrade metadata lock on it. @ sql/sql_view.cc When merging view into main statement adjust type of metadata lock to be requested after changing type of thr_lock.c lock for table. This removes discrepancy between types of these locks which allowed deadlocks to creep in.
[9 Feb 2010 9:59]
Dmitry Lenev
Fix for this bug was pushed into mysql-next-4284 tree. Since this bug was not repeatable outside of this tree and it is not publicly available there is nothing to document. So I am simply closing this bug report.
[16 Feb 2010 16:46]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100216101445-2ofzkh48aq2e0e8o) (version source revid:jon.hauglid@sun.com-20100211140522-unpky24gmq8fkhhj) (merge vers: 6.0.14-alpha) (pib:16)
[16 Feb 2010 16:55]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100216101208-33qkfwdr0tep3pf2) (version source revid:dlenev@mysql.com-20100208201955-ma7wws6qhymcovox) (pib:16)
[6 Mar 2010 10:56]
Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@mysql.com-20100216221947-luyhph0txl2c5tc8) (merge vers: 5.5.99-m3) (pib:16)
[7 Mar 2010 1:04]
Paul DuBois
No changelog entry needed.

Description: When executing the regular DDL/MDL RQG workload (without transactions and with autocommit=on in order to prevent legitimate hangs due to open transactions), mysqld deadlocked with all threads in MDL and one thread as follows: 3 0x00000000009cfd51 in wait_for_lock (wait=0x276bea0, data=0x2768528, in_wait_list=0 '\0') at thr_lock.c:448 #4 0x00000000009d0932 in thr_lock (data=0x2768528, owner=0x277d8a8, lock_type=TL_WRITE) at thr_lock.c:785 #5 0x00000000009d1450 in thr_multi_lock (data=0x26a9288, count=1, owner=0x277d8a8) at thr_lock.c:1054 #6 0x0000000000628752 in mysql_lock_tables (thd=0x277cb08, tables=0x27794b0, count=1, flags=0, need_reopen=0x7fd4bd471feb) at lock.cc:320 #7 0x000000000069514e in lock_tables (thd=0x277cb08, tables=0x2778908, count=1, flags=0, need_reopen=0x7fd4bd471feb) at sql_base.cc:5435 #8 0x000000000069cc02 in open_and_lock_tables_derived (thd=0x277cb08, tables=0x2778908, derived=true, flags=0, prelocking_strategy=0x7fd4bd472030) at sql_base.cc:5117 #9 0x000000000064cc2e in open_and_lock_tables_derived (thd=0x277cb08, tables=0x2778908, derived=true, flags=0) at mysql_priv.h:1561 #10 0x000000000064cc69 in open_and_lock_tables (thd=0x277cb08, tables=0x2778908) at mysql_priv.h:1571 #11 0x0000000000645202 in mysql_execute_command (thd=0x277cb08) at sql_parse.cc:3101 #12 0x0000000000649bc2 in mysql_parse (thd=0x277cb08, inBuf=0x2778258 "DELETE A FROM testdb_N . t1_base1_N AS A WHERE `pk` BETWEEN 4 AND 4 + 1", length=74, found_semicolon=0x7fd4bd473ee0) at sql_parse.cc:5586 #13 0x000000000064a7db in dispatch_command (command=COM_QUERY, thd=0x277cb08, packet=0x277f779 "DELETE A FROM testdb_N . t1_base1_N AS A WHERE `pk` BETWEEN 4 AND 4 + 1", packet_length=74) at sql_parse.cc:1020 #14 0x000000000064bc3a in do_command (thd=0x277cb08) at sql_parse.cc:706 ---Type <return> to continue, or q <return> to quit--- #15 0x000000000063a6f0 in do_handle_one_connection (thd_arg=0x277cb08) at sql_connect.cc:1172 #16 0x000000000063a7af in handle_one_connection (arg=0x277cb08) at sql_connect.cc:1112 #17 0x000000315b0073da in start_thread () from /lib64/libpthread.so.0 #18 0x000000315a4e627d in clone () from /lib64/libc.so.6 How to repeat: Core and binary will be uploaded shortly.