Bug #48940 MDL deadlocks against mysql_rm_db
Submitted: 20 Nov 2009 11:51
Modified: 4 Sep 2010 15:29
Reporter: Philip Stoev
Status: Closed
Category: MySQL Server: Locking
Severity: S3 (Non-critical)
Version: 6.0-codebase-bugfixing
OS: Any
Assigned to: Jon Olav Hauglid
CPU Architecture: Any
Tags: mdl

[20 Nov 2009 11:51] Philip Stoev
Description:
When executing an RQG run of Matthias' MDL grammar, mysqld deadlocked with all threads in various MDL functions and one thread in mysql_rm_db():

#3  0x00000000011b024e in safe_mutex_lock (mp=0x1ab6b60, my_flags=0, file=0x14eb1ad "sql_db.cc", line=894) at thr_mutex.c:225
#4  0x0000000000b201f6 in mysql_rm_db (thd=0x4a470c8, db=0x4a41208 "testdb_N", if_exists=true, silent=false) at sql_db.cc:894
#5  0x000000000081778a in mysql_execute_command (thd=0x4a470c8) at sql_parse.cc:3761
#6  0x000000000082105c in mysql_parse (thd=0x4a470c8, inBuf=0x4a41150 "DROP DATABASE IF EXISTS testdb_N", length=32, found_semicolon=0x7f43b9a8bef0)
    at sql_parse.cc:5979
#7  0x0000000000822b32 in dispatch_command (command=COM_QUERY, thd=0x4a470c8, packet=0x4a51c39 "DROP DATABASE IF EXISTS testdb_N", packet_length=32)
    at sql_parse.cc:1076
#8  0x0000000000825b15 in do_command (thd=0x4a470c8) at sql_parse.cc:758
#9  0x00000000007fab46 in handle_one_connection (arg=0x4a470c8) at sql_connect.cc:1164
#10 0x000000315b0073da in start_thread () from /lib64/libpthread.so.0
#11 0x000000315a4e627d in clone () from /lib64/libc.so.6

Another thread that tried to drop the same database at the same time hung as follows:

#1  0x00000000011b1893 in safe_cond_timedwait (cond=0x1b98980, mp=0x1b988c0, abstime=0x7f43b9c0f4f0, file=0x151d4df "mdl.cc", line=942) at thr_mutex.c:477
#2  0x0000000000ce1b36 in MDL_context::acquire_exclusive_locks (this=0x4941cc0, mdl_requests=0x7f43b9c0f5b0) at mdl.cc:942
#3  0x00000000007ca337 in lock_table_names (thd=0x4941be8, table_list=0x4960ba0) at lock.cc:976
#4  0x0000000000b4d008 in mysql_rm_table_part2 (thd=0x4941be8, tables=0x4960ba0, if_exists=true, drop_temporary=false, drop_view=true, dont_log_query=true)
    at sql_table.cc:1901
#5  0x0000000000b1f97b in mysql_rm_known_files (thd=0x4941be8, dirp=0x4cc22d8, db=0x4960b90 "testdb_N", org_path=0x7f43b9c0fef0 "./testdb_N/", level=0,
    dropped_tables=0x7f43b9c10238) at sql_db.cc:1177
#6  0x0000000000b204f8 in mysql_rm_db (thd=0x4941be8, db=0x4960b90 "testdb_N", if_exists=true, silent=false) at sql_db.cc:938
#7  0x000000000081778a in mysql_execute_command (thd=0x4941be8) at sql_parse.cc:3761
#8  0x000000000082105c in mysql_parse (thd=0x4941be8, inBuf=0x4960ae0 "DROP SCHEMA IF EXISTS testdb_N", length=30, found_semicolon=0x7f43b9c11ef0)
    at sql_parse.cc:5979
#9  0x0000000000822b32 in dispatch_command (command=COM_QUERY, thd=0x4941be8, packet=0x49519e9 "DROP SCHEMA IF EXISTS testdb_N", packet_length=30)
    at sql_parse.cc:1076
#10 0x0000000000825b15 in do_command (thd=0x4941be8) at sql_parse.cc:758
#11 0x00000000007fab46 in handle_one_connection (arg=0x4941be8) at sql_connect.cc:1164
#12 0x000000315b0073da in start_thread () from /lib64/libpthread.so.0
#13 0x000000315a4e627d in clone () from /lib64/libc.so.6

How to repeat:
The following RQG command line may reproduce it:

perl runall.pl --basedir=/build/bzr/6.0-codebase-bugfixing --grammar=conf/WL5004_sql.yy --threads=30 --queries=1M --duration=1800 --reporter=Shutdown --gendata=conf/WL5004_data.zz --mysqld=--innodb

However, this bug is set to Verified with the expectation that the backtraces alone will be sufficient to figure out what is going on. If that is not the case, please set the bug back to "Need feedback".
[20 Nov 2009 11:52] Philip Stoev
bug48940.threads.txt

Attachment: bug48940.threads.txt (text/plain), 75.61 KiB.

[25 Nov 2009 11:13] Philip Stoev
Here is a core and a binary that show this deadlock, even though it is with ALTER and not with DROP.

http://mysql-systemqa.s3.amazonaws.com/var-bug48940.zip

code is from:

revision-id: kostja@sun.com-20091120224100-27np3zhj7o0d3w0b
date: 2009-11-21 01:41:00 +0300
build-date: 2009-11-25 12:58:38 +0200
revno: 3721
branch-nick: 6.0-codebase-bugfixing
[30 Nov 2009 15:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/92108

3740 Jon Olav Hauglid	2009-11-30
      Bug #48940 MDL deadlocks against mysql_rm_db
      
      This deadlock would occur between two connections A and B if statements
      were executed in the following way:
      1) Connection A executes a DML statement against table s1.t1 with
      autocommit off. This causes a shared metadata lock on s1.t1 to be 
      acquired.
      2) Connection B tries to DROP DATABASE s1. This will block against the
      metadata lock connection A holds on s1.t1. While blocking, connection B
      will hold the LOCK_mysql_create_db mutex.
      3) Connection A tries to ALTER DATABASE s1. This will block when trying
      to get LOCK_mysql_create_db mutex held by connection B.
      4) Deadlock!
      
      This patch fixes the problem by changing ALTER DATABASE to cause an
      implicit commit before executing. This will cause the metadata 
      lock on s1.t1 to be dropped, allowing DROP DATABASE to proceed. 
      This will in turn cause the LOCK_mysql_create_db mutex to be unlocked, 
      allowing ALTER DATABASE to proceed.
      
      Note that SQL commands other than ALTER DATABASE that also use
      LOCK_mysql_create_db already cause an implicit commit.
      
      Test case added to schema.test.
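
The four steps in the commit message above amount to a cycle in a wait-for graph: A holds the metadata lock and waits for the mutex, while B holds the mutex and waits for the metadata lock. The following is a minimal sketch of that reasoning in Python, not server code. The function `find_cycle`, the dictionaries, and the labels "A", "B" and "MDL(s1.t1)" are illustrative assumptions; only the lock and statement names come from the commit message.

```python
# Toy wait-for-graph model of the deadlock described in the commit message.
# All names here are labels for illustration; none of this is MySQL code.

def find_cycle(holders, waits):
    """Return a list of connections forming a wait cycle, or None.

    holders: resource -> connection currently holding it
    waits:   connection -> resource it is blocked on
    """
    for start in waits:
        path = [start]
        current = start
        while current in waits:
            owner = holders.get(waits[current])
            if owner is None:
                break  # resource is free, so this wait will end
            if owner == start:
                return path  # followed the waits back to the start
            if owner in path:
                break  # a cycle that does not include start
            path.append(owner)
            current = owner
    return None


# Before the fix: steps 1-3 of the commit message.
holders_before = {
    "MDL(s1.t1)": "A",            # step 1: DML with autocommit off
    "LOCK_mysql_create_db": "B",  # step 2: DROP DATABASE holds the mutex
}
waits_before = {
    "B": "MDL(s1.t1)",            # step 2: DROP DATABASE blocks on the MDL
    "A": "LOCK_mysql_create_db",  # step 3: ALTER DATABASE blocks on the mutex
}

# After the fix: ALTER DATABASE commits implicitly first, so A no longer
# holds the metadata lock when it asks for the mutex, and B is not blocked.
holders_after = {"LOCK_mysql_create_db": "B"}
waits_after = {"A": "LOCK_mysql_create_db"}

print(find_cycle(holders_before, waits_before))
print(find_cycle(holders_after, waits_after))
```

In the first scenario both connections appear in the reported cycle; in the second, A merely waits for B to finish, so no cycle is found.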
[15 Dec 2009 13:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/94139

3033 Jon Olav Hauglid	2009-12-15
      Bug #48940 MDL deadlocks against mysql_rm_db
      
      This deadlock would occur between two connections A and B if statements
      were executed in the following way:
      1) Connection A executes a DML statement against table s1.t1 with
      autocommit off. This causes a shared metadata lock on s1.t1 to be
      acquired. (With autocommit on, the metadata lock will be dropped once
      the statement completes and the deadlock will not occur.)
      2) Connection B tries to DROP DATABASE s1. This will block against the
      metadata lock connection A holds on s1.t1. While blocking, connection B
      will hold the LOCK_mysql_create_db mutex.
      3) Connection A tries to ALTER DATABASE s1. This will block when trying
      to get LOCK_mysql_create_db mutex held by connection B.
      4) Deadlock between DROP DATABASE and ALTER DATABASE (which has autocommit
      off).
      
      If Connection A used an explicitly started transaction rather than having
      autocommit off, this deadlock did not happen as ALTER DATABASE is 
      disallowed inside transactions.
      
      This patch fixes the problem by changing ALTER DATABASE to cause an
      implicit commit before executing. This will cause the metadata 
      lock on s1.t1 to be dropped, allowing DROP DATABASE to proceed. 
      This will in turn cause the LOCK_mysql_create_db mutex to be unlocked, 
      allowing ALTER DATABASE to proceed.
      
      Note that SQL commands other than ALTER DATABASE that also use
      LOCK_mysql_create_db already cause an implicit commit.
      
      Incompatible change: ALTER DATABASE (and its synonym ALTER SCHEMA)
      now causes an implicit commit. This must be reflected in the
      documentation.
      
      Test case added to schema.test.
     @ sql/sql_parse.cc
        Added CF_AUTO_COMMIT_TRANS to SQLCOM_ALTER_DB.
        
        Removed thd->active_transaction() checks from SQLCOM_DROP_DB, 
        SQLCOM_ALTER_DB_UPGRADE and SQLCOM_ALTER_DB as these statements
        cause an implicit commit.
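
The sql_parse.cc change above can be pictured as a per-statement flag that triggers a commit before the statement runs. The sketch below is a toy model under stated assumptions, not the server's implementation: the names `CF_AUTO_COMMIT_TRANS`, `SQLCOM_ALTER_DB` and `SQLCOM_DROP_DB` are taken from the commit message, while the `Connection` class, the `execute` function, the bit value, the `SQLCOM_INSERT` entry, and the assumption that DROP DATABASE gets its implicit commit through the same flag are hypothetical.

```python
# Toy model of a per-statement "implicit commit" flag. The classes and
# functions are hypothetical and do not reproduce sql_parse.cc.

CF_AUTO_COMMIT_TRANS = 1 << 0  # illustrative bit value, not the real one

# Command -> flags table as it might look after the patch (assumption).
command_flags = {
    "SQLCOM_ALTER_DB": CF_AUTO_COMMIT_TRANS,  # added by this patch
    "SQLCOM_DROP_DB": CF_AUTO_COMMIT_TRANS,   # assumed to use the same flag
    "SQLCOM_INSERT": 0,                       # plain DML, no implicit commit
}


class Connection:
    """Tracks the metadata locks held by one connection's transaction."""

    def __init__(self):
        self.metadata_locks = set()

    def commit(self):
        # Ending the transaction releases its metadata locks.
        self.metadata_locks.clear()


def execute(conn, command, table=None):
    """Run a command, committing first if its flags ask for it."""
    if command_flags.get(command, 0) & CF_AUTO_COMMIT_TRANS:
        conn.commit()
    if table is not None:
        # The statement takes a metadata lock on the table it touches.
        conn.metadata_locks.add(table)


conn = Connection()
execute(conn, "SQLCOM_INSERT", table="s1.t1")  # autocommit off: lock is kept
held_after_dml = set(conn.metadata_locks)
execute(conn, "SQLCOM_ALTER_DB")               # implicit commit drops the lock
held_after_alter = set(conn.metadata_locks)
```

In this model the lock on s1.t1 is still held after the DML statement and is gone once ALTER DATABASE starts, which is the property the patch relies on to let DROP DATABASE proceed.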
[15 Dec 2009 13:49] Jon Olav Hauglid
Pushed to mysql-next-4284 (5.6.0-beta) and merged to mysql-6.0-codebase-4284 (6.0.14-alpha).
[16 Feb 2010 16:48] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100216101445-2ofzkh48aq2e0e8o) (version source revid:jon.hauglid@sun.com-20100112185035-caoyjze52yh8dp7n) (merge vers: 6.0.14-alpha) (pib:16)
[16 Feb 2010 16:58] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100216101208-33qkfwdr0tep3pf2) (version source revid:jon.hauglid@sun.com-20100112151521-lq2d9pb893hz7oru) (pib:16)
[2 Mar 2010 1:01] Paul DuBois
Not present in any released version. No changelog entry needed.

Setting report to Need Merge pending push of Celosia into release tree.
[6 Mar 2010 10:53] Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@mysql.com-20100216221947-luyhph0txl2c5tc8) (merge vers: 5.5.99-m3) (pib:16)
[7 Mar 2010 1:35] Paul DuBois
No changelog entry needed.
[25 Aug 2010 9:23] Bugs System
Pushed into mysql-5.5 5.5.6-m3 (revid:alik@ibmvm-20100825092002-2yvkb3iwu43ycpnm) (version source revid:alik@ibmvm-20100825092002-2yvkb3iwu43ycpnm) (merge vers: 5.5.6-m3) (pib:20)
[30 Aug 2010 8:31] Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alik@sun.com-20100830082732-n2eyijnv86exc5ci) (version source revid:alik@sun.com-20100830082732-n2eyijnv86exc5ci) (merge vers: 5.6.1-m4) (pib:21)
[30 Aug 2010 8:35] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100830082745-n6sh01wlwh3itasv) (version source revid:alik@sun.com-20100830082745-n6sh01wlwh3itasv) (pib:21)