Bug #51134 Crash in MDL_lock::destroy on a concurrent DDL workload
Submitted: 12 Feb 2010 8:53
Modified: 7 Mar 2010 0:59
Reporter: Philip Stoev
Status: Closed
Category: MySQL Server: Locking
Severity: S3 (Non-critical)
Version: mysql-next-4284
OS: Any
Assigned to: Dmitry Lenev
CPU Architecture: Any

[12 Feb 2010 8:53] Philip Stoev
Description:
When executing the full RQG DDL workload including HANDLER, mysqld crashed as follows:

#2  0x0000000000642639 in handle_segfault (sig=11) at mysqld.cc:2727
#3  <signal handler called>
#4  0x0000000002bbf4c0 in ?? ()
#5  0x000000000086eb2b in MDL_lock::destroy (lock=0x2bbf4d0) at mdl.cc:690
#6  0x000000000086d439 in MDL_map::remove (this=0x1069f00, lock=0x2bbf4d0) at mdl.cc:531
#7  0x000000000086d490 in MDL_lock::remove_ticket (this=0x2bbf4d0, list=&MDL_lock::m_granted, ticket=0x2bb8140) at mdl.cc:1064
#8  0x000000000086d5cf in MDL_context::release_lock (this=0x7fb14c09a848, ticket=0x2bb8140) at mdl.cc:1924
#9  0x000000000086e580 in MDL_context::acquire_locks (this=0x7fb14c09a848, mdl_requests=0x7fb152858fc0, lock_wait_timeout=1) at mdl.cc:1572
#10 0x0000000000637501 in lock_table_names (thd=0x7fb14c09a778, table_list=0x2be2db0) at lock.cc:987
#11 0x00000000007c465d in mysql_rename_tables (thd=0x7fb14c09a778, table_list=0x2be2db0, silent=false) at sql_rename.cc:137
#12 0x0000000000656ac0 in mysql_execute_command (thd=0x7fb14c09a778) at sql_parse.cc:2684
#13 0x000000000065c862 in mysql_parse (thd=0x7fb14c09a778,
    inBuf=0x2be2bb8 "RENAME TABLE testdb_N . t1_merge1_N  TO testdb_N . t1_merge1_N  , testdb_N . t1_temp1_N  TO testdb_S . t1_temp1_N", length=113,
    found_semicolon=0x7fb15285aee0) at sql_parse.cc:5581
#14 0x000000000065d47b in dispatch_command (command=COM_QUERY, thd=0x7fb14c09a778,
    packet=0x7fb14c0c1b99 "RENAME TABLE testdb_N . t1_merge1_N  TO testdb_N . t1_merge1_N  , testdb_N . t1_temp1_N  TO testdb_S . t1_temp1_N ",
    packet_length=114) at sql_parse.cc:1023
#15 0x000000000065e91b in do_command (thd=0x7fb14c09a778) at sql_parse.cc:709
#16 0x000000000064cafb in do_handle_one_connection (thd_arg=0x7fb14c09a778) at sql_connect.cc:1174
#17 0x000000000064cbca in handle_one_connection (arg=0x7fb14c09a778) at sql_connect.cc:1113
#18 0x000000315b0073da in start_thread () from /lib64/libpthread.so.0
#19 0x000000315a4e627d in clone () from /lib64/libc.so.6

How to repeat:
If this is repeatable, a simplified test case will be provided. In the meantime, the core and the binary will be made available.
[12 Feb 2010 9:29] Philip Stoev
Core and binary:

http://mysql-systemqa.s3.amazonaws.com/var-bug51134.zip

Source:

revision-id: dlenev@mysql.com-20100212070543-fu7ppkftm0g3sen1
date: 2010-02-12 10:05:43 +0300
build-date: 2010-02-12 11:29:49 +0200
revno: 3095
branch-nick: mysql-next-4284
[12 Feb 2010 10:21] Philip Stoev
Repeatable without HANDLER -- again on RENAME TABLE
[12 Feb 2010 20:29] Dmitry Lenev
This bug is repeatable with the following "simple" test case:

create table t3 (i int primary key);
connect (blocker, localhost, root, , );
connect (dml, localhost, root, , );
connect (ddl, localhost, root, , );

--echo # Test for RENAME TABLE
--echo # Switching to connection 'blocker'
connection blocker;
lock table t3 read;
--echo # Switching to connection 'ddl'
connection ddl;
let $ID= `select connection_id()`;
--send rename tables t1 to t2, t2 to t3;
--echo # Switching to connection 'default'
connection default;
let $wait_condition=
  select count(*) = 1 from information_schema.processlist
  where state = "Waiting for table" and info = "rename tables t1 to t2, t2 to t3";
--source include/wait_condition.inc
--replace_result $ID ID
eval kill query $ID;
--echo # Switching to connection 'ddl'
connection ddl;
--error ER_QUERY_INTERRUPTED
--reap
[14 Feb 2010 21:11] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/100316

3096 Dmitry Lenev	2010-02-15
      Fix for bug #51134 "Crash in MDL_lock::destroy on a concurrent 
      DDL workload".
      
      When a RENAME TABLE or LOCK TABLE ... WRITE statement that
      mentioned the same table several times was aborted while
      acquiring metadata locks (because a deadlock was discovered
      or because of a KILL statement), the server might have
      crashed.
      
      When an attempt to acquire all requested locks failed, we
      went through the list of requests and released, one by one,
      the locks we had managed to acquire by that moment. Since
      in the scenario described above the list of requests contained
      duplicates, this led to releasing the same ticket twice and
      a crash as a result.
      
      This patch solves the problem by employing a different approach
      to releasing locks when acquiring all requested locks fails.
      Now we take an MDL savepoint before starting to acquire locks
      and simply roll back to it if things go bad.
     @ mysql-test/r/lock_multi.result
        Updated test results (see lock_multi.test).
     @ mysql-test/t/lock_multi.test
        Added test case for bug #51134 "Crash in MDL_lock::destroy
        on a concurrent DDL workload".
     @ sql/mdl.cc
        MDL_context::acquire_locks():
          When an attempt to acquire all requested locks has failed,
          do not go through the list of requests and release, one by
          one, the locks we have managed to acquire.
          Since the list of requests can contain duplicates, such an
          approach may lead to releasing the same ticket twice and a
          crash as a result.
          Instead, take an MDL savepoint before starting to acquire
          locks and simply roll back to it if things go bad.
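
The savepoint approach described in this commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server code: `Ticket`, `ToyContext`, `acquire_all` and the `fail_on` parameter are hypothetical names, a savepoint is modelled as the number of tickets held, and a duplicate request is assumed to reuse the ticket already granted for that table (which is what "releasing the same ticket twice" suggests).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a granted metadata lock; hypothetical, not the MDL class.
struct Ticket {
  std::string table;
};

// Toy stand-in for a connection's lock context; hypothetical.
class ToyContext {
 public:
  // A savepoint here is simply the number of tickets held at that moment.
  std::size_t savepoint() const { return tickets_.size(); }

  // Release every ticket taken after the savepoint, each exactly once.
  void rollback_to(std::size_t sp) {
    while (tickets_.size() > sp) tickets_.pop_back();
  }

  // Try to lock every requested name. `fail_on` simulates a KILL or a
  // deadlock hitting while we wait for that table. On failure we never walk
  // the request list again, so duplicates in it cannot cause a double release.
  bool acquire_all(const std::vector<std::string> &requests,
                   const std::string &fail_on) {
    const std::size_t sp = savepoint();
    for (const auto &name : requests) {
      if (name == fail_on) {
        rollback_to(sp);
        return false;
      }
      // Assumption: a duplicate request reuses the existing ticket.
      if (!holds(name)) tickets_.push_back(Ticket{name});
    }
    return true;
  }

  std::size_t held() const { return tickets_.size(); }

 private:
  bool holds(const std::string &name) const {
    for (const auto &t : tickets_)
      if (t.table == name) return true;
    return false;
  }

  std::vector<Ticket> tickets_;
};
```

With the request list t1, t2, t2, t3 and a failure on t3, a per-request release loop would visit t2 twice and so release its ticket twice. In the sketch, the rollback drops the two tickets taken by the failed call and leaves locks acquired earlier untouched.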
[15 Feb 2010 10:24] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/100336

[15 Feb 2010 13:59] Dmitry Lenev
The fix for this bug was pushed into the mysql-next-4284 tree. Since the problem is not repeatable outside this non-public tree, there is nothing to document, so I am simply closing this bug report.
[16 Feb 2010 16:47] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100216101445-2ofzkh48aq2e0e8o) (version source revid:alik@sun.com-20100215140849-b9fal65nwvrzczh4) (merge vers: 6.0.14-alpha) (pib:16)
[16 Feb 2010 16:56] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100216101208-33qkfwdr0tep3pf2) (version source revid:alik@sun.com-20100215120405-o1osx2k0nme27tx9) (pib:16)
[6 Mar 2010 11:02] Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@mysql.com-20100216221947-luyhph0txl2c5tc8) (merge vers: 5.5.99-m3) (pib:16)
[7 Mar 2010 0:59] Paul DuBois
No changelog entry needed.