Bug #57474 Deadlock when DDL under LOCK TABLES WRITE, READ + PREPARE.
Submitted: 15 Oct 2010 7:46 Modified: 23 Sep 2014 22:20
Reporter: Dmitry Lenev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Locking Severity:S3 (Non-critical)
Version:5.5.7-bzr,5.6.2m5 OS:Any
Assigned to: CPU Architecture:Any
Triage: Triaged: D1 (Critical)

[15 Oct 2010 7:46] Dmitry Lenev
Description:
A deadlock can occur under the following (fairly rare) conditions:
- A DDL statement like ALTER or DROP table is executed under LOCK TABLES
- This LOCK TABLES statement in addition to table to be changed by DDL also has some tables read-locked or locked implicitly e.g. through triggers.
- There is some concurrent DML which is blocked on table-level lock held by LOCK TABLES (i.e. because of the read-lock or implicit lock). This DML also uses some other tables.
- There is a concurrent PREPARE (or corresponding C/... API call) which prepares statement using tables to be affected by the first DDL and some other table which is affected by another DDL statement which has to wait for the DML statement.

Some parts of this deadlock involve waiting on thr_lock.c locks so they are not visible to deadlock detector of MDL subsystem and it cannot be properly detected by it.

How to repeat:
# Here is the script for mysqltest tool which allows to reproduce the problem.

connect (con1,localhost,root,,test,,);
connect (con2,localhost,root,,test,,);
connect (con3,localhost,root,,test,,);
connection default;

create table t1(i int);
create table t2(i int);
create table t3(i int);
create table t4(i int);

lock tables t1 write, t3 read;

connection con1;
--send insert into t3 values ((select count(*) from t4));

connection con2;
let $wait_condition=
  select count(*) = 1 from information_schema.processlist
  where state = "Waiting for table level lock" and
        info = "insert into t3 values ((select count(*) from t4))";
--source include/wait_condition.inc

--send rename table t2 to t0, t4 to t2, t0 to t4;

connection con3;
let $wait_condition=
  select count(*) = 1 from information_schema.processlist
  where state = "Waiting for table metadata lock" and
        info = "rename table t2 to t0, t4 to t2, t0 to t4";
--source include/wait_condition.inc

--send prepare stmt1 from 'select * from t1, t2';

connection default;
let $wait_condition=
  select count(*) = 1 from information_schema.processlist
  where state = "Waiting for table metadata lock" and
        info = "select * from t1, t2";
--source include/wait_condition.inc

alter table t1 add column j int;
[15 Oct 2010 8:32] Valeriy Kravchuk
Verified with current mysql-5.5 from bzr:

openxs@ubuntu:~/dbs/5.5/mysql-test$ ./mtr bug57474
Logging: ./mtr  bug57474
101015 11:03:17 [Note] Plugin 'FEDERATED' is disabled.
MySQL Version 5.5.7
Checking supported features...
 - skipping ndbcluster, mysqld not compiled with ndbcluster
 - SSL connections supported
 - binaries are debug compiled
Collecting tests...
vardir: /home/openxs/dbs/5.5/mysql-test/var
Removing old var directory...
Creating var directory '/home/openxs/dbs/5.5/mysql-test/var'...
Installing system database...
Using server port 52690

==============================================================================

TEST                                      RESULT   TIME (ms)
------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
worker[1] Trying to dump core for [mysqltest - pid: 2494, winpid: 2494]
worker[1] Trying to dump core for [mysqld.1 - pid: 2473, winpid: 2473]
main.bug57474                            [ fail ]  timeout after 900 seconds
        Test ended at 2010-10-15 11:19:25

Test case timeout after 900 seconds

== /home/openxs/dbs/5.5/mysql-test/var/log/bug57474.log == 
create table t1(i int);
create table t2(i int);
create table t3(i int);
create table t4(i int);
lock tables t1 write, t3 read;
insert into t3 values ((select count(*) from t4));;
rename table t2 to t0, t4 to t2, t0 to t4;;
prepare stmt1 from 'select * from t1, t2';;

 == /home/openxs/dbs/5.5/mysql-test/var/tmp/analyze-timeout-mysqld.1.err ==
mysqltest: Could not open connection 'default' after 500 attempts: 2002 Can't connect to local MySQL server through socket '/home/openxs/dbs/5.5/mysql-test/var/tmp/mysqld.1.sock' (111)

 - saving '/home/openxs/dbs/5.5/mysql-test/var/log/main.bug57474/' to '/home/openxs/dbs/5.5/mysql-test/var/log/main.bug57474/'
 - found 'core' (0/5)

Trying 'dbx' to get a backtrace

Trying 'gdb' to get a backtrace
Core generated by '/home/openxs/dbs/5.5/libexec/mysqld'
Output from gdb follows. The first stack trace is from the failing thread.
The following stack traces are from all threads (so the failing one is
duplicated).
--------------------------
[New Thread 2474]
[New Thread 2477]
[New Thread 2478]
[New Thread 2479]
[New Thread 2480]
[New Thread 2481]
[New Thread 2482]
[New Thread 2484]
[New Thread 2485]
[New Thread 2486]
[New Thread 2487]
[New Thread 2488]
[New Thread 2498]
[New Thread 2499]
[New Thread 2500]
[New Thread 2501]

warning: Can't read pathname for load map: Input/output error.
Core was generated by `/home/openxs/dbs/5.5/libexec/mysqld --defaults-group-suffix=.1 --defaults-file='.
Program terminated with signal 6, Aborted.
#0  0x00e42422 in __kernel_vsyscall ()
#0  0x00e42422 in __kernel_vsyscall ()
#1  0x002f7093 in __pthread_kill (threadid=3079231168, signo=6)
    at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:64
#2  0x0863a8b9 in my_write_core (sig=6) at stacktrace.c:328
#3  0x0827c074 in handle_segfault (sig=6) at mysqld.cc:2517
#4  <signal handler called>
#5  0x00e42422 in __kernel_vsyscall ()
#6  0x003c4b86 in *__GI___poll (fds=0x45aff4, nfds=2, timeout=-1)
    at ../sysdeps/unix/sysv/linux/poll.c:87
#7  0x0827ff0f in handle_connections_sockets () at mysqld.cc:5111
#8  0x0827f405 in mysqld_main (argc=50, argv=0x9dbed98) at mysqld.cc:4581
#9  0x0827874f in main (argc=8, argv=0xbf9119d4) at main.cc:24

Thread 16 (Thread 2501):
#0  0x00e42422 in __kernel_vsyscall ()
#1  0x002f6342 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_timedwait.S:179
#2  0x08646e32 in safe_cond_timedwait (cond=0xa36b7bc, mp=0xa36b778, 
    abstime=0xb248736c, file=0x87cf200 "../include/mysql/psi/mysql_thread.h", 
    line=1097) at thr_mutex.c:278
#3  0x084a05dc in inline_mysql_cond_timedwait (that=0xa36b7bc, 
    mutex=0xa36b778, abstime=0xb248736c, src_file=0x87cf367 "mdl.cc", 
    src_line=995) at ../include/mysql/psi/mysql_thread.h:1097
#4  0x084a12f0 in MDL_wait::timed_wait (this=0xa36b778, thd=0xa36b710, 
    abs_timeout=0xb248736c, set_status_on_timeout=true, 
    wait_state_name=0x87cf2f0 "Waiting for table metadata lock") at mdl.cc:995
#5  0x084a2257 in MDL_context::acquire_lock (this=0xa36b778, 
    mdl_request=0xb24873c4, lock_wait_timeout=31536000) at mdl.cc:1796
#6  0x082e06e7 in open_table_get_mdl_lock (thd=0xa36b710, ot_ctx=0xb2487a78, 
    mdl_request=0xb24873c4, flags=512, mdl_ticket=0xb24875f0)
    at sql_base.cc:2505
#7  0x082e10ab in open_table (thd=0xa36b710, table_list=0xa3a41c8, 
    mem_root=0xb2487a58, ot_ctx=0xb2487a78) at sql_base.cc:2811
#8  0x082e3a04 in open_and_process_table (thd=0xa36b710, lex=0xa3a3230, 
    tables=0xa3a41c8, counter=0xb2487b08, flags=512, 
    prelocking_strategy=0xb2487b0c, has_prelocking_list=false, 
    ot_ctx=0xb2487a78, new_frm_mem=0xb2487a58) at sql_base.cc:4347
#9  0x082e44fb in open_tables (thd=0xa36b710, start=0xb2487b24, 
    counter=0xb2487b08, flags=512, prelocking_strategy=0xb2487b0c)
    at sql_base.cc:4776
#10 0x082e530c in open_normal_and_derived_tables (thd=0xa36b710, 
    tables=0xa3a3e00, flags=512) at sql_base.cc:5421
#11 0x08338600 in mysql_test_select (stmt=0xa3602a8, tables=0xa3a3e00)
    at sql_prepare.cc:1474
#12 0x0833945a in check_prepared_statement (stmt=0xa3602a8)
    at sql_prepare.cc:2004
#13 0x0833bd10 in Prepared_statement::prepare (this=0xa3602a8, 
    packet=0xa3978d8 "select * from t1, t2", packet_len=20)
    at sql_prepare.cc:3252
#14 0x08339d92 in mysql_sql_stmt_prepare (thd=0xa36b710) at sql_prepare.cc:2367
#15 0x0828d684 in mysql_execute_command (thd=0xa36b710) at sql_parse.cc:2180
#16 0x08296ea5 in mysql_parse (thd=0xa36b710, 
    rawbuf=0xa397818 "prepare stmt1 from 'select * from t1, t2'", length=41, 
    parser_state=0xb2488b08) at sql_parse.cc:5594
#17 0x0828b10f in dispatch_command (command=COM_QUERY, thd=0xa36b710, 
    packet=0xa38e7d1 "prepare stmt1 from 'select * from t1, t2';", 
    packet_length=42) at sql_parse.cc:1139
#18 0x0828a547 in do_command (thd=0xa36b710) at sql_parse.cc:811
#19 0x082887b1 in do_handle_one_connection (thd_arg=0xa36b710)
    at sql_connect.cc:1192
#20 0x08288611 in handle_one_connection (arg=0xa36b710) at sql_connect.cc:1131
#21 0x086122ac in pfs_spawn_thread (arg=0xa384ff0) at pfs.cc:1015
#22 0x002f196e in start_thread (arg=0xb2489b70) at pthread_create.c:300
#23 0x003d2a4e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

...

------------------------------------------------------------
The servers were restarted 0 times
Spent 0.000 of 979 seconds executing testcases
[28 Jan 2011 21:07] Maxim Volkov
I have the exact same issue on 5.5.8.
[14 Mar 2011 22:28] Roel Van de Paar
Hi Dmitri,

This is the closest bug I could find to describe what I am/have been seeing a number of times now: I run a number of threads (let's say 30) with DML(/DLL) going on. Soon, many threads end up in 'Waiting for table metadata lock' state. I grabbed a SHOW PROCESSLIST/SHOW ENGINE INNODB STATUS/SHOW FULL PROCESSLIST/GLOBAL VARS/GLOBAL STATUS output to show what is happening. This seems a rather serious locking failure. Have a look at the output and let me know what you think. The issue is fairly easy to reproduce.
[14 Mar 2011 22:29] Roel Van de Paar
SHOW PROCESSLIST/SHOW ENGINE INNODB STATUS/SHOW FULL PROCESSLIST/GLOBAL VARS/GLOBAL STATUS

Attachment: out.txt (text/plain), 199.28 KiB.

[14 Mar 2011 22:34] Roel Van de Paar
(attached) gdb capture of all threads

Attachment: gdb.zip (application/x-zip-compressed, text), 100.88 KiB.

[14 Mar 2011 23:38] Roel Van de Paar
In the example, the only threads *not* in 'Waiting for table metadata lock' are:

Thread 3  (executing)    is using tables t2 t8 t9 t17 t27 t22 
Thread 6  (Sending data) is using tables t1 t5 t7 t16 t18 t25 t29
Thread 29 (Sending data) is using tables t1 t7 t3 t10 t18 t21 t23

Yet, for instance thread 13 is 'Waiting for table metadata lock' while the table it is accessing (t6) is not in use by any of the active threads.

     Id: 13
   Time: 3320
  State: Waiting for table metadata lock
   Info: insert ignore into `test`.`t6` set `col52` = -1.3764409700144E+18, `col53` = '2011-04-17 04:48:22', `a6` = 0, `col55` = 14038542, `col54` = '2011-10-25 15:35:45'

(There are however other threads attempting to access t6, but those are also in the 'Waiting for table metadata lock' state.)

So, at least for t6 it seems like there is some incorrect (and undetected) deadlock where all relevant threads are in the 'Waiting for table metadata lock' state.
[14 Mar 2011 23:48] Roel Van de Paar
(There are no foreign keys in the table definitions)
[14 Mar 2011 23:51] Roel Van de Paar
Could this view (the only one which is referecing both t6 and t5) be the issue?

CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `v_4` AS select `t5`.`col43` AS `col43` from (`t6` FORCE INDEX (PRIMARY) FORCE INDEX (`idx25`) FORCE INDEX (`idx28`) left join (`t4` join `t5`) on(isnu
ll(`t4`.`col31`))) limit 1;
[15 Mar 2011 0:08] Roel Van de Paar
(All tables are InnoDB)
[15 Mar 2011 2:18] Roel Van de Paar
Interestingly, killing some of the active processes, I had a situation where most of those threads all went to 'opening tables', just before most of them reverted to 'Waiting for table metadata lock' again. So, it seems that the lock situation happens *in* or *directly after* the 'opening tables' code.
[15 Mar 2011 6:33] Dmitry Lenev
Hello Roel!

After looking at SHOW PROCESSLIST output and stacktrace that you have provided I think what you observe is a separate bug and not bug #57474. In cases when one encounters bug #57474 there is always at least one connection waiting with status "Waiting for table level lock" and in thr_lock()/wait_for_lock() functions. Since there is no such connection in your case it must be a different issue.

Please report a separate bug about your problem!
[18 Oct 2011 15:53] Fran ois
Hi, I am experiencing a blocking problem after migrating from mysql 5.1 to mysql 5.5 and I think it might be related.

I am randomly blocked running mysql command from commandline generated by mysqldump.
That file contains following lines:
    LOCK TABLES `myTable` WRITE;
    INSERT INTO `myTable` VALUES ( ...);
    UNLOCK TABLES;
The command is sometimes blocked before locking the table and sometimes before inserting the data row (and sometimes not blocked at all).
Is it related? Is there a workaround?
[7 Mar 2012 14:04] Sveta Smirnova
See bug #64376 also: seems to be duplicate.
[23 Sep 2014 22:20] Paul Dubois
Changes were made for test suite. No changelog entry needed.
[26 Sep 2014 13:08] Paul Dubois
Correction; changelog entry is:

Execution of DDL statements under LOCK TABLES might have led to
deadlock if 1) this LOCK TABLES statement, in addition to the table 
to be changed by DDL, also had some tables read-locked or locked 
implicitly (for example, through triggers) and 2) there was some
concurrent DML which was blocked on a table-level lock held by LOCK
TABLES and 3) there was a concurrent PREPARE (or corresponding
connector API call) which prepared a statement using tables to be
affected by the first DDL and some other table which was affected by
another DDL statement which had to wait for the DML statement.
[26 Sep 2014 13:08] Paul Dubois
Fixed in 5.7.5.