Description:
Bug 69954 fix is incomplete. 5.5 code review shows several places where LOCK_thread_count is locked, a global thread list is looped over, and some LOCK_thd_data is locked. Unless it is provable that LOCK_thd_data will not be taken for a zombie dump thread there, it's a deadlock.
How to repeat:
One such place is close_connections of mysqld.cc.
To see the deadlock (MTR is not of GA quality, but it will show deadlock all right):
=== modified file 'mysql-test/suite/rpl/t/rpl_4threads_deadlock.test'
--- mysql-test/suite/rpl/t/rpl_4threads_deadlock.test 2014-05-08 12:43:01 +0000
+++ mysql-test/suite/rpl/t/rpl_4threads_deadlock.test 2014-08-11 12:08:34 +0000
@@ -32,6 +32,8 @@
--source include/have_binlog_format_statement.inc
--source include/master-slave.inc
+--let $mysqld_pid_file=`SELECT @@GLOBAL.pid_file`
+
connect(master2,127.0.0.1,root,,test,$MASTER_MYPORT,);
connect(master3,127.0.0.1,root,,test,$MASTER_MYPORT,);
--connection master
@@ -43,7 +45,7 @@
# Iteration 1 with "SHOW PROCESSLIST command"
# Iteration 2 with "SELECT * FROM information_schema.processlist"
---let $iter=2
+--let $iter=3
while ($iter)
{
--connection slave
@@ -83,6 +85,16 @@
{
--let $query=SELECT * FROM information_schema.processlist
}
+ if ($iter == 3)
+ {
+ system kill -TERM `cat $mysqld_pid_file`;
+ --connection slave
+ select SLEEP(10000);
+# --source include/wait_until_disconnected.inc
+# --enable_reconnect
+# --exec echo "restart" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+# --source include/wait_until_connected_again.inc
+ }
--send_eval $query
--connection master
Now run the testcase, give it some time, attach to the master server and here's the Thread 4 to use the bug 69954 commit message thread terminology:
thread #22: tid = 0x526e6d, 0x00007fff997cf746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff997cf746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff9607f779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x000000010bb1a9c8 mysqld`safe_mutex_lock(mp=0x00007fbedd0060b8, try_lock=<unavailable>, file=0x000000010bf9a929, line=1128) + 168 at thr_mutex.c:152
frame #3: 0x000000010b907af6 mysqld`kill_server_thread [inlined] inline_mysql_mutex_lock(st_mysql_mutex*, char const*, unsigned int) + 1382 at mysql_thread.h:615
frame #4: 0x000000010b907a72 mysqld`kill_server_thread [inlined] close_connections() + 1125 at mysqld.cc:1128
frame #5: 0x000000010b90760d mysqld`kill_server_thread [inlined] kill_server(sig_ptr=<unavailable>) + 100 at mysqld.cc:1325
frame #6: 0x000000010b9075a9 mysqld`kill_server_thread(arg=<unavailable>) + 25 at mysqld.cc:1353
frame #7: 0x000000010bc77b2f mysqld`pfs_spawn_thread(arg=0x00007fbedda024d0) + 79 at pfs.cc:1015
frame #8: 0x00007fff9607c899 libsystem_pthread.dylib`_pthread_body + 138
frame #9: 0x00007fff9607c72a libsystem_pthread.dylib`_pthread_start + 137
Thread 1:
thread #21: tid = 0x526e65, 0x00007fff997cf746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff997cf746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff9607f779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x000000010bb1a9c8 mysqld`safe_mutex_lock(mp=0x000000010c14cb08, try_lock=<unavailable>, file=0x000000010bfaec52, line=1591) + 168 at thr_mutex.c:152
frame #3: 0x000000010b99c9b2 mysqld`THD::awake(THD::killed_state) [inlined] inline_mysql_mutex_lock(st_mysql_mutex*, char const*, unsigned int) + 93 at mysql_thread.h:615
frame #4: 0x000000010b99c955 mysqld`THD::awake(this=0x00007fbedd005200, state_to_set=<unavailable>) + 549 at sql_class.cc:1591
frame #5: 0x000000010ba01bdc mysqld`kill_zombie_dump_threads(slave_server_id=<unavailable>) + 524 at sql_repl.cc:1381
frame #6: 0x000000010b9cd8a6 mysqld`dispatch_command(command=COM_SHUTDOWN, thd=0x00007fbede02b200, packet=0x00007fbede02e201, packet_length=<unavailable>) + 3462 at sql_parse.cc:1227
frame #7: 0x000000010b9cef68 mysqld`do_command(thd=0x00007fbede02b200) + 376 at sql_parse.cc:773
frame #8: 0x000000010baa4006 mysqld`do_handle_one_connection(thd_arg=<unavailable>) + 390 at sql_connect.cc:862
frame #9: 0x000000010baa3e65 mysqld`handle_one_connection(arg=0x00007fbede02b200) + 69 at sql_connect.cc:781
frame #10: 0x000000010bc77b2f mysqld`pfs_spawn_thread(arg=0x00007fbedda024d0) + 79 at pfs.cc:1015
frame #11: 0x00007fff9607c899 libsystem_pthread.dylib`_pthread_body + 138
frame #12: 0x00007fff9607c72a libsystem_pthread.dylib`_pthread_start + 137
Thread 2:
thread #18: tid = 0x526e49, 0x00007fff997cf746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff997cf746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff9607f779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x000000010bb1a9c8 mysqld`safe_mutex_lock(mp=0x000000010c14d030, try_lock=<unavailable>, file=0x000000010bfa4dcd, line=512) + 168 at thr_mutex.c:152
frame #3: 0x000000010ba0399e mysqld`show_binlogs(thd=0x00007fbedd087000) + 1022 at mysql_thread.h:611
frame #4: 0x000000010b9d2a6b mysqld`mysql_execute_command(thd=0x00007fbedd087000) + 9675 at sql_parse.cc:2713
frame #5: 0x000000010b9cf3f8 mysqld`mysql_parse(thd=0x00007fbedd087000, rawbuf=<unavailable>, length=<unavailable>, parser_state=<unavailable>) + 440 at sql_parse.cc:5662
frame #6: 0x000000010b9cdb12 mysqld`dispatch_command(command=COM_SHUTDOWN, thd=0x00007fbedd087000, packet=<unavailable>, packet_length=<unavailable>) + 4082 at sql_parse.cc:1038
frame #7: 0x000000010b9cef68 mysqld`do_command(thd=0x00007fbedd087000) + 376 at sql_parse.cc:773
frame #8: 0x000000010baa4006 mysqld`do_handle_one_connection(thd_arg=<unavailable>) + 390 at sql_connect.cc:862
frame #9: 0x000000010baa3e65 mysqld`handle_one_connection(arg=0x00007fbedd087000) + 69 at sql_connect.cc:781
frame #10: 0x000000010bc77b2f mysqld`pfs_spawn_thread(arg=0x00007fbedddc10f0) + 79 at pfs.cc:1015
frame #11: 0x00007fff9607c899 libsystem_pthread.dylib`_pthread_body + 138
frame #12: 0x00007fff9607c72a libsystem_pthread.dylib`_pthread_start + 137
Thread 3:
thread #19: tid = 0x526e4c, 0x00007fff997cf746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff997cf746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff9607f779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x000000010bb1a9c8 mysqld`safe_mutex_lock(mp=0x000000010c150110, try_lock=<unavailable>, file=0x000000010bfb7b5f, line=252) + 168 at thr_mutex.c:152
frame #3: 0x000000010b9fe33d mysqld`log_in_use(char const*) [inlined] inline_mysql_mutex_lock(st_mysql_mutex*, char const*, unsigned int) + 119 at mysql_thread.h:615
frame #4: 0x000000010b9fe2c6 mysqld`log_in_use(log_name=0x0000000110cac890) + 54 at sql_repl.cc:252
frame #5: 0x000000010b8dc23e mysqld`MYSQL_BIN_LOG::purge_logs_before_date(this=0x000000010c14cb00, purge_time=0) + 462 at log.cc:4101
frame #6: 0x000000010b9fe6d8 mysqld`purge_master_logs_before_date(thd=0x00007fbedd08e200, purge_time=<unavailable>) + 40 at sql_repl.cc:332
frame #7: 0x000000010b9d4d2a mysqld`mysql_execute_command(thd=0x00007fbedd08e200) + 18570 at sql_parse.cc:2239
frame #8: 0x000000010b9cf3f8 mysqld`mysql_parse(thd=0x00007fbedd08e200, rawbuf=<unavailable>, length=<unavailable>, parser_state=<unavailable>) + 440 at sql_parse.cc:5662
frame #9: 0x000000010b9cdb12 mysqld`dispatch_command(command=COM_SHUTDOWN, thd=0x00007fbedd08e200, packet=<unavailable>, packet_length=<unavailable>) + 4082 at sql_parse.cc:1038
frame #10: 0x000000010b9cef68 mysqld`do_command(thd=0x00007fbedd08e200) + 376 at sql_parse.cc:773
frame #11: 0x000000010baa4006 mysqld`do_handle_one_connection(thd_arg=<unavailable>) + 390 at sql_connect.cc:862
frame #12: 0x000000010baa3e65 mysqld`handle_one_connection(arg=0x00007fbedd08e200) + 69 at sql_connect.cc:781
frame #13: 0x000000010bc77b2f mysqld`pfs_spawn_thread(arg=0x00007fbedddc10f0) + 79 at pfs.cc:1015
frame #14: 0x00007fff9607c899 libsystem_pthread.dylib`_pthread_body + 138
frame #15: 0x00007fff9607c72a libsystem_pthread.dylib`_pthread_start + 137
Suggested fix:
Use LOCK_thd_remove in the affected code too.