Bug #62614 | Server deadlock between RESET MASTER and PURGE BINARY LOGS | ||
---|---|---|---|
Submitted: | 4 Oct 2011 8:46 | Modified: | 4 Jul 2014 6:41 |
Reporter: | Kristian Nielsen | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S1 (Critical) |
Version: | 5.5.17, 5.6, 5.7 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[4 Oct 2011 8:46]
Kristian Nielsen
[4 Oct 2011 9:48]
Kristian Nielsen
It seems this was introduced with this commit: revision-id: sven.sandberg@oracle.com-20110526105043-wykkmpktyye2kav6 parent: sergey.glukhov@oracle.com-20110526100639-o3tcac0chgy9m9ij committer: Sven Sandberg <sven.sandberg@oracle.com> branch nick: 5.1 timestamp: Thu 2011-05-26 12:50:43 +0200 message: BUG#12574820: binlog.binlog_tmp_table timing out in daily and weekly trunk run Problem: MYSQL_BIN_LOG::reset_logs acquires mutexes in wrong order. The correct order is first LOCK_thread_count and then LOCK_log. This function does it the other way around. This leads to deadlock when run in parallel with a thread that takes the two locks in correct order. For example, a thread that disconnects will take the locks in the correct order. Fix: change order of the locks in MYSQL_BIN_LOG::reset_logs: first LOCK_thread_count and then LOCK_log. But I think it is wrong that LOCK_thread_count should be locked before LOCK_index/LOCK_log. The LOCK_thread_count is a short-lived lock, holding it blocks _all_ threads from starting a new query. LOCK_log is a long log, it is held while doing I/O. I could not find where a disconnecting thread will lock LOCK_log while holding LOCK_thread_count, however if there is such a place I would suggest the correct fix is to switch the locking order in the disconnecting thread. In any case there are more places where LOCK_index is taken before LOCK_thread_count. For example MYSQL_BIN_LOG::purge_logs() which calls log_in_use(); the former holds LOCK_index and the latter takes LOCK_thread_count.
[12 Sep 2013 18:53]
MySQL Verification Team
Still happens on todays 5.7.3 code: Thread 1: reset master Thread 2: purge master logs before now() Thread 1 waits for thread 2 Thread 2 waits for thread 1 Thread 1: --------- Owns LOCK_thread_count Owns LOCK_log Waits for LOCK_index ntdll.dll!ZwWaitForSingleObject ntdll.dll!RtlpWaitOnCriticalSection ntdll.dll!RtlEnterCriticalSection mysqld-debug.exe!inline_mysql_mutex_lock mysqld-debug.exe!MYSQL_BIN_LOG::reset_logs mysqld-debug.exe!reset_master mysqld-debug.exe!reload_acl_and_cache mysqld-debug.exe!mysql_execute_command mysqld-debug.exe!mysql_parse mysqld-debug.exe!dispatch_command mysqld-debug.exe!do_command mysqld-debug.exe!handle_connection mysqld-debug.exe!pfs_spawn_thread mysqld-debug.exe!pthread_start mysqld-debug.exe!_callthreadstartex mysqld-debug.exe!_threadstartex Thread 2: --------- Owns LOCK_index Waits for LOCK_thread_count ntdll.dll!ZwWaitForSingleObject ntdll.dll!RtlpWaitOnCriticalSection ntdll.dll!RtlEnterCriticalSection mysqld-debug.exe!inline_mysql_mutex_lock mysqld-debug.exe!log_in_use mysqld-debug.exe!MYSQL_BIN_LOG::purge_logs_before_date mysqld-debug.exe!purge_master_logs_before_date mysqld-debug.exe!mysql_execute_command mysqld-debug.exe!mysql_parse mysqld-debug.exe!dispatch_command mysqld-debug.exe!do_command mysqld-debug.exe!handle_connection mysqld-debug.exe!pfs_spawn_thread mysqld-debug.exe!pthread_start mysqld-debug.exe!_callthreadstartex mysqld-debug.exe!_threadstartex
[12 Sep 2013 18:58]
MySQL Verification Team
cross referencing another deadlock problem: http://bugs.mysql.com/bug.php?id=69954
[4 Jul 2014 6:41]
Venkatesh Duggirala
This issue is fixed as part of bug#69954 (which is fixed in 5.5.39, 5.6.20, and 5.7.5 )