Bug #56676 'show slave status' ,'show global status' hang when 'stop slave' takes too long
Submitted: 9 Sep 2010 8:52 Modified: 10 Jan 2013 11:12
Reporter: Shane Bester (Platinum Quality Contributor)
Status: Closed
Category: MySQL Server: Replication Severity: S3 (Non-critical)
Version: 5.1.50 OS: Any
Assigned to: Assigned Account CPU Architecture:Any

[9 Sep 2010 8:52] Shane Bester
Description:
'show slave status' and 'show global status' should never hang.  Here we have a situation where they hang because 'stop slave' has locked the mutex LOCK_active_mi and gone to sleep...

| Command | Time | State                              | Info                               
+---------+------+------------------------------------+----
| Query   |   11 | Killing slave                      | stop slave                         
| Connect |  276 | Reconnecting after a failed master | NULL                               
| Connect |  275 | Has read all relay log; waiting fo | NULL                               
| Query   |   11 | NULL                               | show slave status                  
| Query   |   11 | NULL                               | show slave status                  
| Query   |   11 | NULL                               | show slave status

STOP SLAVE:                                         
ntdll.dll!_ZwWaitForMultipleObjects      
kernel32.dll!_WaitForMultipleObjectsEx	 
kernel32.dll!_WaitForMultipleObjects     
mysqld-debug.exe!pthread_cond_timedwait  
mysqld-debug.exe!terminate_slave_thread  
mysqld-debug.exe!terminate_slave_threads 
mysqld-debug.exe!stop_slave              
mysqld-debug.exe!mysql_execute_command   
mysqld-debug.exe!mysql_parse             
mysqld-debug.exe!dispatch_command        
mysqld-debug.exe!do_command              
mysqld-debug.exe!handle_one_connection   
mysqld-debug.exe!pthread_start           
mysqld-debug.exe!_callthreadstart        
mysqld-debug.exe!_threadstart            
kernel32.dll!_BaseThreadStart            

SHOW SLAVE STATUS:
ntdll.dll!_ZwWaitForSingleObject        
ntdll.dll!_RtlpWaitOnCriticalSection    
ntdll.dll!_RtlEnterCriticalSection      
mysqld-debug.exe!mysql_execute_command  
mysqld-debug.exe!mysql_parse            
mysqld-debug.exe!dispatch_command       
mysqld-debug.exe!do_command             
mysqld-debug.exe!handle_one_connection  
mysqld-debug.exe!pthread_start          
mysqld-debug.exe!_callthreadstart       
mysqld-debug.exe!_threadstart           
kernel32.dll!_BaseThreadStart

SHOW GLOBAL STATUS:                         
ntdll.dll!_ZwWaitForSingleObject            
ntdll.dll!_RtlpWaitOnCriticalSection        
ntdll.dll!_RtlEnterCriticalSection          
mysqld-debug.exe!show_slave_retried_trans   
mysqld-debug.exe!show_status_array          
mysqld-debug.exe!fill_status                
mysqld-debug.exe!get_schema_tables_result   
mysqld-debug.exe!JOIN::exec                 
mysqld-debug.exe!mysql_select               
mysqld-debug.exe!handle_select              
mysqld-debug.exe!execute_sqlcom_select      
mysqld-debug.exe!mysql_execute_command      
mysqld-debug.exe!mysql_parse                
mysqld-debug.exe!dispatch_command           
mysqld-debug.exe!do_command                 
mysqld-debug.exe!handle_one_connection      
mysqld-debug.exe!pthread_start              
           
Monitoring solutions such as MEM that poll every X seconds can consume all available connections, because none of the polling queries ever finishes.  This bug can also be worked around in the client, by not sending another query while the previous one has not completed.
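
The client-side workaround can be sketched roughly as follows. This is only an illustration, not how MEM or any real monitoring agent is implemented: `SkippingPoller`, `tick` and `poll_fn` are hypothetical names, and `poll_fn` stands in for whatever function sends 'show slave status' or 'show global status' over a connection. The idea is that a timer tick is skipped while the previous poll is still outstanding, so a hung query ties up one connection instead of one per polling interval.

```python
import threading


class SkippingPoller:
    """Runs poll_fn at most once at a time (hypothetical helper).

    If the previous poll has not returned yet, tick() skips instead of
    starting another one.
    """

    def __init__(self, poll_fn):
        self._poll_fn = poll_fn              # assumed: sends the status query
        self._in_flight = threading.Lock()   # held while a poll is running
        self.skipped = 0                     # ticks dropped because a poll hung

    def tick(self):
        """Start a poll in the background, or skip if one is still running.

        Returns the worker thread, or None when the tick was skipped.
        """
        if not self._in_flight.acquire(blocking=False):
            self.skipped += 1
            return None
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()
        return worker

    def _run(self):
        try:
            self._poll_fn()
        finally:
            # Allow the next tick only once this poll has really finished.
            self._in_flight.release()
```

A scheduler would call `tick()` every X seconds; the `skipped` counter gives the monitoring side a cheap signal that the server has stopped answering status queries.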

How to repeat:
Continuously send 'show global status' and 'show slave status' queries along with 'stop slave' and 'start slave', then shut down the master server so that the slave must reconnect.

Or study the code and see that the mutex is held.
[9 Sep 2010 8:58] MySQL Verification Team
It might be considered a feature request to improve this. It's an annoying problem to deal with at the moment.  The real-life use case is a long-running query being run by the SQL thread: 'stop slave' will hang, and this problem becomes apparent.  See the related support issue for an example (email 4).
[9 Sep 2010 9:04] Andrei Elkin
The STOP-SLAVE handler acquires at least two mutexes, starting with LOCK_active_mi. When it goes into cond_wait it releases only the last one, which is `mi->run_lock', and keeps LOCK_active_mi locked.
Whether the STOP-SLAVE handler really needs to hold LOCK_active_mi that long requires investigation.
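
The locking pattern described above can be reproduced in miniature. The sketch below is not the server code: the Python locks are stand-ins named after LOCK_active_mi and `mi->run_lock', and `stop_slave`, `try_lock` and `demo` are hypothetical helpers. It only shows that a condition wait releases the mutex tied to the condition, while any outer mutex stays held for the whole wait.

```python
import threading

lock_active_mi = threading.Lock()          # stand-in for LOCK_active_mi
run_lock = threading.Lock()                # stand-in for mi->run_lock
stop_cond = threading.Condition(run_lock)  # condition tied to run_lock only


def stop_slave(waiting):
    """Mimics the handler: take both locks, then sleep on the condition."""
    with lock_active_mi:
        with stop_cond:                    # acquires run_lock
            waiting.set()
            # wait() releases run_lock while sleeping; lock_active_mi stays held.
            stop_cond.wait(timeout=5)


def try_lock(lock, timeout):
    """Return True if the lock could be taken within the timeout."""
    got = lock.acquire(timeout=timeout)
    if got:
        lock.release()
    return got


def demo():
    waiting = threading.Event()
    stopper = threading.Thread(target=stop_slave, args=(waiting,))
    stopper.start()
    waiting.wait()
    inner_free = try_lock(run_lock, 1.0)             # released by the wait
    outer_free = try_lock(lock_active_mi, 0.2)       # still held: times out
    with stop_cond:
        stop_cond.notify()                           # let the "handler" finish
    stopper.join()
    outer_free_after = try_lock(lock_active_mi, 1.0)
    return inner_free, outer_free, outer_free_after
```

In this model `demo()` reports that the inner lock is available during the wait, the outer lock is not, and the outer lock becomes available again once the waiting thread has been woken. A thread running the equivalent of 'show slave status' blocks on the outer lock for exactly that window.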
[29 Jun 2011 11:14] MySQL Verification Team
See bug #42930 for another 'show global status' problem.
[10 Jan 2013 11:12] Erlend Dahl
Fixed in 5.7.0