Bug #92583 mysql hang
Submitted: 27 Sep 2018 5:59 Modified: 27 Sep 2018 11:42
Reporter: w yj Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:5.7.22 OS:CentOS (2.6.32-279.el6.x86_64)
Assigned to: CPU Architecture:x86

[27 Sep 2018 5:59] w yj
Description:
Suddenly hang.
no error log.
Kill mysqld recovery

i have pstack log

How to repeat:
i don't know
[27 Sep 2018 6:00] w yj
pstack info

Attachment: pstack_info.txt (text/plain), 541.86 KiB.

[27 Sep 2018 6:12] w yj
10:10:57|  2   1  97   0|    0    0|    0.0 1280.7     0.0  72040.8   3.4    2.6   0.1   7.1|
10:11:07|  2   1  96   0|    0    0|   12.5 3056.4    77.6  95945.2   0.8    0.3   0.0  11.1|
-------- ---cpu-usage--- ---swap--- -------------------------io-usage----------------------- 
  time  |usr sys idl iow|   si   so|   r/s    w/s    rkB/s    wkB/s  queue await svctm %util|
10:11:17|  1   3  95   1|    0    0|   11.9 5779.3    47.6 523779.0  37.3    6.4   0.1  39.0|
10:11:27|  3   3  94   1|    0    0|    5.717701.5    57.8 541036.4   4.1    0.2   0.0  56.5|
10:11:37|  4   3  92   1|    0    0|    0.020754.0     0.0 632366.4   4.7    0.2   0.0  64.9|
10:11:47|  3   2  94   1|    0    0|    0.021073.0     0.0 642639.7   4.8    0.2   0.0  61.5|
10:11:57|  3   2  94   1|    0    0|    5.916821.0   106.2 513410.4   4.1    0.2   0.0  51.1|
10:12:07|  3   2  94   1|    0    0|    0.819109.8    34.5 582149.9   3.9    0.2   0.0  57.0|
10:12:17|  3   3  93   1|    0    0|    0.020875.7     0.0 636692.9   5.0    0.2   0.0  64.7|
10:12:27|  3   2  94   1|    0    0|    0.217636.9    11.2 538500.6   4.1    0.2   0.0  53.3|
10:12:37|  3   2  94   1|    0    0|    0.021331.6     0.0 650478.8   4.2    0.2   0.0  63.7|
10:12:47|  3   2  93   1|    0    0|   87.921244.1  1406.6 647775.9   4.3    0.2   0.0  62.1|
10:12:57|  3   2  94   1|    0    0|    0.421261.7     5.2 648494.2   4.4    0.2   0.0  61.3|
10:13:07|  3   2  93   1|    0    0|    0.021251.3     0.0 648050.9   4.3    0.2   0.0  63.3|
10:13:17|  3   2  94   1|    0    0|    0.819389.0    12.8 591960.7   3.9    0.2   0.0  55.2|
10:13:27|  3   2  94   1|    0    0|    0.119410.0     1.6 592234.2   4.1    0.2   0.0  57.8|
10:13:37|  3   2  94   1|    0    0|    0.019434.3     0.0 593318.7   3.9    0.2   0.0  59.9|
-------- ---cpu-usage--- ---swap--- -------------------------io-usage----------------------- 
  time  |usr sys idl iow|   si   so|   r/s    w/s    rkB/s    wkB/s  queue await svctm %util|
10:13:47|  3   3  94   1|    0    0|    0.021266.5     0.0 648784.8   4.3    0.2   0.0  68.4|
10:13:57|  3   2  94   1|    0    0|    0.021324.3     0.0 650388.7   4.5    0.2   0.0  63.0|
10:14:07|  3   2  94   1|    0    0|    0.019352.2     0.0 590695.1   4.1    0.2   0.0  58.3|
10:14:17|  4   2  93   1|    0    0|    0.019288.0     0.0 588528.3   3.9    0.2   0.0  58.8|
10:14:27|  5   2  91   1|    0    0|    0.019327.2     0.0 589856.9   4.3    0.2   0.0  60.7|
10:14:37|  3   2  94   1|    0    0|    0.019260.8     0.0 587723.1   4.1    0.2   0.0  60.8|
10:14:47|  3   3  94   1|    0    0|    0.021247.2     0.0 648048.4   4.3    0.2   0.0  66.2|
10:14:57|  4   2  93   1|    0    0|    1.517411.6    81.4 531855.9   3.1    0.2   0.0  49.3|
10:15:07|  4   2  93   1|    0    0|    0.017559.4     0.0 536019.2   3.8    0.2   0.0  51.3|
10:15:17|  3   2  94   1|    0    0|    0.017369.7     0.0 530942.9   4.2    0.2   0.0  50.7|
10:15:27|  3   2  94   1|    0    0|    0.021007.3     0.0 640590.5   4.6    0.2   0.0  59.0|
10:15:37|  3   2  94   1|    0    0|    0.017061.4     0.0 520998.0   2.9    0.2   0.0  47.7|
10:15:47|  2   2  94   1|    0    0|    0.519546.8     3.2 596016.9   4.2    0.2   0.0  58.3|
10:15:57|  2   2  95   1|    0    0|    0.017985.1     0.0 549088.5   3.7    0.2   0.0  54.4|
10:16:07|  3   2  94   1|    0    0|    0.019629.1     0.0 598366.2   4.1    0.2   0.0  58.7|
-------- ---cpu-usage--- ---swap--- -------------------------io-usage----------------------- 
  time  |usr sys idl iow|   si   so|   r/s    w/s    rkB/s    wkB/s  queue await svctm %util|
10:16:17|  2   2  95   1|    0    0|    0.017465.7     0.0 532899.9   3.7    0.2   0.0  53.6|
10:16:27|  0   1  99   0|    0    0|    0.0  804.6     0.0  24294.3   0.2    0.2   0.0   2.3|
10:16:37|  0   1  99   0|    0    0|    0.0    5.3     0.0     44.0   0.0    1.8   1.4   0.7|
10:16:47|  0   0  99   0|    0    0|    0.0    0.8     0.0      3.6   0.0    0.0   0.0   0.0|
10:16:57|  0   1  99   0|    0    0|    0.0    0.7     0.0      4.8   0.0    0.0   0.0   0.0|

a lot of IO operations
[27 Sep 2018 6:34] w yj
./pt-pmp pstack_74812.txt 
    462 __lll_lock_wait(libpthread.so.0),_L_lock_995(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),THD::init,THD::THD,Channel_info::create_thd,Channel_info_tcpip_socket::create_thd,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
    234 __lll_lock_wait(libpthread.so.0),_L_lock_995(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),THD::init,THD::THD,Channel_info::create_thd,Channel_info_local_socket::create_thd,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
     18 libaio::??(libaio.so.1),LinuxAIOHandler::collect,LinuxAIOHandler::poll,os_aio_handler,fil_aio_wait,io_handler_thread,start_thread(libpthread.so.0),clone(libc.so.6)
     17 pthread_cond_wait,Stage_manager::enroll_for,MYSQL_BIN_LOG::change_stage,MYSQL_BIN_LOG::ordered_commit,MYSQL_BIN_LOG::commit,ha_commit_trans,trans_commit_stmt,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
     16 pthread_cond_wait,Stage_manager::enroll_for,MYSQL_BIN_LOG::change_stage,MYSQL_BIN_LOG::ordered_commit,MYSQL_BIN_LOG::commit,ha_commit_trans,trans_commit,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      6 poll(libc.so.6),vio_io_wait,vio_socket_io_wait,vio_read,net_read_raw_loop,net_read_packet,my_net_read,Protocol_classic::read_packet,Protocol_classic::get_command,do_command,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      3 pthread_cond_wait,os_event::wait_low,buf_flush_page_cleaner_worker,start_thread(libpthread.so.0),clone(libc.so.6)
      2 __lll_lock_wait(libpthread.so.0),_L_lock_995(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),inline_mysql_mutex_lock,alloc_and_copy_thd_dynamic_variables,thd_prepare_connection,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 sigwait(libpthread.so.0),signal_hand,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 sigwaitinfo(libc.so.6),timer_notify_thread_func,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_event::wait_low,srv_purge_coordinator_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_event::wait_low,buf_resize_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_event::wait_low,buf_dump_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,Event_queue::cond_wait,Event_queue::get_top_for_execution_if_time,Event_scheduler::run,event_scheduler_thread,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,compress_gtid_table,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,srv_monitor_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,srv_error_monitor_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,lock_wait_timeout_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,ib_wqueue_timedwait,fts_optimize_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,dict_stats_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,buf_flush_page_cleaner_coordinator,start_thread(libpthread.so.0),clone(libc.so.6)
      1 poll(libc.so.6),Mysqld_socket_listener::listen_for_connection_event,mysqld_main,__libc_start_main(libc.so.6),_start
      1 nanosleep(libpthread.so.0),os_thread_sleep,srv_master_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 __lll_lock_wait(libpthread.so.0),_L_lock_995(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),Log_in_use::operator,Global_THD_manager::do_for_all_thd,MYSQL_BIN_LOG::purge_logs_before_date,MYSQL_BIN_LOG::ordered_commit,MYSQL_BIN_LOG::commit,ha_commit_trans,trans_commit,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 __lll_lock_wait(libpthread.so.0),_L_lock_995(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),Global_THD_manager::do_for_all_thd,PFS_connection_iterator::visit_global,PFS_status_variable_cache::do_materialize_global,table_global_status::rnd_init,ha_perfschema::rnd_init,handler::ha_rnd_init,init_read_record,join_init_read_record,sub_select,JOIN::exec,TABLE_LIST::materialize_derived,join_materialize_derived,QEP_TAB::prepare_scan,sub_select,JOIN::exec,handle_query,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 __lll_lock_wait(libpthread.so.0),_L_lock_995(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),Fill_process_list::operator,Global_THD_manager::do_for_all_thd_copy,fill_schema_processlist,do_fill_table,get_schema_tables_result,JOIN::prepare_result,JOIN::exec,handle_query,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 __lll_lock_wait(libpthread.so.0),_L_lock_854(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),show_binlogs,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 __lll_lock_wait(libpthread.so.0),_L_lock_854(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),PolyLock_lock_log::rdlock,sys_var::value_ptr,get_one_variable_ext,System_variable::init,PFS_system_variable_cache::do_materialize_all,table_session_variables::rnd_init,ha_perfschema::rnd_init,handler::ha_rnd_init,init_read_record,join_init_read_record,sub_select,JOIN::exec,TABLE_LIST::materialize_derived,join_materialize_derived,QEP_TAB::prepare_scan,sub_select,JOIN::exec,handle_query,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 __lll_lock_wait(libpthread.so.0),_L_lock_854(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),inline_mysql_mutex_lock,MYSQL_BIN_LOG::change_stage,MYSQL_BIN_LOG::ordered_commit,MYSQL_BIN_LOG::commit,ha_commit_trans,trans_commit,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 __lll_lock_wait(libpthread.so.0),_L_lock_854(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),Binlog_sender::run,mysql_binlog_send,com_binlog_dump,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
[27 Sep 2018 9:20] w yj
there is no available workaround.
[27 Sep 2018 10:22] MySQL Verification Team
Thank you for the bug report. To process this bug report we need a repeatable test case, please comment here when you will able to provide it. Thanks in advance.
[27 Sep 2018 11:42] w yj
I guess it is a large transection(delete from table), so the binlog is also very large, causing mysql to pause.

-rw-r----- 1 apps apps 561M Sep 27 09:58 mysql-bin.066790
-rw-r----- 1 apps apps 3.9G Sep 27 10:11 mysql-bin.066791 --- Hanging time 
-rw-r----- 1 apps apps  154 Sep 27 10:32 mysql-bin.066792
-rw-r----- 1 apps apps 3.6G Sep 27 10:47 mysql-bin.066793
-rw-r----- 1 apps apps 561M Sep 27 10:55 mysql-bin.066794

-rw-r----- 1 apps apps 4.0G Sep 24 10:10 mysql-bin.066516 --- purge operation,A total of 500GB
-rw-r----- 1 apps apps 531M Sep 24 10:20 mysql-bin.066517
-rw-r----- 1 apps apps 594M Sep 24 10:22 mysql-bin.066518

And each time a 4GB binlog is generated, mha occasionally alerts once (mysql_connect_timeout=1)
MHA logs
Thu Sep 27 14:29:58 2018 - [warning] Got error on MySQL connect ping: DBI connect(';host=;port=;mysql_connect_timeout=1','mha_manager',...) failed: Can't connect to MySQL server on '' (4) at /usr/local/share/perl5/MHA/HealthCheck.pm line 97
Thu Sep 27 15:09:43 2018 - [warning] Got error on MySQL connect ping: DBI connect(';host=;port=;mysql_connect_timeout=1','mha_manager',...) failed: Can't connect to MySQL server on '' (4) at /usr/local/share/perl5/MHA/HealthCheck.pm line 97
Thu Sep 27 15:28:53 2018 - [warning] Got error on MySQL connect ping: DBI connect(';host=;port=;mysql_connect_timeout=1','mha_manager',...) failed: Can't connect to MySQL server on '' (4) at /usr/local/share/perl5/MHA/HealthCheck.pm line 97
Thu Sep 27 15:49:23 2018 - [warning] Got error on MySQL connect ping: DBI connect(';host=;port=;mysql_connect_timeout=1','mha_manager',...) failed: Can't connect to MySQL server on '' (4) at /usr/local/share/perl5/MHA/HealthCheck.pm line 97
Thu Sep 27 16:50:08 2018 - [warning] Got error on MySQL connect ping: DBI connect(';host=;port=;mysql_connect_timeout=1','mha_manager',...) failed: Can't connect to MySQL server on '' (4) at /usr/local/share/perl5/MHA/HealthCheck.pm line 97
Thu Sep 27 17:10:23 2018 - [warning] Got error on MySQL connect ping: DBI connect(';host=;port=;mysql_connect_timeout=1','mha_manager',...) failed: Can't connect to MySQL server on '' (4) at /usr/local/share/perl5/MHA/HealthCheck.pm line 97
Thu Sep 27 17:28:48 2018 - [warning] Got error on MySQL connect ping: DBI connect(';host=;port=;mysql_connect_timeout=1','mha_manager',...) failed: Can't connect to MySQL server on '' (4) at /usr/local/share/perl5/MHA/HealthCheck.pm line 97
Thu Sep 27 17:55:43 2018 - [warning] Got error on MySQL connect ping: DBI connect(';host=;port=;mysql_connect_timeout=1','mha_manager',...) failed: Can't connect to MySQL server on '' (4) at /usr/local/share/perl5/MHA/HealthCheck.pm line 97
Thu Sep 27 19:31:19 2018 - [warning] Got error on MySQL connect ping: DBI connect(';host=;port=;mysql_connect_timeout=1','mha_manager',...) failed: Can't connect to MySQL server on '' (4) at /usr/local/share/perl5/MHA/HealthCheck.pm line 97

binlog list
-rw-r----- 1 apps apps 3.9G Sep 27 14:11 mysql-bin.066820
-rw-r----- 1 apps apps 3.9G Sep 27 14:30 mysql-bin.066822
-rw-r----- 1 apps apps 3.9G Sep 27 14:52 mysql-bin.066824
-rw-r----- 1 apps apps 3.6G Sep 27 15:09 mysql-bin.066825
-rw-r----- 1 apps apps 3.9G Sep 27 15:28 mysql-bin.066827
-rw-r----- 1 apps apps 3.9G Sep 27 15:49 mysql-bin.066829
-rw-r----- 1 apps apps 3.6G Sep 27 16:10 mysql-bin.066830
-rw-r----- 1 apps apps 3.9G Sep 27 16:29 mysql-bin.066832
-rw-r----- 1 apps apps 3.9G Sep 27 16:50 mysql-bin.066834
-rw-r----- 1 apps apps 3.6G Sep 27 17:10 mysql-bin.066836
-rw-r----- 1 apps apps 3.9G Sep 27 17:28 mysql-bin.066838
-rw-r----- 1 apps apps 3.9G Sep 27 17:55 mysql-bin.066844
-rw-r----- 1 apps apps 3.6G Sep 27 18:10 mysql-bin.066845
-rw-r----- 1 apps apps 3.9G Sep 27 18:32 mysql-bin.066847
-rw-r----- 1 apps apps 3.9G Sep 27 18:49 mysql-bin.066849
-rw-r----- 1 apps apps 3.6G Sep 27 19:11 mysql-bin.066850
-rw-r----- 1 apps apps 3.9G Sep 27 19:31 mysql-bin.066852
[29 Oct 2018 7:54] Dongchan Sung
It looks like a similar bug as below.
https://bugs.mysql.com/bug.php?id=92108

I expect to fixed on 5.7.25.