Description:
We have noticed multiple slave hung issues on mysql 5.7.30 version.
The sql thread hung in "System lock" state while executing "XA Start" command;Other sql threads are waiting for this thread,Status is "Waiting for preceding transaction to commit";
And stopping replication slave using "stop slave" didnt work as the mysql is waiting for the replication sql_thread to exit.
The only solution is to restart mysqld instance.
mysql> select * from information_schema.processlist where command!='Sleep' order by time;
+-------+-------------+-----------------+--------------------+---------+--------+---------------------------------------------------+-----------------------------------------------------------------------------------+
| ID | USER | HOST | DB | COMMAND | TIME | STATE | INFO |
+-------+-------------+-----------------+--------------------+---------+--------+---------------------------------------------------+-----------------------------------------------------------------------------------+
| 10297 | root | 127.0.0.1:48130 | performance_schema | Query | 0 | executing | select * from information_schema.processlist where command!='Sleep' order by time |
| 38 | system user | | NULL | Connect | 5173 | Waiting for preceding transaction to commit | NULL |
| 36 | system user | | NULL | Connect | 5173 | Waiting for preceding transaction to commit | NULL |
| 31 | system user | | NULL | Connect | 5173 | Waiting for preceding transaction to commit | NULL |
| 28 | system user | | NULL | Connect | 5173 | Waiting for preceding transaction to commit | NULL |
| 29 | system user | | NULL | Connect | 5173 | Waiting for preceding transaction to commit | NULL |
| 27 | system user | | NULL | Connect | 5173 | Waiting for preceding transaction to commit | NULL |
| 25 | system user | | NULL | Connect | 5173 | Waiting for preceding transaction to commit | NULL |
| 26 | system user | | NULL | Connect | 5173 | System lock | NULL |
| 23 | system user | | NULL | Connect | 5173 | Waiting for preceding transaction to commit | NULL |
| 24 | system user | | NULL | Connect | 5173 | Waiting for preceding transaction to commit | NULL |
| 22 | system user | | NULL | Connect | 5174 | Waiting for slave workers to process their queues | NULL |
| 39 | system user | | NULL | Connect | 933155 | Queueing master event to the relay log | NULL |
+-------+-------------+-----------------+--------------------+---------+--------+---------------------------------------------------+-----------------------------------------------------------------------------------+
19 rows in set (0.00 sec)
The following is the pstack information of the sql thread in question:
# cat 82840.pstack
Thread 1 (process 82840):
#0 0x00007fe2cd5cbeed in nanosleep () from /lib64/libpthread.so.0
#1 0x00000000011991b0 in os_thread_sleep (tm=<optimized out>) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/storage/innobase/os/os0thread.cc:287
#2 0x000000000112dd8b in wait (trx=0x7fe2c0ddb750) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/storage/innobase/include/trx0trx.h:1572
#3 TrxInInnoDB::enter (trx=0x7fe2c0ddb750, disable=<optimized out>) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/storage/innobase/include/trx0trx.h:1492
#4 0x000000000111feb6 in begin_stmt (trx=0x7fe2c0ddb750) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/storage/innobase/include/trx0trx.h:1423
#5 ha_innobase::external_lock (this=0x7fc84caacdb0, thd=0x7fc84c0008c0, lock_type=<optimized out>) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/storage/innobase/handler/ha_innodb.cc:15786
#6 0x000000000084deda in handler::ha_external_lock (this=0x7fc84caacdb0, thd=0x7fc84c0008c0, lock_type=1) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/sql/handler.cc:8009
#7 0x0000000000e8297c in lock_external (count=1, tables=<optimized out>, thd=0x7fc84c0008c0) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/sql/lock.cc:396
#8 mysql_lock_tables (thd=0x7fc84c0008c0, tables=0x7fc84c005d48, count=1, flags=0) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/sql/lock.cc:332
#9 0x0000000000cf3595 in lock_tables (thd=0x7fc84c0008c0, tables=0x7fc84cb5e620, count=<optimized out>, flags=<optimized out>) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/sql/sql_base.cc:6762
#10 0x0000000000cfc26a in open_and_lock_tables (thd=0x7fc84c0008c0, tables=0x7fc84cb5e620, flags=0, prelocking_strategy=<optimized out>) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/sql/sql_base.cc:6580
#11 0x0000000000f140ee in open_and_lock_tables (flags=0, tables=<optimized out>, thd=<optimized out>) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/sql/sql_base.h:486
#12 Rows_log_event::do_apply_event (this=0x7fc85c091780, rli=0x7fc85c1586a0) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/sql/log_event.cc:11009
#13 0x0000000000f68e05 in slave_worker_exec_job_group (worker=0x7fc85c1586a0, rli=0xacf49230) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/sql/rpl_rli_pdb.cc:2673
#14 0x0000000000f5430b in handle_slave_worker (arg=0x7fc85c1586a0) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/sql/rpl_slave.cc:6239
#15 0x00000000013f6034 in pfs_spawn_thread (arg=0x7fc85c15c360) at /export/home/pb2/build/sb_0-38465026-1584987238.22/mysql-5.7.30/storage/perfschema/pfs.cc:2197
#16 0x00007fe2cd5c4dd5 in start_thread () from /lib64/libpthread.so.0
#17 0x00007fe2cc07eb3d in clone () from /lib64/libc.so.6
How to repeat:
None.
Suggested fix:
Restart mysqld.