Bug #106999 | contributed by Tencent: shutdown hang due to binlog dump stick at wait_new_event | ||
---|---|---|---|
Submitted: | 13 Apr 2022 8:44 | Modified: | 14 Apr 2022 12:02 |
Reporter: | Xiaodong Huang (OCA) | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 5.7.37 | OS: | Any |
Assigned to: | MySQL Verification Team | CPU Architecture: | Any |
[13 Apr 2022 8:44]
Xiaodong Huang
[13 Apr 2022 14:26]
MySQL Verification Team
Hi, Thanks for the analysis and the potential fix but I cannot reproduce this with our binaries. Can you provide a reproducible test case ? thanks
[14 Apr 2022 8:51]
Xiaodong Huang
You can reproduce the problem with the following steps: 1. Apply the following patch into mysql-5.7.37 " From 93839aa93cefbd4e30ba72584fa10a84ce0dd379 Mon Sep 17 00:00:00 2001 From: dcthxdhuang <dcthxdhuang@tencent.com> Date: Thu, 14 Apr 2022 16:34:17 +0800 Subject: [PATCH] Reproduce the shutdown hang problem --- ...n_hang_due_to_dump_thread_stick-master.opt | 2 + ...wn_hang_due_to_dump_thread_stick-slave.opt | 2 + ...hutdown_hang_due_to_dump_thread_stick.test | 49 +++++++++++++++++++ sql/mysqld.cc | 11 +++++ sql/rpl_binlog_sender.cc | 9 ++++ 5 files changed, 73 insertions(+) create mode 100644 mysql-test/suite/rpl/t/rpl_shutdown_hang_due_to_dump_thread_stick-master.opt create mode 100644 mysql-test/suite/rpl/t/rpl_shutdown_hang_due_to_dump_thread_stick-slave.opt create mode 100644 mysql-test/suite/rpl/t/rpl_shutdown_hang_due_to_dump_thread_stick.test diff --git a/mysql-test/suite/rpl/t/rpl_shutdown_hang_due_to_dump_thread_stick-master.opt b/mysql-test/suite/rpl/t/rpl_shutdown_hang_due_to_dump_thread_stick-master.opt new file mode 100644 index 00000000000..3b71a6eace3 --- /dev/null +++ b/mysql-test/suite/rpl/t/rpl_shutdown_hang_due_to_dump_thread_stick-master.opt @@ -0,0 +1,2 @@ +--enforce-gtid-consistency=ON +--gtid-mode=ON diff --git a/mysql-test/suite/rpl/t/rpl_shutdown_hang_due_to_dump_thread_stick-slave.opt b/mysql-test/suite/rpl/t/rpl_shutdown_hang_due_to_dump_thread_stick-slave.opt new file mode 100644 index 00000000000..3b71a6eace3 --- /dev/null +++ b/mysql-test/suite/rpl/t/rpl_shutdown_hang_due_to_dump_thread_stick-slave.opt @@ -0,0 +1,2 @@ +--enforce-gtid-consistency=ON +--gtid-mode=ON diff --git a/mysql-test/suite/rpl/t/rpl_shutdown_hang_due_to_dump_thread_stick.test b/mysql-test/suite/rpl/t/rpl_shutdown_hang_due_to_dump_thread_stick.test new file mode 100644 index 00000000000..5ff61f9a85f --- /dev/null +++ b/mysql-test/suite/rpl/t/rpl_shutdown_hang_due_to_dump_thread_stick.test @@ -0,0 +1,49 @@ + +--source include/have_binlog_format_row.inc +--source include/have_gtid.inc +--source include/master-slave.inc +--source include/have_debug.inc +--source include/have_debug_sync.inc + + +--echo # +--echo # Prepare +--echo # +--source include/rpl_connection_master.inc +--let $MYSQL_SOCKET= `SELECT @@socket` +--let $MYSQL_PORT= `SELECT @@port` + +create table t1 (a varchar(2048)); +--source include/sync_slave_sql_with_master.inc +show tables; + +--echo # +--echo # Run +--echo # +--source include/rpl_connection_master.inc +set global debug="d,get_binlog_end_pos_before_flush_net"; +show variables like "debug"; +set global debug="+d,close_connections_before_kill_binlog_dump"; +show variables like "debug"; +begin; +insert into t1 values(repeat("a", 1024)),(repeat("b", 1024)),(repeat("c", 1024)); +commit; +--sleep 1 + +--echo # Shutdown Server +--exec echo "wait" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect +--exec $MYSQLADMIN -uroot -S $MYSQL_SOCKET -P $MYSQL_PORT shutdown 2>&1 +--echo # Server1 is down +--source include/wait_until_disconnected.inc + +--echo # restart server 1. +--let $rpl_server_number=1 +--source include/rpl_start_server.inc + +--echo # +--echo # Cleanup +--echo # +--source include/rpl_connection_master.inc +DROP TABLE t1; +--source include/rpl_end.inc + diff --git a/sql/mysqld.cc b/sql/mysqld.cc index 61b328541a9..93009ab98cc 100644 --- a/sql/mysqld.cc +++ b/sql/mysqld.cc @@ -966,6 +966,17 @@ public: killing_thd->get_command() != COM_BINLOG_DUMP_GTID); };); } + DBUG_EXECUTE_IF("close_connections_before_kill_binlog_dump", + { + if (killing_thd->get_command() == COM_BINLOG_DUMP || + killing_thd->get_command() == COM_BINLOG_DUMP_GTID) + { + const char act[]= "now signal before_kill_binlog_dump"; + assert(!debug_sync_set_action(current_thd, + STRING_WITH_LEN(act))); + } + };); + mysql_mutex_lock(&killing_thd->LOCK_thd_data); killing_thd->killed= THD::KILL_CONNECTION; MYSQL_CALLBACK(Connection_handler_manager::event_functions, diff --git a/sql/rpl_binlog_sender.cc b/sql/rpl_binlog_sender.cc index ea7db604da1..9a61c00a8c0 100644 --- a/sql/rpl_binlog_sender.cc +++ b/sql/rpl_binlog_sender.cc @@ -523,6 +523,15 @@ inline my_off_t Binlog_sender::get_binlog_end_pos(IO_CACHE *log_cache) if (log_pos < end_pos) DBUG_RETURN(end_pos); + DBUG_EXECUTE_IF("get_binlog_end_pos_before_flush_net", + { + const char act[]= "now wait_for before_kill_binlog_dump"; + assert(!debug_sync_set_action(current_thd, + STRING_WITH_LEN(act))); + // wait 1s + my_sleep(1000000L); + };); + /* Some data may be in net buffer, it should be flushed before waiting */ if (!m_wait_new_events || flush_net()) DBUG_RETURN(1); -- 2.34.1.182.ge773545c7f " 2. Compile the code with cmake flags: -DWITH_DEBUG_SYNC=1 and -DWITH_DEBUG=1 3. Execute the testcase "rpl_shutdown_hang_due_to_dump_thread_stick.test" 4. You can see the Server wil keep hang.
[14 Apr 2022 10:23]
MySQL Verification Team
Hi, > 1. Apply the following patch into mysql-5.7.37 ... > 3. Execute the testcase "rpl_shutdown_hang_due_to_dump_thread_stick.test" This test case passes properly with MySQL server we create so the bug comes from stuff you add on top of our code. If you need us to inspect your code and fix bugs there please contact our MySQL Support team, I am sure they will be happy to help. Is there a test case that fails on MySQL binary generated by Oracle?
[14 Apr 2022 11:58]
Xiaodong Huang
0002-Reproduce-the-shutdown-hang-problem
Attachment: 0002-Reproduce-the-shutdown-hang-problem.patch (application/octet-stream, text), 4.71 KiB.
[14 Apr 2022 12:02]
Xiaodong Huang
The previous may have some problem. Please try again with "0002-Reproduce-the-shutdown-hang-problem" with same step, the result show the shutdown wii be fail.