MySQL Bugs: #103636: Slave hangs with slave_preserve_commit

Bug #103636	Slave hangs with slave_preserve_commit_order On
Submitted:	10 May 2021 7:23	Modified:	12 Nov 2021 16:02
Reporter:	zhai weixiang (OCA)
Status:	Closed
Category:	MySQL Server: Replication	Severity:	S3 (Non-critical)
Version:	8.0.24	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
We have recently been testing MySQL 8.0.24 and noticed that the slave hangs with the following backtrace:

64 pthread_cond_timedwait,native_cond_timedwait(thr_cond.h:100),my_cond_timedwait(thr_cond.h:100),inline_mysql_cond_timedwait(thr_cond.h:100),MDL_wait::timed_wait(thr_cond.h:100),Commit_order_manager::wait_on_graph(rpl_slave_commit_order_manager.cc:108),Commit_order_manager::wait(rpl_slave_commit_order_manager.cc:153),Commit_order_manager::wait(rpl_slave_commit_order_manager.cc:369),MYSQL_BIN_LOG::ordered_commit(binlog.cc:8891),MYSQL_BIN_LOG::commit(binlog.cc:8296),ha_commit_trans(handler.cc:1814),trans_commit(transaction.cc:250),non-virtual,Xid_apply_log_event::do_apply_event_worker(log_event.cc:6486),slave_worker_exec_job_group(rpl_rli_pdb.cc:2526),handle_slave_worker(rpl_slave.cc:6191),pfs_spawn_thread(pfs.cc:2898),start_thread(libpthread.so.0),clone(libc.so.6)
      1 sched_yield(libc.so.6),Mts_submode_logical_clock::get_least_occupied_worker(rpl_mts_submode.cc:957),Log_event::get_slave_worker(log_event.cc:3000),Log_event::apply_event(log_event.cc:3539),apply_event_and_update_pos(rpl_slave.cc:4507),exec_relay_log_event(rpl_slave.cc:5067),handle_slave_sql(rpl_slave.cc:7372),pfs_spawn_thread(pfs.cc:2898),start_thread(libpthread.so.0),clone(libc.so.6)

I tried to kill the SQL thread; it then started waiting for all worker threads to exit, but the worker threads remained hung.

Then I tried to kill the worker threads one by one, and during that process the instance crashed with the following backtrace:

(MDL_context::visit_subgraph(MDL_wait_for_graph_visitor*)+0x52) [0x12bc762]
(Commit_order_manager::visit_lock_graph(Commit_order_lock_graph&, MDL_wait_for_graph_visitor&)+0x2ab) [0x1d0ff9b]
(MDL_context::visit_subgraph(MDL_wait_for_graph_visitor*)+0x108) [0x12bc818]
(MDL_context::find_deadlock()+0x72) [0x12bcc62]
(Commit_order_manager::wait_on_graph(Slave_worker*)+0x27a) [0x1d0f7aa]
(Commit_order_manager::wait(Slave_worker*)+0x4b) [0x1d0faeb]
(Commit_order_manager::wait_and_finish(THD*, bool)+0xb3) [0x1d0fca3]
(Slave_worker::slave_worker_ends_group(Log_event*, int)+0x3c8) [0x1ce6c98]
(slave_worker_exec_job_group(Slave_worker*, Relay_log_info*)+0x2fe) [0x1cedfee]
[0x1cf566b]
[0x2675371]

How to repeat:
Run a sysbench read-write workload continuously (7*24 hours).

Primary:
binlog_transaction_dependency_tracking = WRITESET

Replica:
slave_parallel_type = LOGICAL_CLOCK
slave_parallel_workers = 64
slave_preserve_commit_order = 1

Suggested fix:
I don't know

Hi,

Have you maybe tried reproducing with 8.0.25?

I am running a test for longer than 24 hours with 8.0.25 and I'm not able to reproduce this. Please test with 8.0.25.

All best
Bogdan

I finally got a chance to inspect the stack in gdb after reducing the timeout value.

Let's look at the function Commit_order_manager::finish_one:

auto this_seq_nr{0};

so this_seq_nr is presumably deduced as type int.
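
The deduction can be checked in isolation. The following is a minimal sketch, not the server code: sequence_type is assumed to be unsigned long long (as the gdb casts in this report suggest), and assign_like_finish_one is a hypothetical helper that mimics storing a popped ticket in a variable declared with auto and a 0 initializer.

```cpp
#include <type_traits>

// Assumption: the commit order queue's sequence type is unsigned long long.
using sequence_type = unsigned long long;

// `auto x{0}` deduces int from the literal 0, whatever is assigned later.
constexpr bool seq_nr_deduced_as_int() {
  auto this_seq_nr{0};
  static_cast<void>(this_seq_nr);  // silence unused-variable warnings
  return std::is_same<decltype(this_seq_nr), int>::value;
}

// Hypothetical helper: store a popped ticket in such a variable.
// Tickets above INT_MAX do not fit; the conversion wraps modulo 2^32 on
// two's-complement targets (guaranteed by the standard only from C++20 on).
constexpr int assign_like_finish_one(sequence_type popped) {
  auto this_seq_nr{0};
  this_seq_nr = static_cast<int>(popped);
  return this_seq_nr;
}

static_assert(seq_nr_deduced_as_int(), "auto x{0} deduces int");
static_assert(assign_like_finish_one(2147484306ULL) == -2147482990,
              "a ticket just past INT_MAX is stored as a negative int");
```

The second check uses the value gdb prints for (unsigned int) this_seq_nr in the session that follows.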

Now let's check the values in gdb when the problem happens:

(gdb) p this_seq_nr
$3 = -2147482990
(gdb) p next_seq_nr
$4 = -2147482989
(gdb) p sizeof(this_seq_nr)
$5 = 4
(gdb) p (unsigned int) this_seq_nr
$6 = 2147484306
(gdb) p (unsigned long long) this_seq_nr
$7 = 18446744071562068626
(gdb) p next_seq_nr
$8 = -2147482989
(gdb) p sizeof(next_seq_nr)
$9 = 4
(gdb) p  (unsigned long long) next_seq_nr
$10 = 18446744071562068627

When this->m_workers[next_worker].freeze_commit_sequence_nr is invoked, next_seq_nr is converted to unsigned long long, so it is not the expected value and the call returns false; as a result, the following worker is never woken up.
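
The mismatch can be reproduced with the gdb values above. This is a sketch under assumptions: sequence_type stands in for the queue's unsigned long long type, matches_stored is a hypothetical stand-in for the comparison made when freezing (not the real freeze_commit_sequence_nr), and the next worker is assumed to hold ticket 2147484307, the unsigned 32-bit reading of next_seq_nr.

```cpp
// Assumption: the commit order queue's sequence type is unsigned long long.
using sequence_type = unsigned long long;

// Widening a negative int to unsigned long long sign-extends it, so the
// result lands near 2^64 instead of just above 2^31.
constexpr sequence_type widen(int seq_nr) {
  return static_cast<sequence_type>(seq_nr);
}

// Hypothetical stand-in for the check: the freeze succeeds only when the
// ticket stored for the worker equals the (widened) expected value.
constexpr bool matches_stored(sequence_type stored, int expected) {
  return stored == widen(expected);
}

static_assert(widen(-2147482989) == 18446744071562068627ULL,
              "same value gdb shows for (unsigned long long) next_seq_nr");
static_assert(!matches_stored(2147484307ULL, -2147482989),
              "overflowed int no longer matches the stored ticket");
static_assert(matches_stored(42ULL, 42), "small tickets still match");
```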

I'll keep testing to verify my guess.

After running for one day, the hang disappears. Note that I used a very powerful machine, so the int overflow can happen within 24 hours under a heavy workload.
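
As a rough check of that estimate, the sustained rate needed to exhaust a 32-bit signed counter in a day can be worked out directly. This sketch assumes one sequence ticket per applied transaction and a counter that starts near zero; neither assumption is confirmed in this report.

```cpp
#include <limits>

// Largest ticket an int can hold before it overflows.
constexpr long long kIntMax = std::numeric_limits<int>::max();
constexpr long long kSecondsPerDay = 24LL * 60 * 60;

// Tickets per second (rounded down) needed to reach INT_MAX in `seconds`.
constexpr long long min_rate_to_overflow(long long seconds) {
  return kIntMax / seconds;
}

static_assert(min_rate_to_overflow(kSecondsPerDay) == 24855,
              "about 24.9k tickets per second for a full day");
```

Under those assumptions, a replica applying roughly 25,000 transactions per second would hit the overflow in about 24 hours.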

The following patch may solve the problem:

diff --git a/sql/rpl_slave_commit_order_manager.cc b/sql/rpl_slave_commit_order_manager.cc
index afce898..54ec05e 100644
--- a/sql/rpl_slave_commit_order_manager.cc
+++ b/sql/rpl_slave_commit_order_manager.cc
@@ -267,10 +267,10 @@ void Commit_order_manager::finish_one(Slave_worker *worker) {
     assert(this->m_workers.front() == worker->id);
     assert(!this->m_workers.is_empty());
 
-    auto this_seq_nr{0};
+    cs::apply::Commit_order_queue::sequence_type this_seq_nr = 0;
     auto this_worker{cs::apply::Commit_order_queue::NO_WORKER};
     std::tie(this_worker, this_seq_nr) = this->m_workers.pop();
-    auto next_seq_nr = this_seq_nr + 1;
+    cs::apply::Commit_order_queue::sequence_type next_seq_nr = this_seq_nr + 1;
     assert(worker->id == this_worker);

Hi,

Thanks for the update to the report. I'm verifying it.

all best
Bogdan

Posted by developer:
 
Changelog entry added for MySQL 8.0.28:

If a replica server with the system variable replica_preserve_commit_order = 1 set was used under intensive load for a long period, the instance could run out of commit order sequence tickets. Incorrect behavior after the maximum value was exceeded caused the applier to hang and the applier worker threads to wait indefinitely on the commit order queue. The commit order sequence ticket generator now wraps around correctly. Thanks to Zhai Weixiang for the contribution.
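
What "wraps around correctly" can mean for an unsigned ticket counter is sketched below. This is an illustration only, not the server's implementation: NO_TICKET and next_ticket are hypothetical names, and reserving 0 as a "no ticket" marker is an assumption made for the example.

```cpp
#include <limits>

// Assumption: tickets are unsigned long long, as in the discussion above.
using sequence_type = unsigned long long;

// Hypothetical reserved value meaning "no ticket assigned".
constexpr sequence_type NO_TICKET = 0;

// Returns the ticket after `current`. Unsigned arithmetic wraps modulo
// 2^64 with well-defined behavior; the reserved value is skipped.
constexpr sequence_type next_ticket(sequence_type current) {
  return (current + 1 == NO_TICKET) ? current + 2 : current + 1;
}

static_assert(next_ticket(5ULL) == 6ULL, "ordinary increment");
static_assert(next_ticket(std::numeric_limits<sequence_type>::max()) == 1ULL,
              "wraps past the maximum and skips the reserved value");
```

Because both the generator and the comparison stay in the same unsigned type, the value a worker expects after the wrap is the value the next worker actually holds.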

Is this bug fixed? If so, in which version? Please share the details.
We hit this bug on version 8.0.26.

Please use 8.0.33.