Bug #113727 | GTID Replica stalls completely when log_slave_updates=0 and a local DDL executed | ||
---|---|---|---|
Submitted: | 23 Jan 14:34 | Modified: | 26 Jan 6:38 |
Reporter: | Venkatesh Prasad Venugopal (OCA) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S1 (Critical) |
Version: | 8.0.19+ | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | Contribution, log_slave_updats, MTS, wl7834 |
[23 Jan 14:34]
Venkatesh Prasad Venugopal
[23 Jan 14:49]
Venkatesh Prasad Venugopal
The above bug can result in two types of deadlocks based on the timing. 1. Node stall due to recursive locking 2. Node stall due to deadlock of locks affecting writes to server.
[23 Jan 14:49]
Venkatesh Prasad Venugopal
In case of recursive locking ---------------------------- We can observe that the thread hits the assertion `safe_mutex_assert_not_owner()` meaning owner thread tried to acquire the mutex once again. include/thr_mutex.h:159: void safe_mutex_assert_not_owner(safe_mutex_t*): Assertion `!mp->count || !my_thread_equal(my_thread_self(), mp->thread)' failed. 2023-12-05T11:24:38Z UTC - mysqld got signal 6 ; ... bin/mysqld(Mutex_cond_array::assert_not_owner(int) const+0x2f) [0x55d14af8b401] bin/mysqld(Mutex_cond_array::lock(int) const+0x24) [0x55d14af8b320] bin/mysqld(Gtid_state::lock_sidno(int)+0x28) [0x55d14af8b5d4] bin/mysqld(set_gtid_next(THD*, Gtid_specification const&)+0x47c) [0x55d14bb95679] bin/mysqld(Gtid_log_event::do_apply_event(Relay_log_info const*)+0x285) [0x55d14bb6eac1] bin/mysqld(Log_event::do_apply_event_worker(Slave_worker*)+0x88) [0x55d14bb796c8] bin/mysqld(Slave_worker::slave_worker_exec_event(Log_event*)+0x4ed) [0x55d14bc5967b] bin/mysqld(slave_worker_exec_job_group(Slave_worker*, Relay_log_info*)+0x488) [0x55d14bc5c4a2] bin/mysqld(+0x4dd03d4) [0x55d14bc793d4]
[23 Jan 14:49]
Venkatesh Prasad Venugopal
Stack trace ----------- 1. Replica applier thread is holding sid lock and is waiting for LOCK_commit in Gtid_state::update_gtids_impl_lock_sidnos() HOLDS: LOCK_commit WAITS_FOR: sid lock on sidno 2, held by thread 41 (gdb) bt #0 futex_wait (private=0, expected=2, futex_word=0x5592e2ba4658) at ../sysdeps/nptl/futex-internal.h:146 #1 __GI___lll_lock_wait (futex=futex@entry=0x5592e2ba4658, private=0) at ./nptl/lowlevellock.c:49 #2 0x00007f68ba898002 in lll_mutex_lock_optimized (mutex=0x5592e2ba4658) at ./nptl/pthread_mutex_lock.c:48 #3 ___pthread_mutex_lock (mutex=0x5592e2ba4658) at ./nptl/pthread_mutex_lock.c:93 #4 0x00005592dc4f16ea in native_mutex_lock (mutex=0x5592e2ba4658) at /home/venki/work/ps/80/include/thr_mutex.h:94 #5 safe_mutex_lock (mp=0x5592e2ba4630, try_lock=<optimized out>, file=0x5592dd5476d8 "/home/venki/work/ps/80/sql/rpl_gtid.h", line=926) at /home/venki/work/ps/80/mysys/thr_mutex.cc:117 #6 0x00005592dc23814e in Gtid_state::update_gtids_impl_lock_sidnos (this=this@entry=0x5592e2a1f880, first_thd=first_thd@entry=0x7f6808000bf0) at /home/venki/work/ps/80/sql/rpl_gtid_state.cc:863 #7 0x00005592dc24dc50 in Gtid_state::update_commit_group (this=0x5592e2a1f880, first_thd=0x7f6808000bf0) at /home/venki/work/ps/80/sql/rpl_gtid_state.cc:179 #8 0x00005592dc2df7ec in Commit_order_manager::flush_engine_and_signal_threads (this=<optimized out>, worker=<optimized out>) at /home/venki/work/ps/80/sql/rpl_replica_commit_order_manager.cc:242 #9 0x00005592dc2e3c6e in Commit_order_manager::finish (this=0x7f6814006b60, worker=0x7f681401fd00) at /home/venki/work/ps/80/sql/rpl_replica_commit_order_manager.cc:319 #10 0x00005592dc2e3e16 in Commit_order_manager::wait_and_finish (thd=0x7f6808000bf0, error=<optimized out>) at /home/venki/work/ps/80/sql/rpl_replica_commit_order_manager.cc:407 #11 0x00005592db71d2ef in ha_commit_low (thd=0x7f6808000bf0, all=true, run_after_commit=<optimized out>) at /home/venki/work/ps/80/sql/handler.cc:1980 #12 0x00005592dbb8a27c in trx_coordinator::commit_in_engines (thd=0x7f6808000bf0, all=<optimized out>, run_after_commit=<optimized out>) at /home/venki/work/ps/80/sql/tc_log.cc:148 #13 0x00005592dc1f90f4 in MYSQL_BIN_LOG::commit (this=0x5592dee50620 <mysql_bin_log>, thd=0x7f6808000bf0, all=<optimized out>) at /home/venki/work/ps/80/sql/binlog.cc:8500 #14 0x00005592db71de3f in ha_commit_trans (thd=0x7f6808000bf0, all=<optimized out>, ignore_global_read_lock=<optimized out>) at /home/venki/work/ps/80/sql/handler.cc:1810 #15 0x00005592dbb94e03 in trans_commit (thd=0x7f6808000bf0, ignore_global_read_lock=<optimized out>) at /home/venki/work/ps/80/sql/transaction.cc:246 #16 0x00005592dc20ad66 in Xid_log_event::do_commit (this=0x7f68141dc980, thd_arg=0x7f6808000bf0) at /home/venki/work/ps/80/sql/log_event.cc:6150 #17 0x00005592dc20b023 in Xid_apply_log_event::do_apply_event_worker (this=0x7f68141dc9f0, w=0x7f681401fd00) at /home/venki/work/ps/80/sql/log_event.cc:6245 #18 0x00005592dc2b04eb in Slave_worker::slave_worker_exec_event (this=0x7f681401fd00, ev=0x7f68141dc9f0) at /home/venki/work/ps/80/sql/rpl_rli_pdb.cc:1744 #19 0x00005592dc2c1e39 in slave_worker_exec_job_group (worker=0x7f681401fd00, rli=0x5592e2e7d6f0) at /home/venki/work/ps/80/sql/rpl_rli_pdb.cc:2498 #20 0x00005592dc2cab47 in handle_slave_worker (arg=arg@entry=0x7f681401fd00) at /home/venki/work/ps/80/sql/rpl_replica.cc:6138 #21 0x00005592dc9d1138 in pfs_spawn_thread (arg=0x5592e2c50720) at /home/venki/work/ps/80/storage/perfschema/pfs.cc:3043 #22 0x00007f68ba894ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442 #23 0x00007f68ba926660 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 2. Connection thread executing CREATE TABLE LIKE query HOLDS: This thread holds the sid lock on sid no 2. WAITS_FOR: LOCK_commit in BGC, held by thread 34 (gdb) bt #0 futex_wait (private=0, expected=2, futex_word=0x5592e2a17618) at ../sysdeps/nptl/futex-internal.h:146 #1 __GI___lll_lock_wait (futex=futex@entry=0x5592e2a17618, private=0) at ./nptl/lowlevellock.c:49 #2 0x00007f68ba898167 in ___pthread_mutex_lock (mutex=0x5592e2a17618) at ./nptl/pthread_mutex_lock.c:145 #3 0x00005592dc4f16ea in native_mutex_lock (mutex=0x5592e2a17618) at /home/venki/work/ps/80/include/thr_mutex.h:94 #4 safe_mutex_lock (mp=0x5592e2a175f0, try_lock=<optimized out>, file=0x5592ddebc918 "/home/venki/work/ps/80/sql/rpl_commit_stage_manager.cc", line=378) at /home/venki/work/ps/80/mysys/thr_mutex.cc:117 #5 0x00005592dc21b123 in Commit_stage_manager::enroll_for (this=0x5592deec6280 <Commit_stage_manager::get_instance()::shared_instance>, stage=Commit_stage_manager::COMMIT_STAGE, thd=0x7f67f4129b10, stage_mutex=<optimized out>, enter_mutex=0x5592dee509c0 <mysql_bin_log+928>) at /home/venki/work/ps/80/sql/rpl_commit_stage_manager.cc:378 #6 0x00005592dc1d8f88 in MYSQL_BIN_LOG::change_stage (this=<optimized out>, thd=0x7f67f4129b10, stage=Commit_stage_manager::COMMIT_STAGE, queue=0x7f67f4129b10, leave_mutex=0x5592dee50a20 <mysql_bin_log+1024>, enter_mutex=0x5592dee509c0 <mysql_bin_log+928>) at /home/venki/work/ps/80/sql/binlog.cc:9027 #7 0x00005592dc1f7505 in MYSQL_BIN_LOG::ordered_commit (this=0x5592dee50620 <mysql_bin_log>, thd=0x7f67f4129b10, all=<optimized out>, skip_commit=<optimized out>) at /home/venki/work/ps/80/sql/binlog.cc:9456 #8 0x00005592dc1f990f in MYSQL_BIN_LOG::commit (this=0x5592dee50620 <mysql_bin_log>, thd=0x7f67f4129b10, all=<optimized out>) at /home/venki/work/ps/80/sql/binlog.cc:8704 #9 0x00005592db71de3f in ha_commit_trans (thd=0x7f67f4129b10, all=<optimized out>, ignore_global_read_lock=<optimized out>) at /home/venki/work/ps/80/sql/handler.cc:1810 #10 0x00005592dbb950c0 in trans_commit_implicit (thd=0x7f67f4129b10, ignore_global_read_lock=<optimized out>) at /home/venki/work/ps/80/sql/transaction.cc:339 #11 0x00005592dbb2cb14 in mysql_create_like_table (thd=0x7f67f4129b10, table=0x7f67f41c9ff8, src_table=<optimized out>, create_info=0x7f686c7f6510) at /home/venki/work/ps/80/sql/sql_table.cc:11349 #12 0x00005592db9fe77a in Sql_cmd_create_table::execute (this=0x7f67f41cb0d8, thd=0x7f67f4129b10) at /home/venki/work/ps/80/sql/sql_cmd_ddl_table.cc:430 #13 0x00005592dba63688 in mysql_execute_command (thd=<optimized out>, first_level=<optimized out>) at /home/venki/work/ps/80/sql/sql_parse.cc:4978 #14 0x00005592dba719b3 in dispatch_sql_command (thd=0x7f67f4129b10, parser_state=<optimized out>, update_userstat=<optimized out>) at /home/venki/work/ps/80/sql/sql_parse.cc:5635 #15 0x00005592dba60ec1 in dispatch_command (thd=0x7f67f4129b10, com_data=<optimized out>, command=COM_QUERY) at /home/venki/work/ps/80/sql/sql_parse.cc:2147 #16 0x00005592dba620ef in do_command (thd=0x7f67f4129b10) at /home/venki/work/ps/80/sql/sql_parse.cc:1501 #17 0x00005592dbbefb38 in handle_connection (arg=arg@entry=0x5592e2e9f1f0) at /home/venki/work/ps/80/sql/conn_handler/connection_handler_per_thread.cc:308 #18 0x00005592dc9d1138 in pfs_spawn_thread (arg=0x5592e2e64b70) at /home/venki/work/ps/80/storage/perfschema/pfs.cc:3043 #19 0x00007f68ba894ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442 #20 0x00007f68ba926660 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
[26 Jan 6:27]
Venkatesh Prasad Venugopal
Contribution patch to fix the above issue based on 8.0.36 branch (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: 0001-Bug-113727-GTID-Replica-stalls-completely-when-log_s.patch (text/x-patch), 10.61 KiB.
[26 Jan 6:29]
Venkatesh Prasad Venugopal
How this deadlock happens? -------------------------- 0. Binlog is enabled on replica, but log_replica_updates is disabled. 1. Initially, both "Commit Order" and "Binlog Flush" queues are empty. 2. Replica applier thread 1 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier thread 1 3.1. Becomes leader (In Commit_stage_manager::enroll_for()). 3.2. Registers in the commit order queue. 3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log. 3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is not yet released. NOTE: SE commit for applier thread is already done by the time it reaches here. 4. Replica applier thread 2 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the applier thread 2 5.1. Becomes leader (In Commit_stage_manager::enroll_for()) 5.2. Registers in the commit order queue. 5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier thread 1 it will wait until the lock is released. 6. Client thread enters the group commit pipeline to register in the "Binlog Flush" queue. 7. Since "Commit Order" queue is not empty (there is applier thread 2 in the queue), it enters the conditional wait `m_stage_cond_leader` with an intention to become the leader for both the "Binlog Flush" and "Commit Order" queues. 8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update the GTID by calling gtid_state->update_commit_group() from Commit_order_manager::flush_engine_and_signal_threads(). 9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log. 9.1. It checks if there is any thread waiting in the "Binlog Flush" queue to become the leader. Here it finds the client thread waiting to be the leader. 9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the cond_var `m_stage_cond_leader` and enters a conditional wait until the thread's `tx_commit_pending` is set to false by the client thread (will be done in the Commit_stage_manager::process_final_stage_for_ordered_commit_group() called by client thread from fetch_and_process_flush_stage_queue()). 10. The client thread wakes up from the cond_var `m_stage_cond_leader`. The thread has now become a leader and it is its responsibility to update GTID of applier thread 2. 10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log. 10.2. Returns from `enroll_for()` and proceeds to process the "Commit Order" and "Binlog Flush" queues. 10.3. Fetches the "Commit Order" and "Binlog Flush" queues. 10.4. Performs the storage engine flush by calling ha_flush_logs() from fetch_and_process_flush_stage_queue(). 10.5. Proceeds to update the GTID of threads in "Commit Order" queue by calling gtid_state->update_commit_group() from Commit_stage_manager::process_final_stage_for_ordered_commit_group().
[26 Jan 6:29]
Venkatesh Prasad Venugopal
11. At this point, we will have - Client thread performing GTID update on behalf if applier thread 2 (from step 10.5), and - Applier thread 1 performing GTID update for itself (from step 8). Due to the lack of proper synchronization between the above two threads, there exists a time window where both threads can call gtid_state->update_commit_group() concurrently. In subsequent steps, both threads simultaneously try to modify the contents of the array `commit_group_sidnos` which is used to track the lock status of sidnos. This concurrent access to `update_commit_group()` can cause a lock-leak resulting in one thread acquiring the sidno lock and not releasing at all. ----------------------------------------------------------------------------------------------------------- Client thread Applier Thread 1 ----------------------------------------------------------------------------------------------------------- update_commit_group() => global_sid_lock->rdlock(); update_commit_group() => global_sid_lock->rdlock(); calls update_gtids_impl_lock_sidnos() calls update_gtids_impl_lock_sidnos() set commit_group_sidno[2] = true set commit_group_sidno[2] = true lock_sidno(2) -> successful lock_sidno(2) -> waits update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { unlock_sidno(2); commit_group_sidnos[2] = false; } Applier thread continues.. lock_sidno(2) -> successful update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { <=== this check fails and lock is not released. unlock_sidno(2); commit_group_sidnos[2] = false; } Client thread continues without releasing the lock ----------------------------------------------------------------------------------------------------------- 12. As the above lock-leak can also happen the other way i.e, the applier thread fails to unlock, there can be different consequences hereafter. 13. If the client thread continues without releasing the lock, then at a later stage, it can enter into a deadlock with the applier thread performing a GTID update with stack trace. Client_thread ------------- #1 __GI___lll_lock_wait #2 ___pthread_mutex_lock #3 native_mutex_lock <= waits for commit lock while holding sidno lock #4 Commit_stage_manager::enroll_for #5 MYSQL_BIN_LOG::change_stage #6 MYSQL_BIN_LOG::ordered_commit #7 MYSQL_BIN_LOG::commit #8 ha_commit_trans #9 trans_commit_implicit #10 mysql_create_like_table #11 Sql_cmd_create_table::execute #12 mysql_execute_command #13 dispatch_sql_command Applier thread -------------- #1 ___pthread_mutex_lock #2 native_mutex_lock #3 safe_mutex_lock #4 Gtid_state::update_gtids_impl_lock_sidnos <= waits for sidno lock #5 Gtid_state::update_commit_group #6 Commit_order_manager::flush_engine_and_signal_threads <= acquires commit lock here #7 Commit_order_manager::finish #8 Commit_order_manager::wait_and_finish #9 ha_commit_low #10 trx_coordinator::commit_in_engines #11 MYSQL_BIN_LOG::commit #12 ha_commit_trans #13 trans_commit #14 Xid_log_event::do_commit #15 Xid_apply_log_event::do_apply_event_worker #16 Slave_worker::slave_worker_exec_event #17 slave_worker_exec_job_group #18 handle_slave_worker 14. If the applier thread continues without releasing the lock, then at a later stage, it can perform recursive locking while setting the GTID for the next transaction (in set_gtid_next()). In debug builds the above case hits the assertion `safe_mutex_assert_not_owner()` meaning the lock is already acquired by the replica applier thread when it tries to re-acquire the lock. Solution -------- In the above problematic example, when seen from each thread individually, we can conclude that there is no problem in the order of lock acquisition, thus there is no need to change the lock order. However, the root cause for this problem is that multiple threads can concurrently access to the array `Gtid_state::commit_group_sidnos`. In its initial implementation, it was expected that threads should hold the `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents. But it was not considered when upstream implemented WL#7846 (MTS: slave-preserve-commit-order when log-slave-updates/binlog is disabled). With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired when the client thread (binlog flush leader) when it tries to perform GTID update on behalf of threads waiting in "Commit Order" queue, thus providing a guarantee that `Gtid_state::commit_group_sidnos` array is never accessed without the protection of `MYSQL_BIN_LOG::LOCK_commit`.
[26 Jan 6:38]
MySQL Verification Team
Hello Venu, Thank you for the report and contribution. regards, Umesh
[30 Jan 17:57]
OCA Admin
Contribution submitted via Github - Bug #113727: GTID Replica stalls completely when log_slave_updates=0 and a local (*) Contribution by Venkatesh Prasad Venugopal (Github venkatesh-prasad-v, mysql-server/pull/514#issuecomment-1911548388): I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: git_patch_1696682477.txt (text/plain), 10.60 KiB.