Bug #89247 | Deadlock with MTS when slave_preserve_commit_order = ON. | | |
---|---|---|---|
Submitted: | 15 Jan 2018 14:39 | Modified: | 1 Sep 2021 13:10 |
Reporter: | Jean-François Gagné | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 8.0.3, 5.7.20 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
[15 Jan 2018 14:39]
Jean-François Gagné
SHOW SLAVE STATUS
Attachment: sss.txt (text/plain), 2.80 KiB.
[15 Jan 2018 14:39]
Jean-François Gagné
SHOW FULL PROCESSLIST
Attachment: sfp.txt (text/plain), 7.19 KiB.
[15 Jan 2018 17:23]
Jean-François Gagné
I reproduced the problem on 5.7.20 (replicating from 8.0.3).
[16 Jan 2018 9:13]
Jean-François Gagné
The error message is saying: "Consider raising the value of the slave_transaction_retries variable", so I tried. With 32 workers and slave_transaction_retries = 50, there is no deadlock. However, the fact that there is a deadlock (no possibility to stop/start the slave) with 10 retries should still be addressed.
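For reference, the retry-based mitigation described above can be applied at runtime (the value 50 is the one reported to work in this test; this works around the hang but does not fix the underlying bug):

```sql
-- Raise the retry budget so a worker that is asked to roll back by the
-- InnoDB deadlock detector retries the transaction instead of exhausting
-- its retries and wedging the commit-order queue.
SET GLOBAL slave_transaction_retries = 50;
```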
[19 Jan 2018 15:12]
MySQL Verification Team
Hi, I managed to reproduce this one, but it's not easy on a slow system. It looks like the faster your system is, the easier this is to reproduce. I had to lower the retry number to reproduce it faster; with a higher retry number I can't reproduce it. Verifying it, thanks for the report. Bogdan
[19 Jan 2018 15:14]
MySQL Verification Team
I don't really believe this should be S2, since increasing the retry number works around it, but I'll leave it as is and let the replication team decide :) All best, Bogdan
[19 Jan 2018 15:36]
Jean-François Gagné
Great that you were able to reproduce it, Bogdan; I have to say that I was not expecting this to be verified so quickly: you rock! About S2: because the way to get out of this is a "kill -9", I thought it deserved more than S3. Thanks for your work on this, JFG
[19 Jan 2018 22:13]
MySQL Verification Team
Hi, I reproduced it rather quickly and easily (on the second try); maybe I was just lucky before the long weekend :D ... let's see how it will go again on Tuesday :) All best, Bogdan
[21 Aug 2018 7:31]
Guangpu Feng
Verified on 5.7.22 with the following in my.cnf:

slave_parallel_type = LOGICAL_CLOCK
slave_parallel_workers = 16
slave_preserve_commit_order = 1
[21 Aug 2018 9:07]
Guangpu Feng
I encountered this problem in MGR of version 5.7.22: mysqld hangs whenever I issue 'stop group_replication' or 'mysqladmin shutdown'. The only workaround is to 'kill -9' the server and then issue 'set global slave_parallel_workers=1' before 'start group_replication'.

---- my.cnf ----
slave_parallel_type = LOGICAL_CLOCK
slave_parallel_workers = 16
slave_preserve_commit_order = 1
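The recovery procedure above, sketched as the statements involved (run after forcing the hung mysqld down with 'kill -9' and restarting it):

```sql
-- With a single applier worker the commit-order deadlock cannot occur,
-- so group replication can be restarted safely.
SET GLOBAL slave_parallel_workers = 1;
START GROUP_REPLICATION;
```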
[21 Aug 2018 16:19]
Guangpu Feng
pt-pmp info:

10 pthread_cond_wait,os_event_wait_low,os_aio_handler,fil_aio_wait,io_handler_thread,start_thread(libpthread.so.0),clone(libc.so.6)
10 pthread_cond_wait,Commit_order_manager::wait_for_its_turn,MYSQL_BIN_LOG::ordered_commit,MYSQL_BIN_LOG::commit,ha_commit_trans,trans_commit,Xid_log_event::do_commit,Xid_apply_log_event::do_apply_event_worker,slave_worker_exec_job_group,handle_slave_worker,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
3 pthread_cond_wait,Per_thread_connection_handler::block_until_new_connection,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
3 pthread_cond_wait,os_event_wait_low,srv_worker_thread,start_thread(libpthread.so.0),clone(libc.so.6)
3 pthread_cond_wait,os_event_wait_low,lock_wait_suspend_thread,row_mysql_handle_errors,::??,row_update_for_mysql,ha_innobase::update_row,handler::ha_update_row,Update_rows_log_event::do_exec_row,Rows_log_event::do_apply_row,Rows_log_event::do_index_scan_and_update,Rows_log_event::do_apply_event,Slave_worker::read_and_apply_events,Slave_worker::retry_transaction,slave_worker_exec_job_group,handle_slave_worker,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
3 pthread_cond_wait,os_event_wait_low,lock_wait_suspend_thread,row_mysql_handle_errors,::??,row_update_for_mysql,ha_innobase::update_row,handler::ha_update_row,Update_rows_log_event::do_exec_row,Rows_log_event::do_apply_row,Rows_log_event::do_index_scan_and_update,Rows_log_event::do_apply_event,slave_worker_exec_job_group,handle_slave_worker,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
3 pthread_cond_wait,os_event_wait_low,buf_flush_page_cleaner_worker,start_thread(libpthread.so.0),clone(libc.so.6)
1 sigwait(libpthread.so.0),signal_hand,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 sigwaitinfo(libc.so.6),::??,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 sched_yield(libc.so.6),Mts_submode_logical_clock::get_least_occupied_worker,Log_event::get_slave_worker,Log_event::apply_event,apply_event_and_update_pos,handle_slave_sql,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_wait,os_event_wait_low,srv_purge_coordinator_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_wait,os_event_wait_low,buf_resize_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_wait,os_event_wait_low,buf_dump_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_wait,Gcs_xcom_engine::process(group_replication.so),process_notification_thread(group_replication.so),start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_wait,compress_gtid_table,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_wait,Applier_module::applier_thread_handle(group_replication.so),group_replication::??(group_replication.so),pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,srv_monitor_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,srv_error_monitor_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,lock_wait_timeout_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,ib_wqueue_timedwait,fts_optimize_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,dict_stats_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,buf_flush_page_cleaner_coordinator,start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_timedwait,Certifier_broadcast_thread::dispatcher(group_replication.so),group_replication::??(group_replication.so),pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
1 poll(libc.so.6),Mysqld_socket_listener::listen_for_connection_event,mysqld_main,__libc_start_main(libc.so.6),_start
1 poll(libc.so.6),group_replication::??(group_replication.so),task_loop(group_replication.so),xcom_taskmain2(group_replication.so),Gcs_xcom_proxy_impl::xcom_init(group_replication.so),group_replication::??(group_replication.so),start_thread(libpthread.so.0),clone(libc.so.6)
1 nanosleep(libpthread.so.0),os_thread_sleep,srv_master_thread,start_thread(libpthread.so.0),clone(libc.so.6)
[9 Feb 2020 8:26]
Simon Mudd
Also seen in 5.7.28.
[30 Apr 2020 12:32]
Simon Mudd
Also seen in 5.7.29.
[24 Sep 2020 7:48]
Venkatesh Prasad Venugopal
Looks like all the bugs below are duplicates of each other:

Bug #87796 Commit_order_manager can't terminate MTS worker properly when deadlock happens
Bug #89247 Deadlock with MTS when slave_preserve_commit_order = ON.
Bug #95249 stop slave permanently blocked
Bug #99440 Threads of MTS Slave randomly stuck

Here is the root cause for the bug.

When --slave-preserve-commit-order is enabled on the slave and the waiting thread has locked rows that are needed by the thread executing the previous transaction (as per their order in the relay log), the InnoDB deadlock detection algorithm detects the deadlock between the workers and asks the waiting thread to roll back (only if its sequence number is lower than that of the waiting thread). When this happens, the waiting thread wakes up from the cond_wait (SPCO) and learns that it was asked to roll back by its preceding transaction because it was holding a lock that the other transaction needs in order to progress. It then rolls back its transaction so that the preceding transaction can commit, and retries the transaction.

This logic sometimes caused the worker thread to miss signals, resulting in the replica server hanging. One such hang is described below.

Consider a replica server configured with slave_parallel_workers=3, slave_parallel_type=LOGICAL_CLOCK, slave_preserve_commit_order=1 and slave_transaction_retries=0. When MTS is enabled, it is quite possible that workers execute out of order, leading to the state below.

Worker 1 - Processing the events of Transaction T1
Worker 2 - Executed Transaction T2 and is waiting for T1 to commit.
Worker 3 - Processing the events of Transaction T3

1. If T1 and T2 are modifying the same rows in InnoDB, then Worker 1 detects the deadlock and asks Worker 2 to roll back by signalling it.
2. Worker 2 wakes up from the cond_wait. It learns that it was asked to roll back by the other transaction and returns with an error.
3. Worker 2 rolls back the transaction, reaches the retry part of the code and checks the value of slave_transaction_retries. Since it is 0, it returns from the handle_slave_worker loop and enters the error-handling part.
4. As part of error handling, Worker 2 notifies the coordinator that it is exiting, and then calls the report_rollback() function to unregister itself from the SPCO queue.
5. While executing report_rollback(), Worker 2 again enters wait_for_its_turn(). But before entering the wait, it checks the commit_order_deadlock flag. Since the flag is already set, Worker 2 immediately returns from the function with an error.
6. The coordinator thread gets this information, sets rli->abort_slave=1 to stop replication, and waits until all workers exit.
7. Worker 2 exits. There is no Worker 2 from here onwards. Now the status is:
Worker 1 - Processing the events of Transaction T1
Worker 2 - Not running.
Worker 3 - Processing the events of Transaction T3
8. Worker 1 proceeds, executes the transaction and enters Commit_order_manager::wait_for_its_turn.
9. Worker 1 finds out that the previous worker (Worker 2) failed because of an error.
10. Worker 1 signals the next transaction/worker to proceed.
11. Worker 3 executes the transaction and enters Commit_order_manager::wait_for_its_turn.
12. Worker 1 rolls back the transaction and eventually exits.
13. There is no one left to signal Worker 3, so it waits forever.

This results in a system hang: the coordinator thread waits for the worker threads to finish, while the worker thread waits for the signal to proceed with the commit.
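The missed-signal sequence above can be sketched with a small thread model. This is an illustrative abstraction, not MySQL code: the commit-order queue is modelled as a shared sequence counter plus a condition variable, and the "deadlocked" worker (Worker 2 with slave_transaction_retries=0) exits without passing its turn to its successor, so the successor's wait never completes (a timeout stands in for the real server's indefinite hang).

```python
import threading

# Sketch of the SPCO commit-order queue: workers must commit in
# sequence-number order; `next_seq` holds the sequence whose turn it is.
lock = threading.Lock()
cond = threading.Condition(lock)
next_seq = 1
timed_out = []  # sequence numbers whose wait never got signalled


def commit_in_order(seq, fail=False, timeout=1.0):
    """Wait for our turn in commit order. If `fail` is True, model the
    bug: the worker was asked to roll back and exits WITHOUT advancing
    next_seq or notifying, leaving its successor waiting forever."""
    global next_seq
    with cond:
        while next_seq != seq:
            if not cond.wait(timeout=timeout):
                # In the real server this wait never returns; the
                # timeout here only makes the hang observable.
                timed_out.append(seq)
                return
        if fail:
            return  # bug: leaves the queue without signalling seq+1
        next_seq += 1
        cond.notify_all()


# Worker 1 commits T1; Worker 2 "rolls back" T2 and exits (retries=0);
# Worker 3 waits for T2's slot to be released -- and never is signalled.
t1 = threading.Thread(target=commit_in_order, args=(1,))
t2 = threading.Thread(target=commit_in_order, args=(2, True))
t3 = threading.Thread(target=commit_in_order, args=(3,))
for t in (t3, t2, t1):
    t.start()
for t in (t1, t2, t3):
    t.join()

print(timed_out)  # Worker 3's wait timed out: [3]
```

The fix in the contributed patches is, in essence, to make the rolled-back worker hand its commit-order slot to the next worker before leaving the queue, so the chain of signals is never broken.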
mysql> show processlist;
+----+-------------+-----------------+------+---------+------+---------------------------------------------+------------------+-----------+---------------+
| Id | User        | Host            | db   | Command | Time | State                                       | Info             | Rows_sent | Rows_examined |
+----+-------------+-----------------+------+---------+------+---------------------------------------------+------------------+-----------+---------------+
|  2 | root        | localhost:55708 | test | Query   |    0 | starting                                    | show processlist |         0 |             0 |
|  3 | system user |                 | NULL | Connect |  107 | Waiting for master to send event            | NULL             |         0 |             0 |
|  4 | system user |                 | NULL | Connect |   77 | Waiting for workers to exit                 | NULL             |         0 |             0 |
|  7 | system user |                 | NULL | Connect |   84 | Waiting for preceding transaction to commit | NULL             |         0 |             0 |
+----+-------------+-----------------+------+---------+------+---------------------------------------------+------------------+-----------+---------------+
[24 Sep 2020 7:54]
Venkatesh Prasad Venugopal
Contributed patch on top of 5.7.31 (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: 0001-MTS-deadlock-with-slave_preserve_commit_order.patch (text/x-patch), 37.06 KiB.
[29 Sep 2020 5:20]
Venkatesh Prasad Venugopal
Contributed patch on top of 8.0.21 (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: 0001-MTS-deadlock-with-slave_preserve_commit_order-8.0.patch (text/x-patch), 37.04 KiB.
[1 Sep 2021 13:10]
MySQL Verification Team
The internal bug has been closed as a duplicate of Bug #87796, thus closing this one as a duplicate of Bug #87796 as well. Regards, Umesh