Bug #103943 | Semi-Replication Rotate And Purge Relaylog Leads To Increased Response | ||
---|---|---|---|
Submitted: | 8 Jun 2021 11:42 | Modified: | 24 Aug 2021 1:50 |
Reporter: | chao gao | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S1 (Critical) |
Version: | 5.7.33 | OS: | CentOS (7.6) |
Assigned to: | CPU Architecture: | x86 (64) |
[8 Jun 2021 11:42]
chao gao
[8 Jun 2021 11:49]
chao gao
maybe this reason? Replication: The value returned by a SHOW SLAVE STATUS statement for the total combined size of all existing relay log files (Relay_Log_Space) could become much larger than the actual disk space used by the relay log files. The I/O thread did not lock the variable while it updated the value, so the SQL thread could automatically delete a relay log file and write a reduced value before the I/O thread finished updating the value. The I/O thread then wrote its original size calculation, ignoring the SQL thread's update and so adding back the space for the deleted file. The Relay_Log_Space value is now locked during updates to prevent concurrent updates and ensure an accurate calculation. (Bug #26997096, Bug #87832) 5.7.24:rli->relay_log.harvest_bytes_written(&rli->log_space_total); 5.7.25:rli->relay_log.harvest_bytes_written(rli, true/*need_log_space_lock=true*/); when i modify 5.7.25 true to false the Response is ok!!!
[9 Jun 2021 13:21]
MySQL Verification Team
Hi, With 5.7.24 you don't have this problem? only with 5.7.25+ ? thanks
[10 Jun 2021 1:13]
chao gao
YES,5.7.24 is ok. This step is debug mysql5.7.32: 1.sql thread get the log_space_lock ...... 2.mysql_file_delete(key_file_binlog, log_info.log_file_name, MYF(0)) function delete the relay_log file ...... 3.sql thread release the log_space_lock io thread can't get the log_space_lock from 1 to 3 and can't response ack to master
[10 Jun 2021 1:15]
chao gao
5.7.24 is ok,this problem from 5.7.25 to newest
[10 Jun 2021 1:17]
chao gao
I suggest release lock first then delete the relay_log file
[21 Jun 2021 5:55]
MySQL Verification Team
Hi, I think this is not a bug and that this is done intentionally but I will verify this behavior and give a dev team a chance to properly explain what they changed and why, or fix it, if it is really a bug. Thanks for the report
[23 Jun 2021 6:10]
chao gao
Thanks for your reply and look forward to your final verification result.
[21 Aug 2021 1:44]
chao gao
&rli->log_space_lock is used to control the io/sql thread to update the relay_log_space_total,I think this lock holding range is too large, the holding lock can be divided into two stages, so as to avoid the useless lock when deleting the relaylog physical file: source code(binlog.cc): // rli->relay_log.purge_logs call MYSQL_BIN_LOG::purge_index_entry 1) mysql_mutex_lock(&rli->log_space_lock); rli->relay_log.purge_logs(to_purge_if_included, included, false/*need_lock_index=false*/, false/*need_update_threads=false*/, &rli->log_space_total, true); // Tell the I/O thread to take the relay_log_space_limit into account rli->ignore_log_space_limit= 0; mysql_mutex_unlock(&rli->log_space_lock); 2) int MYSQL_BIN_LOG::purge_index_entry(THD *thd, ulonglong *decrease_log_space, bool need_lock_index) { ...... mysql_file_delete(key_file_binlog, log_info.log_file_name, MYF(0)) ...... if (decrease_log_space) *decrease_log_space-= s.st_size; // relay_log_space_total ...... } suggest fix: 1) rli->relay_log.purge_logs(to_purge_if_included, included, false/*need_lock_index=false*/, false/*need_update_threads=false*/, &rli->log_space_total, true); // Tell the I/O thread to take the relay_log_space_limit into account mysql_mutex_lock(&rli->log_space_lock); rli->ignore_log_space_limit= 0; mysql_mutex_unlock(&rli->log_space_lock); 2) int MYSQL_BIN_LOG::purge_index_entry(THD *thd, ulonglong *decrease_log_space, bool need_lock_index) { ...... mysql_file_delete(key_file_binlog, log_info.log_file_name, MYF(0)) // Avoid holding locks,1G relaylog and this log in os cache delete will use about 100~200ms,if delete multiple relaylogs, this will be fatal in a semi-synchronous scenario ...... mysql_mutex_lock(&rli->log_space_lock); if (decrease_log_space) *decrease_log_space-= s.st_size; // only hold the lock in relay_log_space_total update mysql_mutex_unlock(&rli->log_space_lock); ...... }
[21 Aug 2021 1:48]
chao gao
I have signed the OCA, but I didn’t get a reply
[24 Aug 2021 1:50]
chao gao
Rotate Binlog will leads to Relaylog Rotate and purged,The reason of this problem is the long lock time during relaylog purged.
[15 Aug 2022 3:26]
shang canfang
I have encountered the same problem and hope to get an official reply.