| Bug #103943 | Semi-Replication Rotate And Purge Relaylog Leads To Increased Response | ||
|---|---|---|---|
| Submitted: | 8 Jun 2021 11:42 | Modified: | 24 Aug 2021 1:50 |
| Reporter: | chao gao | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: Replication | Severity: | S1 (Critical) |
| Version: | 5.7.33 | OS: | CentOS (7.6) |
| Assigned to: | CPU Architecture: | x86 (64) | |
[8 Jun 2021 11:49]
chao gao
maybe this reason? Replication: The value returned by a SHOW SLAVE STATUS statement for the total combined size of all existing relay log files (Relay_Log_Space) could become much larger than the actual disk space used by the relay log files. The I/O thread did not lock the variable while it updated the value, so the SQL thread could automatically delete a relay log file and write a reduced value before the I/O thread finished updating the value. The I/O thread then wrote its original size calculation, ignoring the SQL thread's update and so adding back the space for the deleted file. The Relay_Log_Space value is now locked during updates to prevent concurrent updates and ensure an accurate calculation. (Bug #26997096, Bug #87832) 5.7.24:rli->relay_log.harvest_bytes_written(&rli->log_space_total); 5.7.25:rli->relay_log.harvest_bytes_written(rli, true/*need_log_space_lock=true*/); when i modify 5.7.25 true to false the Response is ok!!!
[9 Jun 2021 13:21]
MySQL Verification Team
Hi, With 5.7.24 you don't have this problem? only with 5.7.25+ ? thanks
[10 Jun 2021 1:13]
chao gao
YES,5.7.24 is ok. This step is debug mysql5.7.32: 1.sql thread get the log_space_lock ...... 2.mysql_file_delete(key_file_binlog, log_info.log_file_name, MYF(0)) function delete the relay_log file ...... 3.sql thread release the log_space_lock io thread can't get the log_space_lock from 1 to 3 and can't response ack to master
[10 Jun 2021 1:15]
chao gao
5.7.24 is ok,this problem from 5.7.25 to newest
[10 Jun 2021 1:17]
chao gao
I suggest release lock first then delete the relay_log file
[21 Jun 2021 5:55]
MySQL Verification Team
Hi, I think this is not a bug and that this is done intentionally but I will verify this behavior and give a dev team a chance to properly explain what they changed and why, or fix it, if it is really a bug. Thanks for the report
[23 Jun 2021 6:10]
chao gao
Thanks for your reply and look forward to your final verification result.
[21 Aug 2021 1:44]
chao gao
&rli->log_space_lock is used to control the io/sql thread to update the relay_log_space_total,I think this lock holding range is too large, the holding lock can be divided into two stages, so as to avoid the useless lock when deleting the relaylog physical file:
source code(binlog.cc):
// rli->relay_log.purge_logs call MYSQL_BIN_LOG::purge_index_entry
1)
mysql_mutex_lock(&rli->log_space_lock);
rli->relay_log.purge_logs(to_purge_if_included, included,
false/*need_lock_index=false*/,
false/*need_update_threads=false*/,
&rli->log_space_total, true);
// Tell the I/O thread to take the relay_log_space_limit into account
rli->ignore_log_space_limit= 0;
mysql_mutex_unlock(&rli->log_space_lock);
2)
int MYSQL_BIN_LOG::purge_index_entry(THD *thd, ulonglong *decrease_log_space, bool need_lock_index)
{
......
mysql_file_delete(key_file_binlog, log_info.log_file_name, MYF(0))
......
if (decrease_log_space)
*decrease_log_space-= s.st_size; // relay_log_space_total
......
}
suggest fix:
1)
rli->relay_log.purge_logs(to_purge_if_included, included,
false/*need_lock_index=false*/,
false/*need_update_threads=false*/,
&rli->log_space_total, true);
// Tell the I/O thread to take the relay_log_space_limit into account
mysql_mutex_lock(&rli->log_space_lock);
rli->ignore_log_space_limit= 0;
mysql_mutex_unlock(&rli->log_space_lock);
2)
int MYSQL_BIN_LOG::purge_index_entry(THD *thd, ulonglong *decrease_log_space, bool need_lock_index)
{
......
mysql_file_delete(key_file_binlog, log_info.log_file_name, MYF(0)) // Avoid holding locks,1G relaylog and this log in os cache delete will use about 100~200ms,if delete multiple relaylogs, this will be fatal in a semi-synchronous scenario
......
mysql_mutex_lock(&rli->log_space_lock);
if (decrease_log_space)
*decrease_log_space-= s.st_size; // only hold the lock in relay_log_space_total update
mysql_mutex_unlock(&rli->log_space_lock);
......
}
[21 Aug 2021 1:48]
chao gao
I have signed the OCA, but I didn’t get a reply
[24 Aug 2021 1:50]
chao gao
Rotate Binlog will leads to Relaylog Rotate and purged,The reason of this problem is the long lock time during relaylog purged.
[15 Aug 2022 3:26]
shang canfang
I have encountered the same problem and hope to get an official reply.
[7 Oct 0:56]
MySQL Verification Team
Bug #119091 is marked as duplicate of this one

Description: replication type:M--->S(semi-sync) version:5.7.25-5.7.33 max_binlog_size:1G slave_parallel_type:logical_clock relay_log_purge:ON long_query_time:0.1 repl_semi_sync_master_wait_point:after_sync repl_semi_sync_master_wait_for_slave_count:1 repl_semi_sync_master_wait_no_slave:on repl_semi_sync_master_timeout=999999999 repl_semi_sync_master_enabled=on repl_semi_sync_slave_enabled=on wait_time(master wait slave ack ) will be greater than 200ms when the master rotate binlog. How to repeat: 1.modify code file:semisync_master.cc wait_time = getWaitTime(start_ts); sql_print_information("ack time is %llu", wait_time); /*add by gaochao*/ make && make install 2.run sysbench until master rotate binlog,Observe the sysbench response and error log、slow log