Bug #108104 LOCK_status affects transaction performance when running a "show status" command
Submitted: 10 Aug 2022 3:06
Modified: 10 Aug 2022 12:39
Reporter: shang canfang (OCA)
Status: Not a Bug
Category: MySQL Server: Replication
Severity: S5 (Performance)
Version: MySQL 5.7.26
OS: Any
Assigned to:
CPU Architecture: Any

[10 Aug 2022 3:06] shang canfang
Description:
In the order_commit function, after the flush stage, the server updates the global status and must acquire LOCK_status to do so, but the "show status" command also holds this lock. An exporter collects information by running "show status" periodically, so this causes performance jitter when there are thousands of THDs.

void MYSQL_BIN_LOG::publish_coordinates_for_global_status(void) const
{
  mysql_mutex_assert_owner(&LOCK_log);

  mysql_mutex_lock(&LOCK_status);
  strcpy(binlog_global_snapshot_file, log_file_name);
  binlog_global_snapshot_position=
      my_b_inited(&log_file) ? my_b_tell(&log_file) : 0;
  mysql_mutex_unlock(&LOCK_status);
}

"show status" scans all THDs while holding LOCK_status:
void calc_sum_of_all_status(STATUS_VAR *to)
{
  DBUG_ENTER("calc_sum_of_all_status");
  mysql_mutex_assert_owner(&LOCK_status);
  /* Get global values as base. */
  *to= global_status_var;
  Add_status add_status(to);
  Global_THD_manager::get_instance()->do_for_all_thd_copy(&add_status);
  DBUG_VOID_RETURN;
}
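
As a rough illustration of why the scan matters, here is a minimal standalone sketch (not the server code) of a status reader that sums per-session counters under a single mutex. StatusVar, StatusRegistry and sum_all_status are hypothetical names invented for this example, and unlike calc_sum_of_all_status the sketch takes the lock itself. The point is only that the time the lock is held grows linearly with the number of sessions.

```cpp
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical, heavily simplified stand-in for STATUS_VAR.
struct StatusVar {
  std::uint64_t com_insert = 0;
  std::uint64_t com_update = 0;
};

// Hypothetical container; lock_status plays the role of LOCK_status.
struct StatusRegistry {
  std::mutex lock_status;
  StatusVar global;                   // counters accumulated from finished sessions
  std::vector<StatusVar> per_thread;  // one entry per live session
};

// Sums the global counters and every session's counters.
// The mutex is held for the whole loop, so the critical section
// is O(number of sessions).
StatusVar sum_all_status(StatusRegistry &r) {
  std::lock_guard<std::mutex> guard(r.lock_status);
  StatusVar total = r.global;  // start from the global values
  for (const StatusVar &t : r.per_thread) {
    total.com_insert += t.com_insert;
    total.com_update += t.com_update;
  }
  return total;
}
```

Any writer that needs the same mutex, such as the commit path above, has to wait until this loop finishes.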

How to repeat:

Create thousands of THDs, then execute the "show status" command periodically while running an insert/update test.

Suggested fix:
I think the status update can be delayed: in the commit path, use try_lock instead of lock, and skip the update if the lock is busy.

void MYSQL_BIN_LOG::publish_coordinates_for_global_status(void) const
{
  mysql_mutex_assert_owner(&LOCK_log);

  /* trylock returns 0 on success; skip the update if the lock is busy. */
  if (!mysql_mutex_trylock(&LOCK_status)) {
    strcpy(binlog_global_snapshot_file, log_file_name);
    binlog_global_snapshot_position=
        my_b_inited(&log_file) ? my_b_tell(&log_file) : 0;
    mysql_mutex_unlock(&LOCK_status);
  }
}
[10 Aug 2022 8:15] shang canfang
Sorry, a correction: the command is "show global status", not "show status".
[10 Aug 2022 12:39] MySQL Verification Team
Hi Mr. ye,

Thank you for your bug report.

However, this is not a bug.

If our code needs to change status, it must first acquire LOCK_status. Multiple threads access the same part of the code, which means that protection is mandatory. Since this is about ever-changing binary log coordinates, `trylock` is simply not good enough.
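
To make the concern concrete, here is a minimal standalone sketch (standard C++, not the server code) of what a try-lock publish does when a status reader holds the lock: the commit returns without publishing, so the snapshot keeps the previous coordinates. StatusSnapshot, publish_blocking, publish_try and commit_during_status_scan are hypothetical names made up for this illustration.

```cpp
#include <cstdint>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for LOCK_status plus the published binlog coordinates.
struct StatusSnapshot {
  std::mutex lock_status;
  std::string file;
  std::uint64_t position = 0;
};

// Blocking variant: always publishes, but may wait for a reader.
void publish_blocking(StatusSnapshot &s, const std::string &file,
                      std::uint64_t pos) {
  std::lock_guard<std::mutex> guard(s.lock_status);
  s.file = file;
  s.position = pos;
}

// try-lock variant: never waits, but publishes nothing when the lock is busy.
bool publish_try(StatusSnapshot &s, const std::string &file,
                 std::uint64_t pos) {
  std::unique_lock<std::mutex> guard(s.lock_status, std::try_to_lock);
  if (!guard.owns_lock()) return false;  // update silently skipped
  s.file = file;
  s.position = pos;
  return true;
}

// Simulates a commit that arrives while a status reader holds the lock.
// Returns whether the commit managed to publish its coordinates.
bool commit_during_status_scan(StatusSnapshot &s, const std::string &file,
                               std::uint64_t pos) {
  std::unique_lock<std::mutex> reader(s.lock_status);  // reader holds the lock
  bool published = true;
  // The committer runs on its own thread, because calling try_lock on a
  // std::mutex the calling thread already owns is undefined behavior.
  std::thread committer([&] { published = publish_try(s, file, pos); });
  committer.join();
  return published;
}
```

In this sketch, a commit that loses the race leaves the snapshot stale until some later commit happens to win the try-lock, and on an idle server that later commit may never arrive.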

Also, you are analysing code from a very old release. It is better to look at 5.7.39 and 8.0.30.

Not a bug.