Bug #112727 MySQL may fail to handle an error when flushing thread caches to the binary log
Submitted: 15 Oct 2023 4:17
Modified: 19 Oct 2023 11:10
Reporter: Yawei Sun
Status: Not a Bug
Category: MySQL Server: Replication
Severity: S3 (Non-critical)
Version: MySQL 8.0.22, 8.0.23
OS: Any
Assigned to: MySQL Verification Team
CPU Architecture: Any

[15 Oct 2023 4:17] Yawei Sun
Description:
In the loop of the function MYSQL_BIN_LOG::process_flush_stage_queue (sql/binlog.cc), the value of flush_error is changed only while flush_error == 1. The new value is the result returned by the function flush_thread_caches(), which returns an error code on error and zero if there is no error.
So there is a scenario in which the first iterations of the loop in MYSQL_BIN_LOG::process_flush_stage_queue flush the caches successfully, so flush_error is set to 0. If a later iteration then fails to flush, flush_error == 0 at that point, so that error is not recorded and nobody knows that an error occurred while flushing the caches.

The relevant call chain is as follows:

MYSQL_BIN_LOG::ordered_commit
	--> MYSQL_BIN_LOG::process_flush_stage_queue
		--> MYSQL_BIN_LOG::flush_thread_caches
			--> binlog_cache_mngr::flush

The relevant code in MYSQL_BIN_LOG::process_flush_stage_queue is as follows:
  for (THD *head = first_seen; head; head = head->next_to_commit) {
    Thd_backup_and_restore switch_thd(current_thd, head);
    std::pair<int, my_off_t> result = flush_thread_caches(head);
    total_bytes += result.second;
    if (flush_error == 1) flush_error = result.first;
#ifndef NDEBUG
    no_flushes++;
#endif
  }
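
The bookkeeping in this loop can be modelled in isolation. The following is a minimal sketch, not the server code: simulate_flush_error and FlushResult are hypothetical names, the per-thread results are supplied by the caller instead of coming from flush_thread_caches(), and the starting value flush_error = 1 is an assumption, because the initialisation is not part of the fragment quoted above. It models only the flush_error variable and says nothing about any other place where a failure may be recorded.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for the (error code, bytes written) pair that
// flush_thread_caches() returns for one queued thread.
using FlushResult = std::pair<int, unsigned long long>;

// Models only the flush_error bookkeeping of the quoted loop.
// Assumption: flush_error starts at 1 (the initial value is not shown
// in the quoted fragment).
int simulate_flush_error(const std::vector<FlushResult> &results) {
  int flush_error = 1;
  unsigned long long total_bytes = 0;
  for (const FlushResult &result : results) {
    total_bytes += result.second;
    // Same condition as in the report: overwrite only while the value is 1.
    if (flush_error == 1) flush_error = result.first;
  }
  (void)total_bytes;  // tracked for fidelity with the loop, unused here
  return flush_error;
}
```

Under these assumptions, the input {{0, 100}, {5, 0}} (a success followed by a failure with code 5) leaves flush_error at 0, while {{5, 0}, {0, 100}} leaves it at 5.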

How to repeat:
This issue was found by code inspection; I have not yet created a scenario to reproduce it.
[19 Oct 2023 11:10] MySQL Verification Team
Hi Mr. Sun,

Thank you for your bug report.

However, your analysis is wrong.

Simply, if you analyse the entire function, you will notice that flush_error can be changed ONLY once regardless of the number of passes in the loop.

Hence, this is not a bug.