Bug #104436 the logic of process_io_rotate not consistent to it's comment
Submitted: 28 Jul 2021 7:46 Modified: 31 Jul 2021 11:06
Reporter: Xiaodong Huang (OCA) Email Updates:
Status: Verified Impact on me:
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.6.51,5.7.35 OS:Any
Assigned to: CPU Architecture:Any

[28 Jul 2021 7:46] Xiaodong Huang
This is a sister bug for #72376 and #72720 - same problem, but we find different reason from the bugfix "Bug 23532304 - SECONDS_BEHIND_MASTER REPORTS IO_THREAD TIME WHEN USING MTS & ROTATING RELAY-LOG"

The relay log show that the fake Rotate_Log_Event has timestamp (210727 15:50:53) of when binlog IO thread has started. 
    DELIMITER /*!*/;
    # at 4
    #210727 15:50:53 server id 2  end_log_pos 120 CRC32 0xa05acb3d 	Start: binlog v 4, server v 5.6.28-cdb2016-debug-log created 210727 15:50:53
    # This Format_description_event appears in a relay log and was generated by the slave thread.
    # at 120
    #210727 15:50:53 server id 1  end_log_pos 0 CRC32 0x079e505c 	Rotate to master-bin.000002  pos: 4
    # at 168
    #210727 15:50:56 server id 1  end_log_pos 120 CRC32 0x1cbdaf48 	Start: binlog v 4, server v 5.6.28-cdb2016-debug-log created 210727 15:50:56
  But in the function "fake_rotate_event", we can see a comment shown as following. It has set the timestamp of the fake rotate event to 0 for distinguishing between real and fake Rotate events.
      'when' (the timestamp) is set to 0 so that slave could distinguish between
      real and fake Rotate events (if necessary)
    int4store(header, 0);
   The timestamp of the fake Rotate_Log_Event should show as unix timestamp(0) if everything went as our expection. But in above relay log content, the timestamp of fake event is "210727 15:50:53". So the timestamp may be changed in the replication node. We will use two steps to illustrate the cause of this problem
    Step1: In the function "process_io_rotate" has following comment in which the logic of code under the comment is'n consistent with the content of the comment. When replication receive a real rotate log event, it invoke  "process_io_rotate" in the "queue_event" and the mi->get_mi_description_event()->binlog_version is set to 3. 
      If mi_description_event is format <4, there is conversion in the
      relay log to the slave's format (4). And Rotate can mean upgrade or
      nothing. If upgrade, it's to 5.0 or newer, so we will get a Format_desc, so
      no need to reset mi_description_event now. And if it's nothing (same
      master version as before), no need (still using the slave's format).
    Format_description_log_event *old_fdle= mi->get_mi_description_event();
    if (old_fdle->binlog_version >= 4)
      assert(old_fdle->common_footer->checksum_alg ==
      Format_description_log_event *new_fdle= new
  step2: Then the replication rotate a new relay log file in which the slave will receive a fake Rotate_Log_Event from source and the the mi->get_mi_description_event()->binlog_version is 3. So the replication will invoke "queue_old_event" in the "queue_event". In this porcess´╝îit will change the timestamp of the fake Rotate_Log_Event and the stack is shown as following.
      Log_event::write_header // change the timestamp of fake rotate log event
  Due to receive the fake rotate new file from source, the replication rotate a new relay log file. it will receive a Format_Description_Event from next new binlog log file in the source, then the mi->get_mi_description_event()->binlog_version is reset to 4. As a result, the binlog_version is changed from 4 to 3 only affect the Rotate_Log_Event.
  It is known to everyone that the high version should compatible with the low binlog_version in software development. So it may be a bug that the logic of code snippet in "process_io_rotate" downgrade the binlog version from 4 to 3 here.
  In mysql-8.0.26, the related logic above has deleted and the version don't exist this problem. In mysql-5.7.35 and mysql-5.6.51 still exist this problem.

How to repeat:
Check the logic of "process_io_rotate"

Suggested fix:
The "process_io_rotate" code logic should consistent to the front comment.
[31 Jul 2021 11:06] MySQL Verification Team
Hello Xiaodong,

Thank you for the report and feedback.