Bug #99282 Document details of auto-position protocol
Submitted: 16 Apr 2020 18:34
Modified: 5 May 2020 9:40
Reporter: Sven Sandberg
Status: Closed
Category: MySQL Server: Documentation
Severity: S3 (Non-critical)
Version: 5.7
OS: Any
Assigned to:
CPU Architecture: Any

[16 Apr 2020 18:34] Sven Sandberg
The GTID auto-position protocol is documented at https://dev.mysql.com/doc/refman/5.7/en/replication-gtids-auto-positioning.html

Part of the description is: "The master responds by sending all transactions recorded in its binary log whose GTID is not included in the GTID set sent by the slave."

In this context, it would be good to add details about exactly how the master finds transactions.

The procedure is:

 1. Find the correct binlog file to start with.
    The master's send thread does this by inspecting the Previous_gtids_log_event of each binary log in reverse order of creation, stopping at the first binary log whose Previous_gtids_log_event does not include any transaction that the slave is missing. Note that it reads only a small part of the header of each binary log file, so this is slow only if the slave is lagging by a *large number of binary log files*.

 2. Send events.
    It iterates from the beginning of the file it found in step 1. It reads all events from the binary log, but skips sending those that belong to a transaction whose GTID is included in the GTID set sent by the slave. The larger the offset of the first missing transaction, the longer it takes for the first event to arrive at the slave.
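
For illustration, the two steps can be sketched as a small Python model. This is not the server code: BinlogFile, find_start_index and transactions_to_send are hypothetical names, GTIDs are modelled as plain strings in Python sets, and each file carries only a previous_gtids set plus an ordered list of transactions. The error raised when no suitable file exists is also an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class BinlogFile:
    """Toy stand-in for one binary log file (hypothetical, not a server structure)."""
    name: str
    previous_gtids: frozenset  # GTIDs contained in all earlier binary log files
    transactions: list = field(default_factory=list)  # (gtid, payload), in log order


def find_start_index(binlogs, slave_gtids):
    """Step 1: scan the files newest-first, looking only at previous_gtids."""
    for index in range(len(binlogs) - 1, -1, -1):
        missing_before_file = binlogs[index].previous_gtids - slave_gtids
        if not missing_before_file:
            # Nothing the slave lacks lives in an earlier file, so start here.
            return index
    # Assumption of this sketch: signal the case where a needed file is gone.
    raise LookupError("a transaction the slave needs is not in any available file")


def transactions_to_send(binlogs, slave_gtids):
    """Step 2: read from the start of the chosen file, skipping known GTIDs."""
    start = find_start_index(binlogs, slave_gtids)
    for binlog in binlogs[start:]:
        for gtid, payload in binlog.transactions:
            if gtid not in slave_gtids:
                yield binlog.name, gtid, payload


# Example: three files, oldest first; the slave already has uuid:1 to uuid:3.
binlogs = [
    BinlogFile("binlog.000001", frozenset(),
               [("uuid:1", "a"), ("uuid:2", "b")]),
    BinlogFile("binlog.000002", frozenset({"uuid:1", "uuid:2"}),
               [("uuid:3", "c"), ("uuid:4", "d")]),
    BinlogFile("binlog.000003",
               frozenset({"uuid:1", "uuid:2", "uuid:3", "uuid:4"}),
               [("uuid:5", "e")]),
]
slave_gtids = {"uuid:1", "uuid:2", "uuid:3"}
sent = list(transactions_to_send(binlogs, slave_gtids))
```

In this example the scan rejects binlog.000003 (its previous set contains uuid:4, which the slave lacks) and settles on binlog.000002. The model then reads uuid:3 but does not send it, which is the skipped work that delays the first event when the first missing transaction sits at a large offset.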

The same holds in 5.6, 5.7, and 8.0. This part of the manual exists only in 5.7 and 8.0, so I suggest adding it there.

How to repeat:

Suggested fix:
[5 May 2020 9:40] Margaret Fisher
Posted by developer:
Thanks Sven! Added this in 8.0 and 5.7:

To do this, the master first identifies the appropriate binary log file to begin working with, by checking the Previous_gtids_log_event in the header of each of its binary log files, starting with the most recent. When the master finds the first Previous_gtids_log_event which contains no transactions that the slave is missing, it begins with that binary log file. This method is efficient and only takes a significant amount of time if the slave is behind the master by a large number of binary log files. The master then reads the transactions in that binary log file and subsequent files up to the current one, sending the transactions with GTIDs that the slave is missing, and skipping the transactions that were in the GTID set sent by the slave. The elapsed time until the slave receives the first missing transaction depends on its offset in the binary log file.