Bug #115022 Using mysqldump leads to inconsistent GTIDs between source and replica.
Submitted: 16 May 2024 2:50    Modified: 16 May 2024 9:53
Reporter: Jinghua Lin          Email Updates:
Status: Verified               Impact on me: None
Category: MySQL Server: mysqldump Command-line Client
Severity: S4 (Feature request)
Version: 8.0                   OS: Any
Assigned to:                   CPU Architecture: Any

[16 May 2024 2:50] Jinghua Lin
Description:
In general, backups can be taken on either the source or a replica. Backup tools should therefore refrain from executing statements that write to the binlog, to avoid generating extra GTIDs on the replica.

However, I've noticed that with certain option combinations mysqldump executes "flush tables" instead of "flush local tables", which creates a new GTID. If the backup is taken on a replica, this leads to a discrepancy in GTIDs between the source and the replica. Only when the --master-data option is given does mysqldump execute "flush local tables". Relevant code:

static int do_flush_tables_read_lock(MYSQL *mysql_con) {
  /*
    We do first a FLUSH TABLES. If a long update is running, the FLUSH TABLES
    will wait but will not stall the whole mysqld, and when the long update is
    done the FLUSH TABLES WITH READ LOCK will start and succeed quickly. So,
    FLUSH TABLES is to lower the probability of a stage where both mysqldump
    and most client connections are stalled. Of course, if a second long
    update starts between the two FLUSHes, we have that bad stall.
  */
  return (mysql_query_with_error_report(
              mysql_con, nullptr,
              ((opt_master_data != 0) ? "FLUSH /*!40101 LOCAL */ TABLES"
                                      : "FLUSH TABLES")) ||
          mysql_query_with_error_report(mysql_con, nullptr,
                                        "FLUSH TABLES WITH READ LOCK"));
}

......

  if ((opt_lock_all_tables || opt_master_data ||
       (opt_single_transaction &&
        (flush_logs || server_with_gtids_and_opt_purge_not_off))) &&
      do_flush_tables_read_lock(mysql))
    goto err;

I don't quite understand why there are two different code paths; always using "flush local tables" seems like the best choice.
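The difference between the two statements can be observed directly. A sketch, assuming a MySQL 8.0 server with log_bin enabled and gtid_mode=ON:

```sql
-- Note the current executed set
SELECT @@global.gtid_executed;

-- Written to the binlog: allocates a new GTID on this server
FLUSH TABLES;
SELECT @@global.gtid_executed;   -- set has grown by one transaction

-- LOCAL (alias for NO_WRITE_TO_BINLOG): suppresses binlogging
FLUSH LOCAL TABLES;
SELECT @@global.gtid_executed;   -- unchanged
```

On a replica, the GTID allocated by the bare FLUSH TABLES belongs to the replica's own server UUID, so it never appears in the source's gtid_executed set.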

How to repeat:
1. Perform a backup on the replica, e.g.: mysqldump --single-transaction --set-gtid-purged=on --all-databases > backup.sql

2. Execute "show master status" on both servers and compare the GTID sets of the source and the replica.

3. Use mysqlbinlog to parse the transaction carrying the extra GTID; it reveals that a "flush tables" statement was executed.
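The steps above might look like this on a source/replica pair. A sketch only: the host names and the binlog file name are placeholders, and the GTID must be substituted with the one actually observed in step 2:

```shell
# 1. Back up on the replica (no --master-data, so a plain FLUSH TABLES is sent)
mysqldump --single-transaction --set-gtid-purged=on --all-databases \
    --host=replica-host > backup.sql

# 2. Compare executed GTID sets on both servers
mysql --host=source-host  -e "SHOW MASTER STATUS\G"
mysql --host=replica-host -e "SHOW MASTER STATUS\G"

# 3. Extract the transaction behind the extra GTID from the replica's binlog
#    (<server_uuid>:<n> and binlog.000001 are placeholders)
mysqlbinlog --include-gtids='<server_uuid>:<n>' binlog.000001
```

The dumped transaction should contain the "flush tables" statement issued by mysqldump.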
[16 May 2024 9:53] MySQL Verification Team
Hi Mr. Lin,

Thank you for your report.

We consider your report a fine feature request for environments with replication.

Verified as a feature request for versions 8.0 and higher.