Bug #92526 MAKE GR REPLICATION CHANNELS ROTATE RELAY LOG ON FLUSH LOGS
Submitted: 21 Sep 2018 10:29 Modified: 26 Mar 11:38
Reporter: João Gramacho
Status: Closed
Category: MySQL Server: Replication Severity: S3 (Non-critical)
Version: 8.0.14 OS: Any
Assigned to: CPU Architecture: Any

[21 Sep 2018 10:29] João Gramacho
Description:
The flush_relay_logs_cmd() function contains the following hard-coded restriction:

      /*
        Disallow flush on Group Replication applier channel to avoid
        split transactions among relay log files due to DBA action.
      */
      if (channel_map.is_group_replication_channel_name(lex->mi.channel,
                                                        true)) {
        if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL ||
            thd->system_thread == SYSTEM_THREAD_SLAVE_WORKER) {
          /*
            Log warning on SQL or worker threads.
          */
          LogErr(WARNING_LEVEL, ER_RPL_SLAVE_FLUSH_RELAY_LOGS_NOT_ALLOWED,
                 lex->mi.channel);
        } else {
          /*
            Return error on client sessions.
          */
          error = true;
          my_error(ER_SLAVE_CHANNEL_OPERATION_NOT_ALLOWED, MYF(0),
                   "FLUSH RELAY LOGS", lex->mi.channel);
        }
      } else
        error = flush_relay_logs(mi);

This restriction makes the GR MTS gap recovery process simpler. Because no transaction is split across relay log files and GR requires GTID_MODE=ON, the server does not need to recover from MTS gaps and can simply point to the beginning of the first available relay log file.

In mts_recovery_groups() we have:

  /*
    Parallel applier recovery is based on master log name and
    position, on Group Replication we have several masters what
    makes impossible to recover parallel applier from that information.
    Since we always have GTID_MODE=ON on Group Replication, we can
    ignore the positions completely, seek the current relay log to the
    beginning and start from there. Already applied transactions will be
    skipped due to GTIDs auto skip feature and applier will resume from
    the last applied transaction.
  */
  if (channel_map.is_group_replication_channel_name(rli->get_channel(), true)) {
    rli->recovery_parallel_workers = 0;
    rli->mts_recovery_group_cnt = 0;
    rli->set_group_relay_log_pos(BIN_LOG_HEADER_SIZE);
    DBUG_RETURN(0);
  }

However, the binlog encryption feature needs to rotate the relay logs of existing replication channels when the binlog_encryption option is enabled or disabled, and when the binlog encryption master key is rotated.

It would be better for WL#10957 and WL#12080 if GR channels allowed relay log rotation even in the middle of a transaction.

How to repeat:
See the flush_relay_logs_cmd() function code.

Suggested fix:
Create a new function to find the position of the first GTID event in a relay log file, and point the GR applier position to it instead of pointing to the beginning of the file.

In mts_recovery_groups(), change the logic to point to the first GTID event of the first relay log file instead of BIN_LOG_HEADER_SIZE.
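
The suggested lookup can be sketched as follows. This is a simplified model, not server code: EventInfo and find_first_gtid_event_pos() are hypothetical names, and events are represented only by their type code and total length rather than parsed from a real relay log file. It assumes the binary log v4 layout, with a 4-byte magic number at the start of the file, a 19-byte common event header, and type code 33 for GTID_LOG_EVENT.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of one relay log event: only the type code
// and the total event length (header plus body) are modelled.
struct EventInfo {
  std::uint8_t type;
  std::uint32_t length;
};

// Event type codes from the MySQL binary log format.
constexpr std::uint8_t kQueryEvent = 2;
constexpr std::uint8_t kFormatDescriptionEvent = 15;
constexpr std::uint8_t kXidEvent = 16;
constexpr std::uint8_t kGtidLogEvent = 33;
constexpr std::uint8_t kPreviousGtidsLogEvent = 35;

// Size of the magic number at the start of a log file (BIN_LOG_HEADER_SIZE).
constexpr std::uint64_t kBinLogHeaderSize = 4;
// Size of the common header carried by every event in the v4 format.
constexpr std::uint32_t kEventHeaderSize = 19;

// Returns the file offset of the first GTID event, or 0 if the file has none.
// 0 is never a valid event offset because events start after the magic number.
std::uint64_t find_first_gtid_event_pos(const std::vector<EventInfo>& events) {
  std::uint64_t pos = kBinLogHeaderSize;
  for (const EventInfo& ev : events) {
    assert(ev.length >= kEventHeaderSize);  // every event has a full header
    if (ev.type == kGtidLogEvent) return pos;
    pos += ev.length;  // the next event starts right after this one
  }
  return 0;
}
```

If a rotation split a transaction, the file starts with the tail of that transaction (for example a Query event and an Xid event) before the first GTID event. Starting the applier at the returned offset skips that partial tail, and GTID auto-skip then handles any transactions that were already applied.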
[26 Mar 9:15] Margaret Fisher
Posted by developer:
 
Changelog entry added for MySQL 8.0.16:

Previously, relay logs could not be rotated manually for the Group Replication group_replication_applier channel using the FLUSH RELAY LOGS statement. Due to this restriction, when encryption was enabled for binary log files and relay log files (binlog_encryption=ON), as available from MySQL 8.0.14, the relay log file in use on that channel could not be rotated immediately if encryption was disabled again. The restriction had a similar impact on binary log master key rotation, as available from MySQL 8.0.16. The restriction has now been removed, and the FLUSH RELAY LOGS statement and corresponding internal requests now operate on the group_replication_applier channel as for any other channel, with the exception that if the request is received while a transaction is being applied, the request is performed after the transaction ends. The requester must wait while the transaction is completed and the rotation takes place. This behavior prevents transactions from being split, which is not permitted for Group Replication.
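
The deferred rotation described in this changelog entry can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: ApplierChannelModel and its member functions are hypothetical names, relay log files are modelled as lists of event labels, and the blocking wait of the requester is not modelled (a deferred request simply returns false and is carried out when the transaction ends).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of an applier channel whose relay log is never rotated
// in the middle of a transaction.
class ApplierChannelModel {
 public:
  ApplierChannelModel() : files_(1) {}  // start with one empty relay log file

  void begin_transaction() { in_transaction_ = true; }

  // Appends an event label to the current relay log file.
  void write_event(const std::string& event) {
    assert(in_transaction_);  // in this model, events belong to a transaction
    files_.back().push_back(event);
  }

  // Ends the transaction and performs a rotation that was requested meanwhile.
  void end_transaction() {
    in_transaction_ = false;
    if (rotation_pending_) {
      rotation_pending_ = false;
      rotate();
    }
  }

  // Returns true if the rotation happened immediately, false if it was
  // deferred until the current transaction ends.
  bool request_rotation() {
    if (in_transaction_) {
      rotation_pending_ = true;
      return false;
    }
    rotate();
    return true;
  }

  std::size_t file_count() const { return files_.size(); }
  const std::vector<std::string>& file(std::size_t i) const {
    return files_.at(i);
  }

 private:
  void rotate() { files_.emplace_back(); }  // open a new, empty file

  bool in_transaction_ = false;
  bool rotation_pending_ = false;
  std::vector<std::vector<std::string> > files_;
};
```

In this model, a request made between the GTID event and the commit event of a transaction leaves both events in the same file, and the new file is opened only after end_transaction() is called.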
[26 Mar 11:38] Margaret Fisher
Posted by developer:
 
Restrictions removed in:
https://dev.mysql.com/doc/refman/8.0/en/replication-binlog-encryption.html 
https://dev.mysql.com/doc/refman/8.0/en/channels-commands-single-channel.html (noted new behavior)
https://dev.mysql.com/doc/refman/8.0/en/channels-with-prev-replication.html
https://dev.mysql.com/doc/refman/8.0/en/flush.html
[30 Apr 16:29] Jean-François Gagné
Related: Bug#89142.