MySQL Bugs: #117795: relay-bin is not removed after 'flush log' in primary node under innodb replicaset

Bug #117795	relay-bin is not removed after 'flush log' in primary node under innodb replicaset
Submitted:	25 Mar 16:25	Modified:	31 Mar 13:24
Reporter:	Kelvin Yu
Status:	Can't repeat
Category:	MySQL Server	Severity:	S3 (Non-critical)
Version:	8.0.40	OS:	Red Hat
Assigned to:	MySQL Verification Team	CPU Architecture:	Any

Description:
#Background:
1. We have two nodes (modpd11 and modpd12) in an InnoDB ReplicaSet. By default, modpd11 is the primary and modpd12 is the standby.
2. I previously tested 'failover', 'rejoinInstance' and 'setPrimaryInstance'. Everything worked fine.

#Problem
Old relay-bin files are not removed after 'flush log' on modpd11, so the number of relay-bin files keeps increasing.

relay_log_purge is left at its default of 1 (ON).
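
To put numbers on the growth, one option is to compare the relay log files on disk with the entries in the relay log index. The following is only a minimal Python sketch and is not part of the original report. It assumes the default (unnamed) replication channel, the relay-log=relay-bin basename and datadir=/mysql/mysqld from the configuration in this report, and the default index file name relay-bin.index. The function name relay_log_report is hypothetical.

```python
import re
from pathlib import Path


def relay_log_report(datadir, basename="relay-bin"):
    """Compare relay log files on disk with the relay log index.

    Assumes the default channel, whose files are named
    <basename>.<sequence number> and whose index is <basename>.index.
    """
    datadir = Path(datadir)
    pattern = re.compile(rf"^{re.escape(basename)}\.\d+$")

    # Relay log files actually present in the data directory.
    on_disk = sorted(p.name for p in datadir.iterdir() if pattern.match(p.name))

    # Files the server still tracks, one path per line in the index.
    index_path = datadir / f"{basename}.index"
    indexed = []
    if index_path.exists():
        indexed = [
            Path(line.strip()).name
            for line in index_path.read_text().splitlines()
            if line.strip()
        ]

    return {
        "on_disk": on_disk,
        "indexed": indexed,
        # Present on disk but no longer referenced by the index.
        "orphaned": sorted(set(on_disk) - set(indexed)),
        "total_bytes": sum((datadir / name).stat().st_size for name in on_disk),
    }
```

For example, calling relay_log_report("/mysql/mysqld") before and after 'flush log' would show whether the file count and the index both grow. It has to run as a user that can read the data directory.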

How to repeat:
I tried to repeat it in another lab, but without success so far.
I will keep trying.

[client]
socket=/var/lib/mysql/mysql.sock
default-character-set = utf8mb4

[mysqldump]
quick
max_allowed_packet = 1G

[mysqld]
## datafile Path
datadir=/mysql/mysqld
socket=/var/lib/mysql/mysql.sock
mysqlx_socket=/var/lib/mysql/mysqlx.sock
log-error=/var/log/mysql/mysqld.log
pid-file=/var/run/mysqld/mysqld_3306.pid

## Slow Log
slow-query-log=1
slow-query-log-file= /var/log/mysql/mysqld-slow.log

## BinLog
log_bin=mysql-bin
binlog_expire_logs_seconds = 259200

## Relay log
relay-log=relay-bin
relay_log_recovery=on

## General
server-id=1
log_timestamps = SYSTEM
skip-external-locking
skip-name-resolve
long_query_time = 10
key_buffer_size = 2048M
max_allowed_packet = 1073741824
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
max_connections = 4096
wait_timeout=2880000
interactive_timeout = 2880000

character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
skip-character-set-client-handshake

## Innodb
#innodb_dedicated_server = on
innodb_buffer_pool_size = 48G
innodb_redo_log_capacity = 16G
innodb_flush_method = O_DIRECT_NO_FSYNC

## Security
block_encryption_mode=aes-256-cbc
tls_ciphersuites='TLS_AES_256_GCM_SHA384'
ssl_cipher='ECDHE-ECDSA-AES128-GCM-SHA256'
sql-mode='STRICT_ALL_TABLES,ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION'

## Replication
gtid_mode = on
enforce_gtid_consistency = on
relay_log_recovery = on
binlog_transaction_dependency_tracking = WRITESET
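
The [mysqld] section above sets relay_log_recovery twice, once under '## Relay log' and again under '## Replication'. Both lines set the same value, so this is harmless here, but repeated options are easy to miss when two environments are compared. A minimal Python sketch for flagging them follows. It assumes simple name=value lines, treats '-' and '_' in option names as equivalent (as MySQL does), and ignores !include directives. The helper name find_repeated_options is hypothetical.

```python
from collections import defaultdict


def find_repeated_options(text):
    """Return {(section, option): [line numbers]} for options set more than once."""
    seen = defaultdict(list)
    section = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        # Section header such as [mysqld]; group names are not case-sensitive.
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        # Normalise relay-log-recovery and relay_log_recovery to one spelling.
        name = line.split("=", 1)[0].strip().replace("-", "_")
        seen[(section, name)].append(lineno)
    return {key: lines for key, lines in seen.items() if len(lines) > 1}
```

Run against the option file quoted in this report, the only repeated option it should flag is relay_log_recovery in [mysqld]. max_allowed_packet appears twice as well, but in two different sections.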

I set up the lab again (all steps are in the reference),
but the problem could not be reproduced.

The MySQL and MySQL Shell versions I used in the lab are identical to those in production.

MySQL version 8.0.40
MySQL Shell version 8.4.3

Hi,
I followed everything you wrote and was not able to reproduce the problem using the 8.0.41 server and 8.0.41 Shell (nor with the latest 9.x Shell).

Is there anything I can do to produce more logging for this case?

I also cannot reproduce this case, but it really did happen in our production environment.

I found that the metadata in slave_relay_log_info on the primary node has not been cleared.