Description:
Using asynchronous replication to connect a cluster using PXC (PXC 8.0.20) with a MySQL Group Replication cluster (8.0.22).
Where PXC is the primary and GR is the replica.
If the Primary node fails, when the node comes back and require a clone operation, clone operation succeed but node is not able to join back the Group Replication cluster.
How to repeat:
Scenario
- PXC 8.0.20 cluster in DC1 3 nodes
- Group replication cluster DC2 4 nodes MySQL 8.0.22
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 38809616-e149-11ea-b0cc-080027d0603e | gr1 | 3306 | ONLINE | PRIMARY | 8.0.22 |
| group_replication_applier | 959cc074-e14b-11ea-b90c-080027d0603e | gr2 | 3306 | ONLINE | SECONDARY | 8.0.22 |
| group_replication_applier | a56b38ec-e14b-11ea-b9ce-080027d0603e | gr3 | 3306 | ONLINE | SECONDARY | 8.0.22 |
| group_replication_applier | b7039e3b-f744-11ea-854c-08002734ed50 | gr4 | 3306 | ONLINE | SECONDARY | 8.0.22 |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
Group replication cluster Primary 'gr1' connect as slave to one of the nodes from PXC cluster.
Node gr1 fails and Node gr2 become the new primary.
I restart gr1 and FORCE cloning with :
set global group_replication_clone_threshold = 1;
Then START group_replication;
Node start cloning from node gr3:
>SELECT * FROM performance_schema.clone_status;
+------+------+-------------+-------------------------+----------+----------+----------------+----------+---------------+-------------+-----------------+---------------+
| ID | PID | STATE | BEGIN_TIME | END_TIME | SOURCE | DESTINATION | ERROR_NO | ERROR_MESSAGE | BINLOG_FILE | BINLOG_POSITION | GTID_EXECUTED |
+------+------+-------------+-------------------------+----------+----------+----------------+----------+---------------+-------------+-----------------+---------------+
| 1 | 27 | In Progress | 2020-10-21 16:53:48.826 | NULL | gr3:3306 | LOCAL INSTANCE | 0 | | | 0 | |
+------+------+-------------+-------------------------+----------+----------+----------------+----------+---------------+-------------+-----------------+---------------+
Node complete cloning:
(root@localhost) [(none)]>SELECT * FROM performance_schema.clone_status\G
*************************** 1. row ***************************
ID: 1
PID: 0
STATE: Completed
BEGIN_TIME: 2020-10-21 16:53:48.826
END_TIME: 2020-10-21 16:59:40.747
SOURCE: gr3:3306
DESTINATION: LOCAL INSTANCE
ERROR_NO: 0
ERROR_MESSAGE:
BINLOG_FILE: binlog.000015
BINLOG_POSITION: 39239797
GTID_EXECUTED: 06779870-54cb-11ea-866d-08002734ed50:1-5,
7c695844-cb3f-11ea-b252-b35802f30c19:1-378,
8f98109a-12c3-11eb-8f47-339bdc860e71:1-12810,
a1e2f3a7-54cd-11ea-90b7-37299557067c:1-200118,
aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-125674,
be97dbde-a7c6-11ea-822b-08002734ed50:1-5306,
caf70b4b-8885-11ea-9c80-17253fa4edc0:1-109944,
e314189e-b0bf-11ea-adc3-cb4a792f4c60:1-64950,
f9e250ad-a7c7-11ea-8ee2-0a670dbc5818:1-133047
Node is ready to have Group replication started.
BUT:
2020-10-21T16:00:05.688835Z 0 [System] [MY-010931] [Server] /opt/mysql_templates/mysql-8P/bin/mysqld: ready for connections. Version: '8.0.22' socket: '/opt/mysql_instances/gr8/mysql.sock' port: 3306 MySQL Community Server - GPL.
2020-10-21T16:00:57.326476Z 12 [System] [MY-013587] [Repl] Plugin group_replication reported: 'Plugin 'group_replication' is starting.'
2020-10-21T16:00:57.329043Z 13 [System] [MY-011565] [Repl] Plugin group_replication reported: 'Setting super_read_only=ON.'
2020-10-21T16:00:57.432364Z 14 [ERROR] [MY-010544] [Repl] Failed to open the relay log '/opt/mysql_instances/gr8/logs/relay-group_replication_applier.000020' (relay_log_pos 39235826). <------------------| File is not in the receiver file system
2020-10-21T16:00:57.432484Z 14 [ERROR] [MY-011059] [Repl] Could not find target log file mentioned in relay log info in the index file '/opt/mysql_instances/gr8/logs/relay-group_replication_applier.index' during relay log initialization.
2020-10-21T16:00:57.435654Z 14 [ERROR] [MY-011533] [Repl] Plugin group_replication reported: 'Failed to setup the group replication applier thread.'
2020-10-21T16:00:57.436117Z 12 [ERROR] [MY-011669] [Repl] Plugin group_replication reported: 'Unable to initialize the Group Replication applier module.'
2020-10-21T16:00:57.436627Z 12 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is leaving a group without being on one.'
2020-10-21T16:00:57.438453Z 13 [System] [MY-011566] [Repl] Plugin group_replication reported: 'Setting super_read_only=OFF.'
Check mysql.slave_relay_log_info in NODE gr1:
(root@localhost) [(none)]>select * from mysql.slave_relay_log_info\G
*************************** 1. row ***************************
Number_of_lines: 12
Relay_log_name: /opt/mysql_instances/gr8/logs/relay-group_replication_applier.000020
Relay_log_pos: 39235826
Master_log_name:
Master_log_pos: 0
Sql_delay: 0
Number_of_workers: 4
Id: 1
Channel_name: group_replication_applier
Privilege_checks_username: NULL
Privilege_checks_hostname: NULL
Require_row_format: 1
Require_table_primary_key_check: STREAM
1 row in set (0.00 sec)
Which correctly is the same as in gr3.
But the files ARE NOT copy over from gr3.
Files in logs directory for Node gr1:
[root@gr1 gr8]# ll logs/
total 1536048
-rw-r----- 1 mysql mysql 171 Oct 21 11:59 binlog.000001
-rw-r----- 1 mysql mysql 152 Oct 21 12:00 binlog.000002
-rw-r----- 1 mysql mysql 88 Oct 21 12:00 binlog.index
-rw-r----- 1 mysql mysql 524288000 Oct 21 12:52 ib_logfile0
-rw-r----- 1 mysql mysql 524288000 Oct 21 11:59 ib_logfile1
-rw-r----- 1 mysql mysql 524288000 Oct 21 12:52 ib_logfile2
-rw-r----- 1 mysql mysql 171 Oct 21 11:39 relay-dc1_to_dc2.000001
-rw-r----- 1 mysql mysql 54 Oct 21 10:38 relay-dc1_to_dc2.index
-rw-r----- 1 mysql mysql 171 Oct 21 11:52 relay-group_replication_applier.000001
-rw-r----- 1 mysql mysql 217 Oct 21 11:53 relay-group_replication_applier.000002
-rw-r----- 1 mysql mysql 292 Oct 21 11:59 relay-group_replication_applier.000003
-rw-r----- 1 mysql mysql 171 Oct 21 12:00 relay-group_replication_applier.000004
-rw-r----- 1 mysql mysql 276 Oct 21 12:00 relay-group_replication_applier.index
-rw-r----- 1 mysql mysql 171 Oct 21 11:59 relay-group_replication_recovery.000001
-rw-r----- 1 mysql mysql 70 Oct 21 11:53 relay-group_replication_recovery.index
Files in logs directory for Node gr3:
[root@gr3 gr8]# ll logs/
total 2059292
-rw-r----- 1 mysql mysql 457323188 Oct 6 09:38 binlog.000013
-rw-r----- 1 mysql mysql 1966 Oct 20 12:48 binlog.000014
-rw-r----- 1 mysql mysql 39239797 Oct 21 11:53 binlog.000015
-rw-r----- 1 mysql mysql 132 Oct 20 12:49 binlog.index
-rw-r----- 1 mysql mysql 524288000 Oct 21 12:01 ib_logfile0
-rw-r----- 1 mysql mysql 524288000 Sep 17 11:49 ib_logfile1
-rw-r----- 1 mysql mysql 524288000 Oct 21 12:01 ib_logfile2
-rw-r----- 1 mysql mysql 152 Oct 21 08:43 relay-dc1_to_dc2.000001
-rw-r----- 1 mysql mysql 54 Oct 21 08:43 relay-dc1_to_dc2.index
-rw-r----- 1 mysql mysql 3048 Oct 21 09:13 relay-group_replication_applier.000019
-rw-r----- 1 mysql mysql 39235826 Oct 21 11:53 relay-group_replication_applier.000020
-rw-r----- 1 mysql mysql 138 Oct 21 09:13 relay-group_replication_applier.index
-rw-r----- 1 mysql mysql 218 Oct 21 09:14 relay-group_replication_recovery.000001
-rw-r----- 1 mysql mysql 273 Oct 21 09:14 relay-group_replication_recovery.000002
-rw-r----- 1 mysql mysql 140 Oct 21 09:14 relay-group_replication_recovery.index
Suggested fix:
This is not the fix but the workaround to have the node join:
In order to have the Node gr1 start correctly and then join ght GR cluster I had to copy over the relay log and clean up any reference to the other channel 'dc1_to_dc2' from mysql.
delete from mysql.slave_master_info where Channel_name 'dc1_to_dc2';
delete from mysql.slave_relay_log_info where Channel_name = 'dc1_to_dc2';
And of course from the file system in logs.
Then restart.