Description:
Even with skip-name-resolve enabled, the hosts used in a Group Replication setup must be resolvable.
How to repeat:
1. Set up two hosts. The IP addresses are:
shell$ ip -o -4 address show dev enp0s8 | awk '{print $4;}'
192.168.56.101/24
shell$ ip -o -4 address show dev enp0s8 | awk '{print $4;}'
192.168.56.104/24
The host names (gr101 and gr104 respectively) are not resolvable (neither by DNS nor through /etc/hosts).
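One way to confirm this precondition on each host is to query the system resolver directly. The following is a minimal sketch, assuming a Linux host where getent is available; the helper name check_resolves is hypothetical and not part of the original setup.

```shell
# Hypothetical helper: report whether a name resolves through the system
# resolver (DNS, /etc/hosts, or any other configured NSS source).
check_resolves() {
    if getent hosts "$1" > /dev/null 2>&1; then
        echo "$1: resolvable"
    else
        echo "$1: not resolvable"
    fi
}

# Check both host names used in this report.
for host in gr101 gr104; do
    check_resolves "$host"
done
```

For the scenario in this report, both names should be reported as not resolvable on both hosts.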
2. Group replication configuration on 192.168.56.101 (similar on the other hosts):
[mysqld]
# Paths
basedir = /gr/mysql
datadir = /gr/data
socket = /gr/run/mysql.sock
pid_file = /gr/run/mysql.pid
log_bin
# Plugins
plugin-load = group_replication.so
# General settings
disabled_storage_engines = MyISAM,BLACKHOLE,FEDERATED,ARCHIVE
skip_name_resolve
# Binary Log and Replication
server_id = 57183308
gtid_mode = ON
enforce_gtid_consistency = ON
log_slave_updates = ON
binlog_format = ROW
master_info_repository = TABLE
relay_log_info_repository = TABLE
transaction_write_set_extraction = XXHASH64
binlog_checksum = NONE
# Group Replication
group_replication = FORCE_PLUS_PERMANENT
group_replication_start_on_boot = OFF
group_replication_group_name = d26c63f4-bffa-11e6-83b4-08002715584a
group_replication_local_address = 192.168.56.101:6606
group_replication_group_seeds = 192.168.56.103:6606,192.168.56.104:6606
3. Start Group Replication on 192.168.56.101 using https://support.oracle.com/rs?type=doc&id=2212994.1
with the changes required to reflect the different host-related settings.
192.168.56.101> RESET MASTER;
Query OK, 0 rows affected (5.65 sec)
192.168.56.101> CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='rpl_pass' FOR CHANNEL 'group_replication_recovery';
Query OK, 0 rows affected, 1 warning (0.09 sec)
192.168.56.101> SET GLOBAL group_replication_bootstrap_group = ON;
Query OK, 0 rows affected (0.00 sec)
192.168.56.101> START GROUP_REPLICATION;
Query OK, 0 rows affected (3.15 sec)
192.168.56.101> SET GLOBAL group_replication_bootstrap_group = OFF;
Query OK, 0 rows affected (0.00 sec)
192.168.56.101> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 9e91a79f-5c7e-11e7-bbe6-080027611030 | gr101 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
1 row in set (0.01 sec)
192.168.56.101> CREATE USER rpl_user@'192.168.56.%' IDENTIFIED BY 'rpl_pass';
Query OK, 0 rows affected (0.00 sec)
192.168.56.101> GRANT REPLICATION SLAVE ON *.* TO rpl_user@'192.168.56.%';
Query OK, 0 rows affected (0.01 sec)
4. Add 192.168.56.104:
192.168.56.104> RESET MASTER;
Query OK, 0 rows affected (0.17 sec)
192.168.56.104> CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='rpl_pass' FOR CHANNEL 'group_replication_recovery';
Query OK, 0 rows affected, 1 warning (0.16 sec)
192.168.56.104> START GROUP_REPLICATION;
Query OK, 0 rows affected (8.24 sec)
The statement returns OK, but recovery then fails. The error log on 192.168.56.104 shows:
2017-06-29T04:13:54.944669Z 0 [Note] Plugin group_replication reported: 'Starting group replication recovery with view_id 14987095752316130:2'
2017-06-29T04:13:54.945927Z 12 [Note] Plugin group_replication reported: 'Establishing group recovery connection with a possible donor. Attempt 1/10'
2017-06-29T04:13:54.988689Z 12 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='g
2017-06-29T04:13:55.022005Z 12 [Note] Plugin group_replication reported: 'Establishing connection to a group replication recovery donor cc85ac80-5c80-11e7-8e14-080027611030 at gr101 port: 3306.'
2017-06-29T04:13:55.022322Z 14 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for STA
2017-06-29T04:13:55.026604Z 15 [Note] Slave SQL thread for channel 'group_replication_recovery' initialized, starting replication in log 'FIRST' at position 0, relay log './gr104-relay-bin-group_replication_recovery.000001' position: 4
2017-06-29T04:13:55.817051Z 14 [ERROR] Slave I/O for channel 'group_replication_recovery': error connecting to master 'rpl_user@gr101:3306' - retry-time: 60 retries: 1, Error_code: 2005
2017-06-29T04:13:55.817083Z 14 [Note] Slave I/O thread for channel 'group_replication_recovery' killed while connecting to master
2017-06-29T04:13:55.817089Z 14 [Note] Slave I/O thread exiting for channel 'group_replication_recovery', read up to log 'FIRST', position 4
2017-06-29T04:13:55.817299Z 12 [ERROR] Plugin group_replication reported: 'There was an error when connecting to the donor server. Check group replication recovery's connection credentials.'
2017-06-29T04:13:55.817503Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 2/10'
On 192.168.56.101:
192.168.56.101> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | cc85ac80-5c80-11e7-8e14-080027611030 | gr101 | 3306 | ONLINE |
| group_replication_applier | fef1eddc-5c80-11e7-9bea-080027d1eed7 | gr104 | 3306 | RECOVERING |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
2 rows in set (0.00 sec)
Note how both the error message and performance_schema.replication_group_members refer to the host name rather than the IP address. This is not expected when host name resolution has been turned off.
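The plugin's own message suggests checking the recovery credentials, but the I/O thread error carries Error_code: 2005, which is the client error CR_UNKNOWN_HOST ("Unknown MySQL server host"). The sketch below counts such failures in an error log. The helper name count_unknown_host_errors is hypothetical, and the location of the error log is an assumption, since the configuration above does not set it explicitly.

```shell
# Hypothetical helper: count recovery-channel connection failures caused by
# an unresolvable donor host name (Error_code: 2005) in a MySQL error log.
count_unknown_host_errors() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # there are no matches, so "|| true" keeps the function's status clean.
    grep -c "channel 'group_replication_recovery'.*Error_code: 2005" "$1" || true
}
```

It would be called with the path to the joining member's error log as its only argument. A non-zero count points at name resolution rather than at the rpl_user credentials.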
5. Make the host names resolvable, and 192.168.56.104 comes online.
Suggested fix:
Honour skip-name-resolve and make Group Replication behave like the rest of MySQL.