Bug #84674 Having an unresolvable hostname in group_repl should not block group replication
Submitted: 25 Jan 2017 22:05 Modified: 16 Feb 2017 10:59
Reporter: Kenny Gryp
Status: Closed
Category: MySQL Server: Group Replication Severity: S3 (Non-critical)
Version: 5.7.latest OS: Any
Assigned to: Tiago Jorge CPU Architecture: Any

[25 Jan 2017 22:05] Kenny Gryp
Description:
Having an unresolvable hostname in group_replication_group_seeds should not block group replication from starting.

How to repeat:
---
mysql> set global group_replication_group_seeds='gr-1:24901,gr-2:24901,gr-3:24901,invalid_hostname:24901';
Query OK, 0 rows affected (0.00 sec)

mysql> start group_replication;
2017-01-24T19:22:03.173670Z 5 [Note] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2017-01-24T19:22:03.173775Z 5 [Note] Plugin group_replication reported: '[GCS] Added automatically IP ranges 10.0.2.15/24,127.0.0.1/8,192.168.56.3/24 to the whitelist'
2017-01-24T19:22:03.202195Z 5 [ERROR] Plugin group_replication reported: '[GCS] Peer address "invalid_hostname:24901" is not valid.'
2017-01-24T19:22:03.202238Z 5 [ERROR] Plugin group_replication reported: 'Unable to initialize the group communication engine'
2017-01-24T19:22:03.202243Z 5 [ERROR] Plugin group_replication reported: 'Error on group communication engine initialization'
2017-01-24T19:22:03.202289Z 5 [Note] Plugin group_replication reported: 'Requesting to leave the group despite of not being a member'
2017-01-24T19:22:03.202293Z 5 [ERROR] Plugin group_replication reported: 'Error calling group communication interfaces while trying to leave the group'
ERROR 3096 (HY000): The START GROUP_REPLICATION command failed as there was an error when initializing the group communication layer.
---

In this case, the hostname invalid_hostname is not resolvable (see the "Peer address ... is not valid" error above), and this blocks group replication from starting at all.

The group_replication_group_seeds setting is only used to find a single node through which to connect to the cluster, so the listed seeds do not all need to be active GR nodes, nor does the list need to contain every GR node.
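
To illustrate that point, here is a minimal Python sketch of the "any one seed is enough" idea. It is not the plugin's actual code; `first_reachable_seed` and `try_connect` are hypothetical names, and the connection attempt is faked so that only gr-2 answers.

```python
from typing import Callable, Iterable, Optional


def first_reachable_seed(
    seeds: Iterable[str],
    try_connect: Callable[[str], bool],
) -> Optional[str]:
    """Return the first seed that accepts a connection, or None if none does."""
    for seed in seeds:
        if try_connect(seed):
            return seed
    return None


# Hypothetical stand-in for a real connection attempt: only gr-2 answers.
reachable = {"gr-2:24901"}
seeds = "gr-1:24901,gr-2:24901,gr-3:24901,123.145.167.189:24901".split(",")

print(first_reachable_seed(seeds, lambda s: s in reachable))  # gr-2:24901
```

Under this model, a seed that never answers costs only a failed attempt; it does not have to prevent the join.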

Here's the behavior when using an IP address that is not reachable (I was on a flight without internet access when I ran this):
---
mysql> set global group_replication_group_seeds='gr-1:24901,gr-2:24901,gr-3:24901,123.145.167.189:24901'
    -> ;
Query OK, 0 rows affected (0.00 sec)

mysql> start group_replication;
2017-01-24T19:25:41.406536Z 5 [Note] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2017-01-24T19:25:41.406745Z 5 [Note] Plugin group_replication reported: '[GCS] Added automatically IP ranges 10.0.2.15/24,127.0.0.1/8,192.168.56.3/24 to the whitelist'
2017-01-24T19:25:41.407081Z 5 [Note] Plugin group_replication reported: '[GCS] Translated 'gr-2' to 192.168.56.3'
2017-01-24T19:25:41.407218Z 5 [Note] Plugin group_replication reported: '[GCS] SSL was not enabled'
2017-01-24T19:25:41.407274Z 5 [Note] Plugin group_replication reported: 'Initialized group communication with configuration: g
...
---

That works!

Or let's use 255.255.255.255:
---
mysql> set global group_replication_group_seeds='gr-1:24901,gr-2:24901,gr-3:24901,255.255.255.255:24901'
    -> ;
Query OK, 0 rows affected (0.00 sec)

mysql> start group_replication;
2017-01-24T19:24:15.598905Z 5 [Note] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2017-01-24T19:24:15.599073Z 5 [Note] Plugin group_replication reported: '[GCS] Added automatically IP ranges 10.0.2.15/24,127.0.0.1/8,192.168.56.3/24 to the whitelist'
2017-01-24T19:24:15.599195Z 5 [Note] Plugin group_replication reported: '[GCS] Translated 'gr-2' to 192.168.56.3'
2017-01-24T19:24:15.599281Z 5 [Note] Plugin group_replication reported: '[GCS] SSL was not enabled'
2017-01-24T19:24:15.599294Z 5 [Note] Plugin group_replication reported: 'Initialized group communication with configuration: group_replication_group_name:
....
---

That works too!

Given these 3 scenarios, why should an unresolvable hostname block group replication from starting?

Suggested fix:

I prefer a different behavior:

1. Return a note/warning saying that the hostname could not be resolved.
2. Continue starting group replication, as already happens with an unreachable host or an invalid IP address.

[27 Jan 2017 9:26] Nuno Carvalho
Hi Kenny,

Thank you for reporting this issue, we are looking into it.

Best regards,
Nuno Carvalho

[16 Feb 2017 10:59] David Moss
Posted by developer:
 
Thanks for your feedback, the following was added to the 5.7.18 and 8.0.1 change logs:
Using an unresolvable host name in group_replication_group_seeds caused START GROUP_REPLICATION to fail. The fix ensures that host names in group_replication_group_seeds are validated when starting Group Replication and the list must contain at least one valid address. Invalid addresses are ignored.
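
The behavior described in that changelog entry can be sketched roughly as follows. This is only an illustration in Python, not the server's implementation: `validate_seeds` and `resolves` are hypothetical names, the host:port parsing is simplified (no IPv6 literals), and the resolver is injectable so the example does not depend on real DNS.

```python
import socket
from typing import Callable, List, Tuple


def resolves(host: str) -> bool:
    """Return True if the host name (or IP literal) can be resolved."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def validate_seeds(
    seeds: str,
    can_resolve: Callable[[str], bool] = resolves,
) -> Tuple[List[str], List[str]]:
    """Split a comma-separated seed list into (valid, ignored) entries.

    Raises ValueError if no entry is valid, mirroring the
    "at least one valid address" rule from the changelog text.
    """
    valid: List[str] = []
    ignored: List[str] = []
    for entry in seeds.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if sep and host and port.isdigit() and can_resolve(host):
            valid.append(entry)
        else:
            ignored.append(entry)  # would be reported as a warning, not an error
    if not valid:
        raise ValueError("seed list contains no valid address")
    return valid, ignored


# Fake resolver (assumption for the example): only gr-1..gr-3 resolve.
known_hosts = {"gr-1", "gr-2", "gr-3"}
valid, ignored = validate_seeds(
    "gr-1:24901,gr-2:24901,gr-3:24901,invalid_hostname:24901",
    can_resolve=lambda h: h in known_hosts,
)
print(valid)    # ['gr-1:24901', 'gr-2:24901', 'gr-3:24901']
print(ignored)  # ['invalid_hostname:24901']
```

With the seed list from the original report, the unresolvable entry ends up in `ignored` and the remaining three seeds are still usable.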