Bug #84710 group replication does not respect localhost IP
Submitted: 30 Jan 2017 8:45 Modified: 21 Nov 2017 9:41
Reporter: Giuseppe Maxia (OCA)
Status: Duplicate
Category:MySQL Server: Group Replication Severity:S1 (Critical)
Version:5.7.17 OS:Any
Assigned to: CPU Architecture:Any

[30 Jan 2017 8:45] Giuseppe Maxia
Description:
Running group replication with three servers on the same host, as described in the manual, succeeds only if $(hostname) resolves to 127.0.0.1.

$ hostname
myserver

$ cat /etc/hosts
127.0.0.1	localhost
127.0.0.1	myserver

The options file contains:
loose-group_replication_local_address=127.0.0.1:14518
loose-group_replication_group_seeds=127.0.0.1:14518,127.0.0.1:14519,127.0.0.1:14520

Using the above /etc/hosts, installation will succeed.

When the hostname does not resolve to 127.0.0.1, I get an error.

$ cat /etc/hosts
127.0.0.1	localhost
192.168.0.161	myserver

The log file says:
2017-01-30T08:16:49.305408Z 14 [ERROR] Slave I/O for channel 'group_replication_recovery': error connecting to master 'rsandbox@myserver:14418' - retry-time: 60  retries: 1, Error_code: 2003

Although I was explicitly using 127.0.0.1, the server attempted to connect using my hostname.

How to repeat:

1) associate your hostname with a non-local address, using either DNS or /etc/hosts
2) install group replication with three servers on the same host, as described in http://datacharmer.blogspot.com/2017/01/mysql-group-replication-vs-multi-source.html
3) check the errors in the error log (~/sandboxes/GR/node2/data/msandbox.err)
[31 Jan 2017 4:53] MySQL Verification Team
Hi,

Thanks for the bug report, verified as described

kind regards
Bogdan
[16 Nov 2017 11:02] Erlend Dahl
Duplicate of 

Bug#86859 Group Replication does not honor skip-name-resolve

which is "won't fix".

Posted by developer:

[27 Jul 2017 9:47] Nuno Carvalho 

Thank you for the bug report. I see that there are still doubts
about the relation between
performance_schema.replication_group_members and the
group_replication_local_address and group_replication_group_seeds
options, so let's try to clear this up for good and add a dedicated
chapter to the manual.
 
 
Group Replication communication system
--------------------------------------
 
The Group Replication plugin is a new replication plugin which
relies on a group communication system to communicate and coordinate
work between group members.
The Group Replication messaging service is XCom, which every MySQL
server in the group connects to.
XCom has its own protocol and uses its own address, i.e.
hostname/ip:port, for its communication.
 
XCom has 3 main configuration options:
  1) group_replication_local_address
     This is the XCom host:port address used for internal group
     communication: a second port that the MySQL process binds to just
     for that purpose, and which uses its own protocol.
     Other Group Replication members will contact this member
     through this host:port for all group communication.
     This is not the MySQL server SQL protocol host and port, the
     commonly known 3306 port.
  2) group_replication_group_seeds
     This is the list of XCom host:port addresses that a new wannabe
     member uses to establish the first contact; like the name says,
     it is a seed. One single active address is enough, though we
     recommend including all addresses, which allows the
     joining process to fail over to other addresses if one is not
     active.
     This list of addresses is only used during the join procedure;
     once the new member has joined, the membership is kept updated
     internally by XCom.
     Again, this is not the MySQL server SQL protocol host and port,
     the commonly known 3306 port.
  3) group_replication_bootstrap_group
     This option instructs XCom to create a new group instead of
     trying to connect to an existing one.
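
As a rough sketch, the three options could appear like this in the
options file of one member when three servers share a host. The
addresses are the ones from the original report; the [mysqld] group
header and the OFF value are assumptions made for illustration.

```ini
[mysqld]
# XCom address of this member (group traffic only, not the SQL port)
loose-group_replication_local_address=127.0.0.1:14518
# XCom addresses tried for first contact while joining
loose-group_replication_group_seeds=127.0.0.1:14518,127.0.0.1:14519,127.0.0.1:14520
# Assumed OFF here; only the member that creates the group turns it ON
loose-group_replication_bootstrap_group=OFF
```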
 
 
Group Replication SQL membership
--------------------------------
 
The Group Replication plugin maintains a membership list of the active
members, which is the information displayed in the table
performance_schema.replication_group_members.
This table displays the MySQL server SQL host and port, that is, the
addresses to which we can connect SQL clients, the commonly known
3306 port.
This is the information that a DBA or a client/router/multiplexer
needs to know in order to contact a MySQL server.
 
Example:
SELECT * FROM performance_schema.replication_group_members;
CHANNEL_NAME    group_replication_applier
MEMBER_ID       e84e7fba-60a9-11e7-8894-0010e0734796
MEMBER_HOST     member1
MEMBER_PORT     3306
MEMBER_STATE    ONLINE
CHANNEL_NAME    group_replication_applier
MEMBER_ID       e84e7fba-60a9-11e7-8894-0010e0734797
MEMBER_HOST     member2
MEMBER_PORT     3306
MEMBER_STATE    ONLINE
 
The MEMBER_PORT value is fetched from the --port option on each
server.
The MEMBER_HOST value is fetched from the hostname system variable
on each server.
Then, on member join, each member shares this information with
the group, in order to build a complete membership list with the SQL
address of each member.
 
Since the value of MEMBER_HOST is the server hostname, the
server hostname should be a fully qualified name, resolvable
through DNS and/or /etc/hosts (or another local mechanism).
There is nothing new here; this is already the case with
asynchronous replication. Please see the SHOW SLAVE HOSTS command:
https://dev.mysql.com/doc/refman/5.7/en/show-slave-hosts.html
 
If a DBA wants to display a different value in MEMBER_HOST, she/he
can specify it through --report-host on each server before joining
the group. The assigned value is used directly and is not
affected by the --skip_name_resolve option.
These options have existed since forever (I guess) in asynchronous
replication, to solve the exact same issue with SHOW SLAVE HOSTS,
so we extended their functionality to also affect MEMBER_HOST and
MEMBER_PORT. It has been like that since the beginning of GR.
This means that if the DBA wants to force an IP, she/he must
set --report-host=X.X.X.X
The same direct use applies to --report-port and the MEMBER_PORT
value: set it on each server before joining the group.
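
A minimal options file sketch of that setting, for a member running
on the same host as the others. The port number 14418 is an
assumption taken from the error log line in the original report, and
the [mysqld] group header is illustrative.

```ini
[mysqld]
# SQL port of this server (assumed value, from the reported log line)
port=14418
# Published as MEMBER_HOST and used as-is, with no name resolution
report-host=127.0.0.1
# Published as MEMBER_PORT; assumed here to match the SQL port
report-port=14418
```

With values like these, a joining member would look up the donor at
127.0.0.1:14418 instead of myserver:14418.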
 
The Group Replication recovery procedure is based on asynchronous
replication, that is, when a member joins it establishes an
asynchronous replication connection to one of the existing members
in order to fetch the data it is missing from the group.
The address of the member that donates the data is fetched from the
performance_schema.replication_group_members table, since recovery
uses the SQL port to fetch that data, like a regular slave.
https://dev.mysql.com/doc/refman/5.7/en/group-replication-user-credentials.html
 
 
Why does Group Replication use two bind addresses?
--------------------------------------------------
 
Group Replication uses two bind addresses in order to allow
traffic to be split between the SQL port and the XCom port. Let's
assume a multi-homed server with the following network interfaces:
  10.0.0.1
  192.168.0.1
The MySQL SQL port might be listening on 192.168.0.1.
The XCom port might be listening on 10.0.0.1
(group_replication_local_address = 10.0.0.1:33061).
This allows different QoS rules on the different networks.
 
It may also be the case that group traffic, for security reasons,
has to be isolated from the network that application traffic goes
through.
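
The two-interface example above might be configured roughly as
follows. This is only a sketch: the bind-address and port lines are
assumptions added for illustration, and only the
group_replication_local_address value comes from the text above.

```ini
[mysqld]
# SQL clients connect over the application network (assumed setup)
bind-address=192.168.0.1
port=3306
# XCom group traffic stays on the other interface
loose-group_replication_local_address=10.0.0.1:33061
```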
 
 
Summary
-------
 * group_replication_local_address and group_replication_group_seeds
   configure how the messaging service contacts, and is contacted
   by, other members; it uses its own distributed protocol (not SQL).
 * performance_schema.replication_group_members gives the
   membership information with the SQL address of each member.
 * there is no relation at all between the
   group_replication_local_address and group_replication_group_seeds
   options and the performance_schema.replication_group_members values.
 
It is now evident that people are not aware of this difference, so we
need to improve our documentation to make this clear.
I will change this bug to a documentation bug and
ask the documentation team for a full chapter dedicated to this.

 
Best regards,
Nuno Carvalho
[21 Nov 2017 9:41] Giuseppe Maxia
Thanks for the explanation, Nuno.

Everyone, to make a long story short, the fix for this issue is to add this line to the mysqld startup options:

--report-host=127.0.0.1

This advice is also in the docs, but not on the page where the options file directives are listed (https://dev.mysql.com/doc/refman/5.7/en/group-replication-configuring-instances.html).
[21 Nov 2017 10:21] Nuno Carvalho
Hi Giuseppe,

The documentation will be updated with all this information.

Best regards,
Nuno Carvalho