Bug #94004 Cannot setup replication without ipv6 since 8.0.14
Submitted: 22 Jan 2019 15:37 Modified: 28 Jan 2019 10:21
Reporter: Guillaume Bienkowski Email Updates:
Status: Closed Impact on me: None
Category:MySQL Server: Group Replication Severity:S2 (Serious)
Version:8.0.14 OS:Debian (Stretch)
Assigned to: CPU Architecture:x86 (amd64)

[22 Jan 2019 15:37] Guillaume Bienkowski
Description:
Since MySQL Server 8.0.14, we cannot set up a replication group using mysql-shell (and probably not with the mysql client either).

The server fails with these messages (note the "Unable to announce tcp port" error and the "adding IPv6 localhost address to the whitelist" warning, even though no IPv6 is available):

2019-01-22T15:14:18.210915Z 17 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2019-01-22T15:14:18.229032Z 17 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv4 localhost address to the whitelist. It is mandatory that it is added.'
2019-01-22T15:14:18.229157Z 17 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv6 localhost address to the whitelist. It is mandatory that it is added.'
2019-01-22T15:14:18.245684Z 19 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2019-01-22T15:14:18.308477Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to announce tcp port 33061. Port already in use?'
2019-01-22T15:14:18.308685Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2019-01-22T15:14:18.308771Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2019-01-22T15:15:18.260034Z 17 [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group'
2019-01-22T15:15:18.260198Z 17 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is leaving a group without being on one.'

There is no daemon listening on port 33061.

Mitigations tried:

- setting bind_address to 0.0.0.0 => doesn't work
- removing "ipv6.disable=1" from the kernel command line in GRUB and rebooting => works! This seems to allow the mysql daemon to bind correctly and proceed further (see the sketch below).

How to repeat:
Our use case is: 

- 3 nodes, Debian stretch up to date
- Kernel cmdline has 'ipv6.disable=1'
- provisioning using a JS script with mysql-shell:

```javascript 

var name = 'mysqlcluster1',
    hosts = [
                "10.31.3.60:3306",
                "10.31.3.61:3306",
                "10.31.3.62:3306",
            ],
    cluster,
    changed = false;

try {
    cluster = dba.getCluster(name);
}
catch(ex) {
    cluster = dba.createCluster(name);
    changed = true;
}

var status = cluster.describe();
// Get cluster topology (it is not an array but simply an enumerable)
var topology = status.defaultReplicaSet.topology;

hosts.forEach(function(host) {
    var isMember = false;
    for (var i = 0, n = topology.length; i < n; i++) {
        if (topology[i].address == host) {
            isMember = true;
            break;
        }
    }
    
    if (!isMember) {
        var instance = 'xxx:yyyy@' + host;
        cluster.addInstance(instance);
        changed = true;
    }
});

var out = {
    changed: changed
};
// Print a dummy line first, so that any console color codes don't break JSON parsing of the real output below
println('Output');
print(JSON.stringify(out, null, 0));

```

Suggested fix:
Figure out why the bind fails when IPv6 is deactivated.
[22 Jan 2019 15:41] Guillaume Bienkowski
Also noteworthy: this exact same script works on 8.0.13 on the same hosts.

We had to manually download the Debian packages from your web repo, because they are no longer referenced in Packages.gz.
[22 Jan 2019 23:26] Frederic Descamps
I've tested on CentOS 7.5:

edit /etc/default/grub and add ipv6.disable=1 to the kernel command line:

GRUB_CMDLINE_LINUX="xen_blkfront.sda_is_xvda=1 crashkernel=auto rd.lvm.lv=VolGroup/lv_root rd.lvm.lv=VolGroup/lv_swap rhgb quiet net.ifnames=0 biosdevname=0 cloud-init=disabled ipv6.disable=1"

then run: 

grub2-mkconfig -o /boot/grub2/grub.cfg

restart... 

Use the shell to create a group:

JS> cluster=dba.createCluster('sweden')
A new InnoDB cluster will be created on instance 'clusteradmin@mysql2:3306'.

Validating instance at mysql2:3306...

This instance reports its own address as mysql2

Instance configuration is suitable.
Creating InnoDB cluster 'sweden' on 'clusteradmin@mysql2:3306'...
Dba.createCluster: ERROR: Error starting cluster: 'mysql2:3306' - Query failed. MySQL Error (3092): ClassicSession.query: The server is not configured properly to be an active member of the group. Please see more details on error log.. Query: START group_replication: MySQL Error (3092): ClassicSession.query: The server is not configured properly to be an active member of the group. Please see more details on error log. (RuntimeError)

and in the error log:

2019-01-22T23:17:34.974112Z 12 [Warning] [MY-010604] [Repl] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=mysql2-relay-bin' to avoid this problem.
2019-01-22T23:17:34.988939Z 12 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2019-01-22T23:17:35.100282Z 12 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv4 localhost address to the whitelist. It is mandatory that it is added.'
2019-01-22T23:17:35.100307Z 12 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv6 localhost address to the whitelist. It is mandatory that it is added.'
2019-01-22T23:17:35.127851Z 14 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2019-01-22T23:17:35.369531Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to announce tcp port 33061. Port already in use?'
2019-01-22T23:17:35.369672Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2019-01-22T23:17:35.405175Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2019-01-22T23:18:35.455946Z 12 [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group'
2019-01-22T23:18:35.470498Z 12 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is leaving a group without being on one.'
[22 Jan 2019 23:27] Frederic Descamps
Thank you Guillaume for the bug report, I've verified it.
[23 Jan 2019 8:17] Guillaume Bienkowski
Hi Fred, 

thanks for the fast reproduction!

Would it be possible to reactivate version 8.0.13 in your repository in the meantime?
The deb files are still there; only the Packages.gz file doesn't reference them anymore.

We are deploying through Ansible, and it isn't easy to change the process to manually download the deb files (which are spread all around your repo) and then install them manually. It is much easier to pin the package to a specific version (8.0.13), as sketched below. Since Jan 21, all our new MySQL deployments are failing and have to be fixed manually.
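
To spell out what pinning to a specific version looks like in practice, a rough sketch with apt; the exact 8.0.13 version string below is an assumption and should be checked against the repository:

```bash
# See which versions the configured repositories currently publish
apt-cache madison mysql-server

# Install a specific version and keep apt from upgrading it
# (the version string is illustrative and depends on the repo)
sudo apt-get install mysql-server=8.0.13-1debian9
sudo apt-mark hold mysql-server
```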
[24 Jan 2019 9:54] Giuseppe Maxia
I found the same behavior by chance.
Another quick way to reproduce it:

on a computer ***without internet connection***, make sure you have the 8.0.14 generic Linux binaries expanded in $HOME/opt/mysql/8.0.14, then run the following:

dbdeployer deploy replication 8.0.14 --topology=group --concurrent

Replication initialization fails, and the error log shows 

2019-01-24T09:53:04.786829Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to announce local tcp port 21540. Port already in use?'
2019-01-24T09:53:04.786872Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2019-01-24T09:53:04.787027Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 21540'
[24 Jan 2019 11:43] Frederic Descamps
Hi Giuseppe, 

I disabled all networking on my laptop (Linux): no ethernet, no wifi.

I was able to create a group, as the following output illustrates:

$ dbdeployer-1.17.0.linux deploy replication 8.0.14 --topology=group --concurrent
$HOME/sandboxes/group_msb_8_0_14/initialize_nodes
# Node 1 # reset master; CHANGE MASTER TO MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox'  FOR CHANNEL 'group_replication_recovery';
# Node 2 # reset master; CHANGE MASTER TO MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox'  FOR CHANNEL 'group_replication_recovery';
# Node 3 # reset master; CHANGE MASTER TO MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox'  FOR CHANNEL 'group_replication_recovery';

# Node 1 # SET GLOBAL group_replication_bootstrap_group=ON;
# Node 1 # START GROUP_REPLICATION;
# Node 2 # START GROUP_REPLICATION;
# Node 3 # START GROUP_REPLICATION;
# Node 1 # SET GLOBAL group_replication_bootstrap_group=OFF;
# Node 1 # select * from performance_schema.replication_group_members
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 00021415-1111-1111-1111-111111111111 | 127.0.0.1   |       21415 | ONLINE       | PRIMARY     | 8.0.14         |
| group_replication_applier | 00021416-2222-2222-2222-222222222222 | 127.0.0.1   |       21416 | ONLINE       | PRIMARY     | 8.0.14         |
| group_replication_applier | 00021417-3333-3333-3333-333333333333 | 127.0.0.1   |       21417 | ONLINE       | PRIMARY     | 8.0.14         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
# Node 2 # select * from performance_schema.replication_group_members
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 00021415-1111-1111-1111-111111111111 | 127.0.0.1   |       21415 | ONLINE       | PRIMARY     | 8.0.14         |
| group_replication_applier | 00021416-2222-2222-2222-222222222222 | 127.0.0.1   |       21416 | ONLINE       | PRIMARY     | 8.0.14         |
| group_replication_applier | 00021417-3333-3333-3333-333333333333 | 127.0.0.1   |       21417 | ONLINE       | PRIMARY     | 8.0.14         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
# Node 3 # select * from performance_schema.replication_group_members
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 00021415-1111-1111-1111-111111111111 | 127.0.0.1   |       21415 | ONLINE       | PRIMARY     | 8.0.14         |
| group_replication_applier | 00021416-2222-2222-2222-222222222222 | 127.0.0.1   |       21416 | ONLINE       | PRIMARY     | 8.0.14         |
| group_replication_applier | 00021417-3333-3333-3333-333333333333 | 127.0.0.1   |       21417 | ONLINE       | PRIMARY     | 8.0.14         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
Replication directory installed in $HOME/sandboxes/group_msb_8_0_14
run 'dbdeployer usage multiple' for basic instructions'

Could you provide us with more information: OS, whether IPv6 is present or not, network interfaces, etc.? (See the commands sketched below for the kind of output that would help.)

Thank you
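
A minimal sketch of commands that would gather this information (standard Linux tools, nothing MySQL-specific):

```bash
cat /etc/os-release                       # distribution and version
cat /proc/cmdline                         # kernel command line, e.g. ipv6.disable=1
ip addr show                              # interfaces with their IPv4/IPv6 addresses
sysctl net.ipv6.conf.all.disable_ipv6     # 1 if disabled via sysctl (key is absent when IPv6 is disabled at boot)
```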
[24 Jan 2019 12:21] Giuseppe Maxia
Fred,
Try running the setup from inside a Docker container (from the laptop without internet access). This is where I found the issue.
[24 Jan 2019 12:31] Giuseppe Maxia
Here is how to reproduce:

$docker pull datacharmer/mysql-sb-full

### <Disconnect internet access here>

$ docker run -ti --name dbtest --hostname dbtest datacharmer/mysql-sb-full bash
msandbox@dbtest:~$ dbdeployer deploy replication 8.0.14 --topology=group --concurrent
Creating directory /home/msandbox/sandboxes
$HOME/sandboxes/group_msb_8_0_14/initialize_nodes

### <IT TAKES A FEW MINUTES HERE>

# Node 1 # reset master; CHANGE MASTER TO MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox'  FOR CHANNEL 'group_replication_recovery';
# Node 2 # reset master; CHANGE MASTER TO MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox'  FOR CHANNEL 'group_replication_recovery';
# Node 3 # reset master; CHANGE MASTER TO MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox'  FOR CHANNEL 'group_replication_recovery';

# Node 1 # SET GLOBAL group_replication_bootstrap_group=ON;
# Node 1 # START GROUP_REPLICATION;
# Node 2 # START GROUP_REPLICATION;
# Node 3 # START GROUP_REPLICATION;
# Node 1 # SET GLOBAL group_replication_bootstrap_group=OFF;
# Node 1 # select * from performance_schema.replication_group_members
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 00021415-1111-1111-1111-111111111111 | 127.0.0.1   |       21415 | OFFLINE      |             |                |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
# Node 2 # select * from performance_schema.replication_group_members
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 00021416-2222-2222-2222-222222222222 | 127.0.0.1   |       21416 | OFFLINE      |             |                |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
# Node 3 # select * from performance_schema.replication_group_members
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 00021417-3333-3333-3333-333333333333 | 127.0.0.1   |       21417 | OFFLINE      |             |                |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
Replication directory installed in $HOME/sandboxes/group_msb_8_0_14
run 'dbdeployer usage multiple' for basic instructions'
msandbox@dbtest:~$
msandbox@dbtest:~$
msandbox@dbtest:~$ tail ~/sandboxes/group_msb_8_0_14/node1/data/msandbox.err
2019-01-24T12:23:50.022140Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Socket: '/tmp/mysqlx-31415.sock' bind-address: '::' port: 31415
2019-01-24T12:23:50.917708Z 10 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2019-01-24T12:23:51.010743Z 12 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv4 localhost address to the whitelist. It is mandatory that it is added.'
2019-01-24T12:23:51.010768Z 12 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv6 localhost address to the whitelist. It is mandatory that it is added.'
2019-01-24T12:23:51.020734Z 14 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2019-01-24T12:23:51.055881Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to announce local tcp port 21540. Port already in use?'
2019-01-24T12:23:51.055989Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2019-01-24T12:23:51.056130Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 21540'
2019-01-24T12:24:51.037273Z 12 [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group'
2019-01-24T12:24:51.037350Z 12 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is leaving a group without being on one.'
[24 Jan 2019 13:59] Guillaume Bienkowski
Fred, thank you for your work. May I reiterate my question above: is it possible to re-add the 8.0.13 packages to your repository while this bug is being investigated?

We cannot provision our nodes right now and have to intervene manually on all of them.
[28 Jan 2019 9:44] Margaret Fisher
Posted by developer:
 
Changelog entry added for MySQL 8.0.15:

Group Replication was unable to function in the 8.0.14 release
of MySQL Server if IPv6 support was disabled at the operating
system level, even if the replication group did not use any IPv6
addresses. The issue is fixed by this release of MySQL Server, 8.0.15.
[28 Jan 2019 10:21] Guillaume Bienkowski
Thank you Margaret. Do you have an ETA for 8.0.15?
[2 Feb 2019 0:13] Roel Van de Paar
8.0.15 Released.
[11 Feb 2019 20:16] Abdullah Zarour
I have tested release 8.0.15 and still hit the same issue; I went back to 8.0.13 and successfully created the GR group.
[19 Feb 2019 12:17] Tiago Jorge
Hi Abdullah Zarour, thank you for your report!

Do you have a reproducible scenario and failure logs?
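
For instance, something along these lines would help (file paths are illustrative and depend on the installation):

```bash
# Group Replication member state as seen by the server
mysql -u root -p -e "SELECT * FROM performance_schema.replication_group_members;"

# Relevant section of the error log (path varies by distribution/configuration)
sudo tail -n 100 /var/log/mysql/error.log
```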