Bug #85026 GROUP MEMBERS BECOMES 'UNREACHABLE' WHEN SERVER WITH LOWER VERSION TRIES TO JOIN
Submitted: 16 Feb 2017 17:43 Modified: 4 Sep 2017 10:46
Reporter: Dhruthi Komarlu Vasudeva Murthy
Status: Closed
Category:MySQL Server: Group Replication Severity:S3 (Non-critical)
Version:8.0.1 OS:Any
Assigned to: CPU Architecture:Any

[16 Feb 2017 17:43] Dhruthi Komarlu Vasudeva Murthy
Description:
Consider a group of two 8.0.1 members. When a 5.7 server tries to join the group, it fails on the 'start GR' query, as expected. However, the existing group members then go to the UNREACHABLE state: each one reports the other as UNREACHABLE.
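
For illustration only, the expected rejection can be sketched as a version comparison. This is a simplified assumption (a joiner older than the lowest version already in the group is refused), not the plugin's actual compatibility code, and the names parse_version and may_join are hypothetical.

```python
def parse_version(text):
    """Turn '8.0.1' into (8, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def may_join(joiner_version, group_versions):
    """Assumed rule: reject a joiner older than the lowest group version."""
    lowest = min(parse_version(v) for v in group_versions)
    return parse_version(joiner_version) >= lowest


# The scenario from this report: two 8.0.1 members, a 5.7.18 joiner.
print(may_join("5.7.18", ["8.0.1", "8.0.1"]))  # False
print(may_join("8.0.1", ["8.0.1"]))            # True
```

The rejection itself is the expected part; the bug is what happens to the existing members afterwards.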

How to repeat:
Consider 3 servers, {node1, node2 - 8.0} and {node3 - 5.7}

node1> start and bootstrap group

node2> CHM; start GR; now node1 and node2 are part of the group.

node3> CHM; start GR; --> this fails with:
       ERROR 3092 (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log.

node2> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 94a5b11c-f466-11e6-8fd0-34028669d426 | localhost   |       13000 | UNREACHABLE  |
| group_replication_applier | 94a70a48-f466-11e6-8f88-34028669d426 | localhost   |       13001 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+

node1> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 94a5b11c-f466-11e6-8fd0-34028669d426 | localhost   |       13000 | ONLINE       |
| group_replication_applier | 94a70a48-f466-11e6-8f88-34028669d426 | localhost   |       13001 | UNREACHABLE  |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
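
A check like the one done by eye above could be scripted. The following is a minimal sketch that parses the mysql client's table output as pasted in this report and lists the members reported as UNREACHABLE. The helper names are hypothetical, and it assumes the default column layout shown here.

```python
def parse_members(table_text):
    """Parse mysql-client table output into a list of row dicts."""
    rows = []
    header = None
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines, prompts and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first '|' line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def unreachable_members(table_text):
    """Return (host, port) pairs whose MEMBER_STATE is UNREACHABLE."""
    return [
        (row["MEMBER_HOST"], int(row["MEMBER_PORT"]))
        for row in parse_members(table_text)
        if row["MEMBER_STATE"] == "UNREACHABLE"
    ]


# The view from node2, copied from the output above.
NODE2_VIEW = """
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 94a5b11c-f466-11e6-8fd0-34028669d426 | localhost   |       13000 | UNREACHABLE  |
| group_replication_applier | 94a70a48-f466-11e6-8f88-34028669d426 | localhost   |       13001 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
"""

print(unreachable_members(NODE2_VIEW))  # [('localhost', 13000)]
```

Running the same helper on the node1 output would flag port 13001 instead, which is the mutual UNREACHABLE view described in this report.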

Attaching the server error logs.
[16 Feb 2017 17:58] Dhruthi Komarlu Vasudeva Murthy
Posted by developer:
 
Steps to run test:
(Test files are attached)

cp groupReplicationTest.xml mysql-test-extra/mysql-test/jet/src/com/sun/mysql/jet/tests/replication/

cp GroupReplicationTestCase.java mysql-test-extra/mysql-test/jet/src/com/sun/mysql/jet/testcases/replication/

To run test:

mysql-test/jet$ ant runjet -Dxmlfile=com/sun/mysql/jet/tests/replication/groupReplicationTest.xml -Dpropfile=jet.properties

where,

mysql-test/jet$ cat jet.properties
jet.testlogpath= /home/dhruthi/mysql-te/mysql-test/jet/JET-logs
jet.installpath.mysql@1= path to 8.0.1 binaries
jet.installpath.mysql@2= path to 8.0.1 binaries
jet.installpath.mysql@3= path to 5.7.18 binaries
[17 Feb 2017 10:43] Dhruthi Komarlu Vasudeva Murthy
Posted by developer:
 
Result of performance_schema.replication_group_members on node1, node2:

node1> start and bootstrap group.

node2> CHM; start GR;

node1> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST           | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| group_replication_applier | c36c9b34-f4fa-11e6-b854-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B |        3309 | ONLINE       |
| group_replication_applier | cb6d7c33-f4fa-11e6-8cc8-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B |        3310 | ONLINE       |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+

node2> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST           | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| group_replication_applier | c36c9b34-f4fa-11e6-b854-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B |        3309 | ONLINE       |
| group_replication_applier | cb6d7c33-f4fa-11e6-8cc8-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B |        3310 | ONLINE       |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+

node3>  CHM;start GR; --> this fails.

node1> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST           | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| group_replication_applier | c36c9b34-f4fa-11e6-b854-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B |        3309 | ONLINE       |
| group_replication_applier | cb6d7c33-f4fa-11e6-8cc8-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B |        3310 | UNREACHABLE  |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+

node2> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST           | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| group_replication_applier | c36c9b34-f4fa-11e6-b854-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B |        3309 | UNREACHABLE  |
| group_replication_applier | cb6d7c33-f4fa-11e6-8cc8-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B |        3310 | ONLINE       |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+

I am using debug builds for the testing.

Tested on commit:

commit 3c87f9804ab4ef8dbfeb0d33939fd9ee4073a573
Author: Deepa Dixit <deepa.dixit@oracle.com>
Date:   Thu Feb 16 12:10:14 2017 +0530

    Bug#18184868: CANNOT USE BOOTSTRAP VARIABLES IN -MASTER.OPT FILE
    

regards,
Dhruthi
[21 Apr 2017 7:44] Narendra Singh Chauhan
Posted by developer:
 
Hi Dhruthi,

I hit this issue too. For information, the following scenarios are blocked.
Prerequisite: group_replication_allow_local_disjoint_gtids_join is set to ON.

1. A group has two members (8.0.2) and a new member (5.7.19) tries to join.
 ISSUE#1: The new member always fails to join the group.
 ISSUE#2: Strangely, one of the existing 8.0.2 members becomes UNREACHABLE, so we need to use force_members to reconfigure the group.

2. [Same as #1, but may confuse users.] If a group starts with one member (5.7.19) and two new members (8.0.2) join, a group can be set up with 3 members. But if the 5.7.19 member then stops, it cannot rejoin the group. This does not look good: a member that was working fine earlier is no longer able to rejoin.
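
A likely reason force_members is needed in scenario 1: Group Replication requires a majority of members to be reachable before the group can make progress, and in a two-member group where one member is UNREACHABLE, the remaining member is not a majority on its own. A minimal sketch of that arithmetic (has_majority is a hypothetical helper, not a plugin function):

```python
def has_majority(reachable, total):
    """True when strictly more than half of the members are reachable."""
    return 2 * reachable > total


# Two-member group, each member only reaches itself: no majority.
print(has_majority(1, 2))  # False
# Three-member group with one member down: majority is kept.
print(has_majority(2, 3))  # True
```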

Regards,
Narendra
[4 Sep 2017 10:46] David Moss
Posted by developer:
 
Thank you for your feedback, this has been fixed in upcoming versions and the following was added to the 8.0.3 changelog:
Joining a member running a lower version to a group running a higher version resulted in the members running the higher version becoming unreachable.