| Bug #85026 | GROUP MEMBERS BECOMES 'UNREACHABLE' WHEN SERVER WITH LOWER VERSION TRIES TO JOIN | ||
|---|---|---|---|
| Submitted: | 16 Feb 2017 17:43 | Modified: | 4 Sep 2017 10:46 |
| Reporter: | Dhruthi Komarlu Vasudeva Murthy | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Group Replication | Severity: | S3 (Non-critical) |
| Version: | 8.0.1 | OS: | Any |
| Assigned to: | | CPU Architecture: | Any |
[16 Feb 2017 17:58]
Dhruthi Komarlu Vasudeva Murthy
Posted by developer:
Steps to run test (test files are attached):
cp groupReplicationTest.xml mysql-test-extra/mysql-test/jet/src/com/sun/mysql/jet/tests/replication/
cp GroupReplicationTestCase.java mysql-test-extra/mysql-test/jet/src/com/sun/mysql/jet/testcases/replication/
To run the test:
mysql-test/jet$ ant runjet -Dxmlfile=com/sun/mysql/jet/tests/replication/groupReplicationTest.xml -Dpropfile=jet.properties
where:
mysql-test/jet$ cat jet.properties
jet.testlogpath= /home/dhruthi/mysql-te/mysql-test/jet/JET-logs
jet.installpath.mysql@1= path to 8.0.1 binaries
jet.installpath.mysql@2= path to 8.0.1 binaries
jet.installpath.mysql@3= path to 5.7.18 binaries
[17 Feb 2017 10:43]
Dhruthi Komarlu Vasudeva Murthy
Posted by developer:
Result of performance_schema.replication_group_members on node1, node2:
node1> start and bootstrap group.
node2> CHM; start GR;
node1> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| group_replication_applier | c36c9b34-f4fa-11e6-b854-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B | 3309 | ONLINE |
| group_replication_applier | cb6d7c33-f4fa-11e6-8cc8-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B | 3310 | ONLINE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
node2> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| group_replication_applier | c36c9b34-f4fa-11e6-b854-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B | 3309 | ONLINE |
| group_replication_applier | cb6d7c33-f4fa-11e6-8cc8-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B | 3310 | ONLINE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
node3> CHM;start GR; --> this fails.
node1> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| group_replication_applier | c36c9b34-f4fa-11e6-b854-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B | 3309 | ONLINE |
| group_replication_applier | cb6d7c33-f4fa-11e6-8cc8-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B | 3310 | UNREACHABLE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
node2>select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| group_replication_applier | c36c9b34-f4fa-11e6-b854-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B | 3309 | UNREACHABLE |
| group_replication_applier | cb6d7c33-f4fa-11e6-8cc8-b86b23a4b9b9 | dhruthi-PORTEGE-Z30-B | 3310 | ONLINE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
I am using debug builds for the testing.
Tested on commit:
commit 3c87f9804ab4ef8dbfeb0d33939fd9ee4073a573
Author: Deepa Dixit <deepa.dixit@oracle.com>
Date: Thu Feb 16 12:10:14 2017 +0530
Bug#18184868: CANNOT USE BOOTSTRAP VARIABLES IN -MASTER.OPT FILE
regards,
Dhruthi
[21 Apr 2017 7:44]
Narendra Singh Chauhan
Posted by developer:
Hi Dhruthi,
I hit this issue too. For information, the following scenarios are blocked.
Prerequisite: group_replication_allow_local_disjoint_gtids_join is set to ON.
1. A group has two members (8.0.2) and a new member (5.7.19) tries to join.
ISSUE#1: The new member always fails to join the group.
ISSUE#2: Strangely, one of the existing 8.0.2 members becomes UNREACHABLE, so we need to use force_members to reconfigure the group.
2. [Same as #1, but may confuse users.] If a group starts with one member (5.7.19) and two new members (8.0.2) join, a group of 3 members can be set up. But if the 5.7.19 member then stops, it cannot rejoin the group. This looks bad, because a member that was working fine earlier is unable to rejoin.
Regards,
Narendra
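For reference, the force_members step mentioned above can be sketched roughly as follows. This is only an illustration and not taken from the attached test: the address '127.0.0.1:24901' is a hypothetical placeholder and would need to match the group_replication_local_address of the member that is kept.

```sql
-- Run on the member that should remain in the group.
-- '127.0.0.1:24901' is a hypothetical group communication address
-- (the member's group_replication_local_address, not its MySQL port).
SET GLOBAL group_replication_force_members = '127.0.0.1:24901';

-- Check the resulting membership.
SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE
  FROM performance_schema.replication_group_members;

-- Clear the variable once the group has been reconfigured.
SET GLOBAL group_replication_force_members = '';
```

Forcing membership bypasses the normal agreement between members, so it should only be used when the other members are known to be stopped or truly unreachable.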
[4 Sep 2017 10:46]
David Moss
Posted by developer:
Thank you for your feedback. This has been fixed in upcoming versions, and the following was added to the 8.0.3 changelog:
Joining a member running a lower version to a group running a higher version resulted in the members running the higher version becoming unreachable.

Description:
Consider a group of two 8.0.1 members. When a 5.7 server tries to join the group, it fails on the 'start GR' query, as expected. However, the existing group members go to the UNREACHABLE state.

How to repeat:
Consider 3 servers, {node1, node2 - 8.0} and {node3 - 5.7}.
node1> start and bootstrap group
node2> CHM; start GR;
Now node1 and node2 are part of the group.
node3> CHM; start GR; --> this fails with:
ERROR 3092 (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log.
node2> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 94a5b11c-f466-11e6-8fd0-34028669d426 | localhost | 13000 | UNREACHABLE |
| group_replication_applier | 94a70a48-f466-11e6-8f88-34028669d426 | localhost | 13001 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
node1> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 94a5b11c-f466-11e6-8fd0-34028669d426 | localhost | 13000 | ONLINE |
| group_replication_applier | 94a70a48-f466-11e6-8f88-34028669d426 | localhost | 13001 | UNREACHABLE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
Attaching the server error logs.
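The shorthand in the steps above ("start and bootstrap group", "CHM; start GR") can be spelled out roughly as below. This is a sketch rather than the attached test: it assumes the Group Replication plugin is already installed and configured on every server, and the recovery credentials 'rpl_user' / 'rpl_password' are hypothetical placeholders.

```sql
-- node1 (8.0): start and bootstrap the group.
SET GLOBAL group_replication_bootstrap_group = ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF;

-- node2 (8.0), then node3 (5.7): "CHM; start GR".
-- 'rpl_user' and 'rpl_password' are hypothetical recovery credentials.
CHANGE MASTER TO MASTER_USER = 'rpl_user', MASTER_PASSWORD = 'rpl_password'
  FOR CHANNEL 'group_replication_recovery';
START GROUP_REPLICATION;  -- on node3 this is the statement that fails with ERROR 3092

-- node1 and node2: list any member this server currently sees as unreachable.
SELECT MEMBER_ID, MEMBER_HOST, MEMBER_PORT, MEMBER_STATE
  FROM performance_schema.replication_group_members
 WHERE MEMBER_STATE = 'UNREACHABLE';
```

Before the fix, the final query would be expected to return the other 8.0 member on each of node1 and node2 once node3's join attempt has failed.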