Bug #86957 SETTING gr_force_members HITS ERROR1231 SOMETIMES WHILE UNBLOCKING THE GROUP
Submitted: 5 Jul 2017 12:49
Modified: 4 Jul 2018 12:00
Reporter: Dhruthi Komarlu Vasudeva Murthy
Status: Closed
Category: MySQL Server: Group Replication
Severity: S3 (Non-critical)
Version: 5.7.20
OS: Any
Assigned to:
CPU Architecture: Any

[5 Jul 2017 12:49] Dhruthi Komarlu Vasudeva Murthy
Description:
Unblocking a group with group_replication_force_members sometimes fails with an error.

Scenario:
1. Consider a group with 5 members.
2. Disconnect the network connection on 3 members (say M1, M2, M3).
3. The group now sees only 2 members (M4, M5) as ONLINE and is blocked due to majority loss.
4. Try to unblock the group by setting group_replication_force_members on M4.
   --> This step fails with the error:
ERROR 1231 (42000): Variable 'group_replication_force_members' can't be set to the value of 'brage09:30107,brage17:30109'
Note that this error occurs only sometimes, but once it occurs, retrying to set the variable fails with the same error.
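
The scenario above can be sketched as a small majority check. This is only an illustration, not part of the test: has_majority and force_members_statement are hypothetical helper names, and the two addresses are the ones quoted in the error message. The sketch only builds the statement that step 4 would run on M4 and does not connect to a server.

```python
def has_majority(reachable, group_size):
    """Return True if strictly more than half of the group is reachable."""
    return 2 * reachable > group_size


def force_members_statement(addresses):
    """Build the SET GLOBAL statement for a list of host:port addresses (hypothetical helper)."""
    return "SET GLOBAL group_replication_force_members = '{}';".format(",".join(addresses))


# Scenario from the report: 5 members, M1-M3 disconnected, M4 and M5 remain.
group_size = 5
reachable = ["brage09:30107", "brage17:30109"]  # addresses quoted in the error message

if not has_majority(len(reachable), group_size):
    # The group is blocked; this is the statement step 4 would run on M4.
    print(force_members_statement(reachable))
```

With 2 of 5 members reachable the check fails, which matches the blocked state described in step 3.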

From the error log of M4 (brage09 here):
....
2017-07-05T12:12:15.092705Z 25 [Note] Plugin group_replication reported: 'The group_replication_force_members value 'brage09:30107,brage17:30109' was set in the group communication interfaces'
2017-07-05T12:12:16.092067Z 0 [Note] Plugin group_replication reported: 'getstart group_id dd88d2dd'
2017-07-05T12:12:16.092313Z 0 [Note] Plugin group_replication reported: 'getstart group_id dd88d2dd'
2017-07-05T12:12:16.092398Z 0 [Note] Plugin group_replication reported: 'getstart group_id dd88d2dd'
2017-07-05T12:12:18.784653Z 0 [Note] Plugin group_replication reported: 'getstart group_id dd88d2dd'
2017-07-05T12:12:18.784773Z 0 [Note] Plugin group_replication reported: 'getstart group_id dd88d2dd'
2017-07-05T12:12:18.784838Z 0 [Note] Plugin group_replication reported: 'getstart group_id dd88d2dd'
2017-07-05T12:12:18.784899Z 0 [Note] Plugin group_replication reported: 'getstart group_id dd88d2dd'
2017-07-05T12:12:18.784968Z 0 [Note] Plugin group_replication reported: 'getstart group_id dd88d2dd'
2017-07-05T12:12:21.681652Z 0 [Note] Plugin group_replication reported: 'getstart group_id dd88d2dd'
2017-07-05T12:12:25.613732Z 0 [Note] Plugin group_replication reported: 'Failure reading from fd=-1 n=18446744073709551615'
2017-07-05T12:12:25.613777Z 0 [Note] Plugin group_replication reported: 'Failure reading from fd=-1 n=18446744073709551615'
2017-07-05T12:12:25.613790Z 0 [Note] Plugin group_replication reported: 'Failure reading from fd=-1 n=18446744073709551615'
2017-07-05T12:13:15.092951Z 25 [ERROR] Plugin group_replication reported: 'Timeout on wait for view after setting group_replication_force_members value 'brage09:30107,brage17:30109' into group communication interfaces'
2017-07-05T12:13:19.056502Z 28 [Note] Plugin group_replication reported: 'Going to wait for view modification'
2017-07-05T12:13:49.056984Z 0 [ERROR] Plugin group_replication reported: '[GCS] Timeout while waiting for the group communication engine to exit!'
2017-07-05T12:13:49.057076Z 0 [ERROR] Plugin group_replication reported: '[GCS] The member has failed to gracefully leave the group.'
....
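
To make the timing in the log excerpt easier to read, the sketch below computes the gap between two pairs of messages copied from it. It only does arithmetic on the quoted timestamps. The helper names log_timestamp and seconds_between are hypothetical, and reading the gaps as fixed wait timeouts is an assumption, not something the log states.

```python
from datetime import datetime

# Lines copied from the M4 error log above.
SET_LINE = ("2017-07-05T12:12:15.092705Z 25 [Note] Plugin group_replication reported: "
            "'The group_replication_force_members value 'brage09:30107,brage17:30109' "
            "was set in the group communication interfaces'")
TIMEOUT_LINE = ("2017-07-05T12:13:15.092951Z 25 [ERROR] Plugin group_replication reported: "
                "'Timeout on wait for view after setting group_replication_force_members value "
                "'brage09:30107,brage17:30109' into group communication interfaces'")
WAIT_LINE = ("2017-07-05T12:13:19.056502Z 28 [Note] Plugin group_replication reported: "
             "'Going to wait for view modification'")
EXIT_TIMEOUT_LINE = ("2017-07-05T12:13:49.056984Z 0 [ERROR] Plugin group_replication reported: "
                     "'[GCS] Timeout while waiting for the group communication engine to exit!'")


def log_timestamp(line):
    """Parse the leading UTC timestamp of an error log line (hypothetical helper)."""
    return datetime.strptime(line.split(" ", 1)[0], "%Y-%m-%dT%H:%M:%S.%fZ")


def seconds_between(first, second):
    """Elapsed seconds from the first log line to the second."""
    return (log_timestamp(second) - log_timestamp(first)).total_seconds()


# Gap between setting the variable and the view-wait timeout (about 60 seconds).
print(seconds_between(SET_LINE, TIMEOUT_LINE))
# Gap between the view-modification wait and the GCS exit timeout (about 30 seconds).
print(seconds_between(WAIT_LINE, EXIT_TIMEOUT_LINE))
```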

How to repeat:
Please find the attached diff.
(This test must be run in distributed mode.)

$ git clone git@myrepo.no.oracle.com:mysql-test-extra

$ cd mysql-test-extra/mysql-test/jet

(apply attached patch)
$ git apply wl9279_majorityloss.diff

(edit nw_test.properties and specify appropriate values for,
 jet.testlogpath= /log/path
 jet.installpath.mysql= /path/to/mysql/binaries )

$ ant clean;ant all

$ ant run -Dxmlfile=com/sun/mysql/jet/tests/replication/groupReplicationNetworkDropTest.xml -Dpropfile=nw_test.properties

Logs can be found in the jet-log folder specified by "jet.testlogpath" in the properties file.

Note: Since the reported issue is sporadic, please run the test 2-3 times to hit the issue.

Tested on commit:
(on 5.7)
commit c3d1fa60a78a8b98465c6c08d9999aa97266eee5
Author: Arun Kuruvila <arun.kuruvila@oracle.com>
Date:   Wed Jul 5 11:01:20 2017 +0530

    Bug#25380000: SOME WARNINGS APPEAR IN DUMP FROM MYSQLDUMP

(on trunk)
commit 016a11ccb2c670783a9efd194ec1231da73a8e21
Author: Marc Alff <marc.alff@oracle.com>
Date:   Mon Jul 3 10:53:52 2017 +0200

    Bug#26162562 DISABLING PERFORMANCE_SCHEMA.SETUP_OBJECT DROP TABLE STATS

[6 Jul 2017 15:47] Tiago Jorge
Posted by developer:
 
Hi,

I assume that you are talking about networkDropWithMajorityLoss, correct?

Just a quick question: How is the network physically dropped? Do you ensure that there is no connectivity at all to the old members?

Regards,

Tiago J.

[4 Jul 2018 12:00] David Moss
Posted by developer:
 
Thank you for your feedback. This has been fixed in upcoming versions, and the following was added to the 8.0.12 changelog:
Using group_replication_force_members to unblock a group, for example after losing majority, sometimes failed with error 1231.