Bug #86359 Avoid setting group_replication_force_members in a partition that holds majority
Submitted: 17 May 2017 14:53 Modified: 15 Sep 2017 14:52
Reporter: Tiago Jorge
Status: Closed
Category:MySQL Server: Group Replication Severity:S3 (Non-critical)
Version:5.7.18 OS:Any
Assigned to: CPU Architecture:Any

[17 May 2017 14:53] Tiago Jorge
Description:
The Group Replication documentation (https://dev.mysql.com/doc/refman/5.7/en/group-replication-unblocking-a-partition.html) states:
"When forcing a new membership configuration, make sure that whatever servers are going to be forced out of the group are indeed stopped."

Currently, Group Replication allows this variable to be set even when that premise is not respected, which causes sporadic instability in tests. The problem is more visible on slower machines, where the forced configuration can coexist for a long time with a node that should already be dead, until that node finally goes away.

This can also cause a lot of pain if done by a DBA, even unintentionally.

How to repeat:
Run gr_force_peer_addresses_3_to_2 on slow machines in parallel with different tests to see it fail sporadically.

Suggested fix:
When setting group_replication_force_members, check whether the member where the operation is being performed belongs to a partition that holds a majority. In that case, setting group_replication_force_members should fail.
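
For illustration only, the suggested check can be sketched in Python as below. The function names and the idea of passing in member counts directly are assumptions made for this sketch; it is not the server's actual implementation, and it assumes "reachable" counts the local member itself.

```python
def holds_majority(total_members: int, reachable_members: int) -> bool:
    """Return True if the reachable members (including the local one)
    form a strict majority of the last known group membership."""
    return reachable_members > total_members // 2


def can_set_force_members(total_members: int, reachable_members: int) -> bool:
    """Hypothetical guard: allow forcing a new membership only when
    the local partition has lost the majority."""
    return not holds_majority(total_members, reachable_members)


# 3-member group, local member still sees one peer: majority holds, so reject.
print(can_set_force_members(total_members=3, reachable_members=2))
# 3-member group, local member is alone: majority lost, so allow.
print(can_set_force_members(total_members=3, reachable_members=1))
```

With an even group size, exactly half of the members is not a majority, so a 2-of-4 partition would still be allowed to force a new membership under this sketch.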
[15 Sep 2017 14:52] David Moss
Posted by developer:
 
Thank you for your feedback. This has been fixed in upcoming versions, and the following was added to the 5.7.20 / 8.0.3 changelog:
group_replication_force_members could be used in situations where the group was working properly, in other words a majority was reachable. This incorrect use could cause instability in the group. Therefore, its use has been restricted to the scenario for which it was created: for a subset of previous membership when a majority of the members are unreachable.