Bug #110537 MySQL cluster fail to auto recover
Submitted: 29 Mar 2023 4:02 Modified: 31 Mar 2023 0:37
Reporter: zetang zeng (OCA)
Status: Can't repeat
Category:MySQL Server: Group Replication Severity:S1 (Critical)
Version:5.7.40 OS:Any
Assigned to: MySQL Verification Team CPU Architecture:Any
Tags: group replication

[29 Mar 2023 4:02] zetang zeng
Description:
Assume a three-node cluster with members at ip1, ip2, and ip3, and a network that fails intermittently.

In the normal case, if the member at ip1 is expelled and the members at ip2 and ip3 then fail to connect to each other, the members at ip2 and ip3 block and wait for the connection to come back, and the cluster recovers automatically.

In the buggy case, however, the members at ip1 and ip2 are both expelled and the cluster fails to recover. The cluster goes out of service and someone has to restart it manually.
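
The "block and wait" behaviour expected in the normal case can be illustrated with a simplified majority check. This is only a sketch of the general quorum rule, not the plugin's actual implementation; the function names and the state labels `can_progress` and `blocked` are hypothetical and chosen for this example.

```python
def has_majority(group_size: int, reachable: int) -> bool:
    """Return True if `reachable` members (counting the member itself)
    form a strict majority of a group with `group_size` members."""
    return reachable > group_size // 2


def member_state(group_size: int, reachable: int) -> str:
    # Hypothetical labels for this sketch, not real plugin states.
    return "can_progress" if has_majority(group_size, reachable) else "blocked"


# Three-member group, one member unreachable: the other two keep a majority.
print(member_state(3, 2))  # can_progress

# Two-member group (after ip1 was expelled) where ip2 and ip3 cannot reach
# each other: each side sees only itself, so neither side has a majority.
print(member_state(2, 1))  # blocked
```

Under this model neither ip2 nor ip3 could expel the other once ip1 is gone, which is why the expulsion of ip2 in the buggy case is unexpected.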

Timeline of buggy case:
- 52:47 the member at ip1 is expelled: [ERROR] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
- 53:23 the member at ip1 tries to rejoin the cluster but fails: [ERROR] Plugin group_replication reported: 'There was a previous plugin error while the member joined the group. The member will now exit the group.'
- 53:36 the member at ip2 is expelled too!
...
Later the network recovered, but the cluster stayed down because only one member remained in the group and the other two members did not rejoin automatically.

How to repeat:
Still trying to repeat

Suggested fix:
The members at ip2 and ip3 should block and wait for the connection to come back, so that the cluster can recover automatically.
[29 Mar 2023 8:01] zetang zeng
More detailed timeline of the buggy case:

- 52:47 the member at ip1 is expelled: [ERROR] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
- 53:23 the member at ip1 tries to rejoin the cluster but fails: [ERROR] Plugin group_replication reported: 'There was a previous plugin error while the member joined the group. The member will now exit the group.'
- 53:23 the member at ip3 fails to connect to the member at ip2: [Warning] Plugin group_replication reported: 'Members removed from the group: ip2:3406'
[Note] Plugin group_replication reported: 'Group membership changed to ip1:3406, ip3:3406 on view 16783545669642702:22.'

- 53:36 the member at ip2 finds that it has been expelled too!
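
When comparing logs from several members, it can help to pull the member lists and view identifiers out of messages like the two quoted above. The following is a minimal sketch that matches only those two message bodies. The helper names are hypothetical, and treating the number after the colon in the view identifier as a sequence counter is an assumption made for this example.

```python
import re

# Patterns for two message bodies quoted in this report.
_VIEW = re.compile(r"Group membership changed to (.+) on view (\d+):(\d+)\.")
_REMOVED = re.compile(r"Members removed from the group: ([^']+)")


def parse_view_change(line):
    """Return (members, view_prefix, view_seq), or None if the line does not match."""
    m = _VIEW.search(line)
    if m is None:
        return None
    members = [s.strip() for s in m.group(1).split(",")]
    return members, m.group(2), int(m.group(3))


def parse_removed(line):
    """Return the list of removed members, or None if the line does not match."""
    m = _REMOVED.search(line)
    if m is None:
        return None
    return [s.strip() for s in m.group(1).split(",")]


# The two messages logged by the member at ip3, copied from the timeline.
removed_line = ("[Warning] Plugin group_replication reported: "
                "'Members removed from the group: ip2:3406'")
view_line = ("[Note] Plugin group_replication reported: "
             "'Group membership changed to ip1:3406, ip3:3406 "
             "on view 16783545669642702:22.'")

print(parse_removed(removed_line))
print(parse_view_change(view_line))
```

Running the same helpers over the error logs of all three members and sorting the results by view sequence would show which member saw which membership at each step.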
[31 Mar 2023 0:37] MySQL Verification Team
Hi,

I cannot reproduce this. Can you provide some kind of reproducible test case? Whatever I try, the system behaves as expected.