Bug #97987 | split brain:1 node isolated (SECONDARY-ONLINE) but unreachable with RO router | ||
---|---|---|---|
Submitted: | 14 Dec 2019 12:20 | Modified: | 18 Dec 2019 13:46 |
Reporter: | lionel mazeyrat | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Router | Severity: | S2 (Serious) |
Version: | 8.0.18 | OS: | Windows |
Assigned to: | MySQL Verification Team | CPU Architecture: | x86 |
[14 Dec 2019 12:20]
lionel mazeyrat
[14 Dec 2019 12:40]
lionel mazeyrat
I forgot to mention : group_replication_exit_state_action = READ_ONLY
[18 Dec 2019 12:05]
Frederic Descamps
Hi Lionel,

First, I would like to correct the term "split-brain": the situation you are describing is a network partition. The remaining node is in a minority partition because it cannot reach quorum (it is 1 of 3 members).

Now that this is clear: MySQL Router will NEVER route connections to a member/node that is in a minority partition. So this is not a bug but how it MUST work.

Regards,
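To make the quorum arithmetic in the comment above concrete, here is a minimal sketch of a strict-majority check. The function name `has_quorum` and its parameters are hypothetical and only illustrate the rule; this is not how Group Replication or MySQL Router implement it.

```python
def has_quorum(reachable_members: int, group_size: int) -> bool:
    """Return True if the reachable members form a strict majority of the group."""
    if group_size <= 0 or not 0 <= reachable_members <= group_size:
        raise ValueError("expected group_size > 0 and 0 <= reachable_members <= group_size")
    # A strict majority means more than half of the configured group.
    return reachable_members > group_size // 2


# The isolated node in this report can only see itself in a 3-member group.
print(has_quorum(1, 3))  # False: minority partition
print(has_quorum(2, 3))  # True: the other two members keep quorum
```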
[18 Dec 2019 12:17]
Kenny Gryp
* The subject mentions 'Split Brain'. However, from what I read in the bug description, there is no split brain at all, there is just a network partition where 1 member cannot see the other members. Please refer to that as a network partition. A split brain refers to having 2 partitions accepting writes, resulting in inconsistent datasets * MySQL Router removes all connections to a member that is network partitioned and not part of the minority group. That is the only possible behavior at this moment. `group_replication_exit_state_action` has no impact on this. Also, for network partition handling with full automatic rejoin, as best practice, I suggest changing only these settings: ``` group_replication_aurorejoin_tries=3 group_replication_member_expel_timeout=5 ``` Please look at https://www.slideshare.net/Grypyrg/mysql-innodb-cluster-new-features-in-80-releases-best-p... especially the Network Partition Handling chapter from slide 46 to 52, it explains how this all works and what the best practice is.