Bug #115051 Improve behaviour of GR primary promotion selection between members
Submitted: 17 May 2024 15:22
Modified: 18 May 2024 10:26
Reporter: Simon Mudd (OCA)
Status: Verified
Category: MySQL Server: Group Replication
Severity: S4 (Feature request)
Version: 8.0, 8.4
OS: Any
Assigned to:
CPU Architecture: Any

[17 May 2024 15:22] Simon Mudd
Description:
MySQL, when choosing a new primary in a GR cluster, requires that the primary can ONLY be a server with the LOWEST MySQL version.

Given that work is being done to make all new 8.0 versions equivalent to each other, and all 8.4 versions equivalent to each other, this limitation should be removed.

- upgrading / downgrading binaries is now possible in 8.0.35+ (previously only upgrading was possible)
- native cloning from a different source version will be possible for versions of MySQL >= 8.0.37

I believe (but haven't seen 8.4.1 yet) that the same is expected with all 8.4 versions too.

Why is this important?
- a greater choice of which GR member can be made primary is important
- all other members of the cluster would become candidates
- when doing a rolling upgrade with a single leader, you HAVE to remove the last server to be upgraded from the cluster WHILE it's serving write traffic to users. The impact of that is greater than switching the primary while keeping the server running, and then removing the member as a secondary while it handles no customer traffic.
- testing a new version (perhaps prior to a full upgrade) is easier, as I can upgrade a single node to the new version, test that it behaves correctly as a secondary, promote it to primary, and confirm that its behaviour as a primary is correct. If the behaviour is problematic I can revert to a primary running the older version and report the issue seen.

The current selection process requires you, when upgrading, to be aware of the versions of all cluster members, because the choice of potential primaries shrinks as you proceed with the upgrades. If you do not want to treat all potential primaries as equal (they may be located in different places: racks, DCs, AZs, etc.), then you need to pay even more careful attention to this. That should not be necessary.

How to repeat:
Spin up 3 GR members running any 8.0 version. Upgrade one of them to a newer version. Try to promote the upgraded member to primary and see that it is not allowed.

The same is expected to apply to 8.4, or will once 8.4.1 is released and can be tested against an 8.4.0 GR cluster.
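
The promotion attempt in the steps above can be made with the group_replication_set_as_primary() function, and each member's version and role can be read from performance_schema.replication_group_members. A minimal Python sketch that only builds the SQL statements to run on a group member; the helper name promotion_statements is hypothetical and no connection to MySQL is made:

```python
import uuid


def promotion_statements(member_uuid: str) -> list[str]:
    """Return the SQL statements for a manual promotion attempt.

    Hypothetical helper: it only builds strings and never connects to MySQL.
    """
    # Validate and normalise the server UUID before embedding it in SQL.
    member = str(uuid.UUID(member_uuid))
    members_query = (
        "SELECT MEMBER_ID, MEMBER_HOST, MEMBER_ROLE, MEMBER_VERSION "
        "FROM performance_schema.replication_group_members"
    )
    return [
        members_query,  # note each member's version and current role
        f"SELECT group_replication_set_as_primary('{member}')",  # the attempt
        members_query,  # check whether the primary actually changed
    ]
```

Pass the server UUID of the upgraded member; a string that is not a valid UUID raises ValueError before any SQL is produced.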

Suggested fix:
Given that the ON DISK format is now equivalent or compatible, and that it is possible to CLONE between any 8.0.37+ versions (up or down):

Modify the GR primary selection mechanism to allow ANY member to be chosen as the new primary when the GR cluster consists entirely of 8.0.37+ members, or entirely of 8.4.0+ members.
[17 May 2024 17:53] Simon Mudd
Note: if the primary is not explicitly chosen, the existing rules should be applied for hosts of the same version, which I believe are based on priority and then on performance_schema.replication_group_members.member_id or equivalent.
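
One possible reading of the requested behaviour, as a minimal Python sketch and not the server's implementation. The names Member, interchangeable and choose_primary are hypothetical, the weight field stands in for the "priority" mentioned above, and the fallback for automatic selection (lowest version, then highest weight, then lowest member_id) is an assumption:

```python
from dataclasses import dataclass
from typing import Optional

Version = tuple[int, int, int]


@dataclass(frozen=True)
class Member:
    member_id: str      # server UUID, as in replication_group_members
    version: Version    # e.g. (8, 0, 37)
    weight: int = 50    # stand-in for the election priority; higher wins


def interchangeable(members: list[Member]) -> bool:
    """Assumed reading of the request: all 8.0.37+ or all 8.4.0+."""
    series = {m.version[:2] for m in members}
    if series == {(8, 0)}:
        return all(m.version >= (8, 0, 37) for m in members)
    return series == {(8, 4)}


def choose_primary(members: list[Member],
                   requested: Optional[str] = None) -> Member:
    lowest = min(m.version for m in members)
    if requested is None:
        # No explicit choice: keep the existing automatic rules (assumed).
        candidates = [m for m in members if m.version == lowest]
        return min(candidates, key=lambda m: (-m.weight, m.member_id))
    by_id = {m.member_id: m for m in members}
    if requested not in by_id:
        raise ValueError(f"unknown member {requested}")
    target = by_id[requested]
    # Explicit choice: any member is allowed once versions are interchangeable;
    # otherwise only a member running the lowest version is accepted.
    if target.version == lowest or interchangeable(members):
        return target
    raise ValueError(f"{requested} does not run the lowest version")
```

With members on 8.0.37, 8.0.38 and 8.0.39, an explicit request for the 8.0.39 member succeeds in this sketch, while adding an 8.0.36 member to the group makes the same request fail.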
[18 May 2024 10:26] MySQL Verification Team
Hello Simon,

Thank you for the feature request!

regards,
Umesh
[24 Jan 11:43] MySQL Verification Team
Posted internally by developer:

IN-PLACE SECONDARY UPGRADES AND DOWNGRADES

In Group Replication, when a group is running an LTS (Long Term
Support) version, in-place upgrades and downgrades are possible within
the LTS series.

This means that if a member with a different patch version joins the
group, it can join as a secondary member regardless of its patch
version number.

If the joining member has a higher version number, it will not be
promoted to primary on a failover/switchover.

PRIMARY PROMOTION DOWNGRADE

The promotion rules allow a primary to be downgraded,
i.e. a secondary with a lower version can become the new primary.
In fact, after the promotion, it is guaranteed that the primary
has the lowest version of the group.

The promotion rules do not allow a primary to be upgraded to a
version higher than any secondary of the group.

LTS versions are supported for an extended period and they are
replication safe, even from a higher version to a lower version within
the same LTS series. However, if there is a bug or an exception to
this rule, it may cause problems. To ensure that the effects of any
such bug or exception are limited to the joining member and do not
affect the whole group, primary election does not allow members with
a higher version number to become the primary.
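
The promotion rule described above can be illustrated with a minimal Python sketch, assuming versions are compared as (major, minor, patch) tuples. The function name eligible_primaries and the host names are hypothetical, and this is not the server's code:

```python
Version = tuple[int, int, int]


def eligible_primaries(versions: dict[str, Version]) -> list[str]:
    """Members that may be promoted: only those on the lowest version."""
    lowest = min(versions.values())
    return sorted(host for host, v in versions.items() if v == lowest)


# Rolling upgrade of a three-member group, one secondary at a time.
group = {"a": (8, 4, 0), "b": (8, 4, 0), "c": (8, 4, 0)}
for host in ("c", "b"):
    group[host] = (8, 4, 1)
    print(host, "upgraded ->", eligible_primaries(group))
# c upgraded -> ['a', 'b']
# b upgraded -> ['a']
```

Once only "a" remains on the old version, it is the sole member that can hold the primary role until it is upgraded itself, which is the shrinking choice the report describes.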