Bug #103619 Improve flexibility of switching 8.0 versions
Submitted: 7 May 2021 7:02 Modified: 10 May 2021 7:42
Reporter: Simon Mudd (OCA)
Status: Verified
Category: MySQL Server: Installing Severity: S4 (Feature request)
Version: 8.0 OS: Any
Assigned to: CPU Architecture: Any
Tags: 8.0, documentation, downgrades, flexibility, on-disk format, replication, SQL, upgrades

[7 May 2021 7:02] Simon Mudd
In MySQL 5.7 and earlier, within a major version, it was possible to swap one version for another as all versions were "almost" compatible. Exceptions were usually made to resolve significant issues that required a fix, but these were very rare.

The 8.0 development model provides incremental feature upgrades, some (but not all) of which are not backwards compatible. The incremental addition of new features has enhanced the product, allowing the developers to provide new features without being restrained by the 100% compatibility rule followed previously. As part of this change, the server upgrades the data dictionary version to match the server version and refuses to start if it sees that the on-disk data dictionary version corresponds to a version later than the binaries being run. The valid concern here is that the on-disk data may not be understood.

However, this change has meant that downgrades are impossible. Once you start a new MySQL 8.0.XX version against the data dictionary, it becomes impossible to use an older binary, as it will refuse to start up.

In practice not all "upgrades" actually trigger incompatible on-disk changes. The number of 8.0 upgrades with such changes, or with changes that might break replication, is minimal, so what has happened is that there is an artificial limit on returning to older binaries.
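
As a rough illustration of the startup rule described above, here is a minimal sketch in Python. It is not MySQL's actual code: the function names are hypothetical, and comparing a single version string is a simplification of whatever checks the server really performs.

```python
# Hypothetical model of the startup rule described in this report.
# Names and logic are illustrative assumptions, not MySQL internals.

def parse_version(text):
    """Turn '8.0.24' into (8, 0, 24) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def can_start_strict(binary_version, on_disk_version):
    """Refuse to start when the data on disk was last written by a
    version later than the binary being run."""
    return parse_version(on_disk_version) <= parse_version(binary_version)


if __name__ == "__main__":
    # Upgrade: an 8.0.24 binary against data last used by 8.0.23.
    print(can_start_strict("8.0.24", "8.0.23"))
    # Attempted downgrade: an 8.0.23 binary against data touched by 8.0.24.
    print(can_start_strict("8.0.23", "8.0.24"))
```

Under this rule the second call is rejected even if nothing on disk actually changed between the two versions, which is the artificial limit referred to above.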

Why does this matter?

It is good to test a new version when it's released, and upgrading may be necessary for security reasons. However, new releases may have bugs. While these bugs may affect few people, those of us managing large server fleets, where the servers are busy and the workloads are varied, are more likely to bump into them. We also have to recognise that bugs happen and it's hard for even Oracle to test all scenarios.

So if we hit a bug, the on-disk format is unchanged, and the previous version(s) don't suffer from it, it is much more convenient to switch back to the previous working version and use that while the issue is addressed. The old version may have other bugs, but if they don't affect us or we can work around them more easily, it may still be better to go back for what would be expected to be a short period of time.

With 8.0 that is no longer possible and I have seen issues with that on a number of occasions.

I think the developers forget an important aspect of upgrades (or downgrades): the amount of work required to do this. Copying data takes a long time, and because of the current rules you cannot use a server running the new version as the source when going back. Native cloning has the same restrictions: it works really well, but the strict versioning rules mean it often cannot be used.

Configuration changes to the server, on the other hand, are much simpler to implement: you often need to change settings to use new features, so removing those settings to go back to previous behaviour is easy.

I would like to see 8.0 modified to be more flexible and to allow such downgrades when such issues occur and there are no incompatible on-disk changes or replication-breaking changes.

This is *not* a request for Oracle to break their incremental change model, which is good and provides a faster rate of improvements reaching the server.

If you want users to use new features then you want them to upgrade. People will upgrade if they perceive the risk of doing so to be lower. With the current setup the risk is higher because you can never downgrade, so the end result is that it likely puts the brakes on people upgrading, just the opposite of what you want.

Again, incompatible changes will happen from time to time. These need to be treated with more care, but I think that the number of these in the 8.0 GA releases from 8.0.11 to 8.0.24 has not been that large. There have been changes, but most of them affect syntax or configuration and do not affect the on-disk format or prevent replication working between two 8.0.X servers.

So keep up the good work, but allow us to downgrade on the occasions when it may be necessary.

Really it's not the "downgrade part" that's interesting: it's the simplification and flexibility of being able to swap "equivalent" binaries to work around issues, or to test whether the same issue manifests itself on a different version or disappears on another. It will not always be possible to do this. In many cases it will be, and a more flexible way of managing this would help considerably.

How to repeat:
* Use MySQL a lot
* Upgrade frequently
* Catch a bug, want to downgrade, and be unable to do so

Suggested fix:
I don't think this requires that much change. I suspect most of this is about labelling of the data dictionary and documenting such changes.

Things to consider:

* Document incompatible on-disk changes right at the beginning of each change log, as looking for such changes is really hard. I spent a while searching for this information from 8.0.15 to 8.0.24 and could not find clear indications of this without stopping and reading through the WHOLE document. That should not be necessary. Something as simple as "incompatible on-disk changes: none" or "changes to redo log format will not be understood by previous versions of MySQL" would do.

* Document any incompatibilities in [group] replication which prevent an old version of the server replicating from a new one (for the same major version). If new features are not enabled, or can be disabled, then these incompatibilities can be worked around and so are easy to solve.

* SQL incompatibilities/syntax changes are not really a concern. In most cases code can be adapted to remove the new features, and many users may not even use the new syntax, so going back is not an issue. If new features are used then clearly they must be turned off or not used by the developer/user when downgrading or changing versions. Again this is "easy" to achieve. It is good to document SQL incompatibilities (syntax changes) just so that users can see this. Current documentation seems to do that pretty well.

* The data dictionary seems to be labelled with the version of the binaries that use it. Consider changing this to label the data dictionary with the latest version in which an incompatible on-disk change has taken place. Later versions of mysqld would use the SAME data dictionary label until a new incompatible change is made. Any version of mysqld which uses the same labelled/base version should understand what is on disk (as there's no difference) and be able to work from it. Cloning between such versions (file system copying) should work fine and native cloning should also work, as the on-disk data format does not change.
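
To make the labelling idea in the last point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the list of versions with incompatible on-disk changes is invented (it is not taken from the release notes), and the function names are hypothetical rather than anything in mysqld.

```python
from bisect import bisect_right

# Invented example data: versions assumed to have introduced an
# incompatible on-disk change. NOT taken from real MySQL release notes.
INCOMPATIBLE_AT = [(8, 0, 11), (8, 0, 16), (8, 0, 21)]


def parse_version(text):
    """Turn '8.0.24' into (8, 0, 24) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def format_label(server_version):
    """Label a server with the latest version at or before it that
    introduced an incompatible on-disk change."""
    version = parse_version(server_version)
    index = bisect_right(INCOMPATIBLE_AT, version)
    if index == 0:
        raise ValueError("version predates the first known format")
    return INCOMPATIBLE_AT[index - 1]


def can_open(binary_version, on_disk_label):
    """A binary understands data whose label is not newer than its own."""
    return on_disk_label <= format_label(binary_version)


if __name__ == "__main__":
    label = format_label("8.0.24")      # data last written by 8.0.24
    print(label)
    print(can_open("8.0.22", label))    # same label: swap is allowed
    print(can_open("8.0.20", label))    # older format: still refused
```

With the invented data above, 8.0.22 and 8.0.24 share a label and could be swapped freely, while 8.0.20 would still be refused because a format change sits between it and the data on disk.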
[10 May 2021 7:42] MySQL Verification Team
Hello Simon,

Thank you for the reasonable feature request!