Bug #116513 Clone plugin does not allow cross-architectural cloning - docs don't explain why
Submitted: 30 Oct 2024 22:37 Modified: 1 Jun 13:26
Reporter: Simon Mudd (OCA)
Status: Verified
Category: MySQL Server: Clone Plugin
Severity: S4 (Feature request)
Version: 8.0, 8.4+
OS: Any
Assigned to:
CPU Architecture: Any

[30 Oct 2024 22:37] Simon Mudd
Description:
I noticed the following when trying to clone from a PC I own to an ARM OCI instance:

mysql> clone instance from 'some_user'@'1.2.3.4':3306 identified by 'somepassword' ;
ERROR 3866 (HY000): Clone Donor platform: x86_64 is different from Recipient platform: aarch64.
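
For reference, here is a minimal sketch of checking this before attempting a clone. It assumes the platform named in the error corresponds to the version_compile_machine system variable (and the operating system to version_compile_os). That mapping is an assumption and is not stated in the documentation pages listed below.

```sql
-- Run on both the donor and the recipient, then compare the two result rows.
-- Assumption: the clone platform check is based on these system variables.
SELECT @@version                 AS server_version,
       @@version_compile_os      AS compile_os,
       @@version_compile_machine AS compile_machine;
```

In the failing case above, the donor would be expected to report x86_64 and the recipient aarch64, matching the error text.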

OK, so let's go and find out why.

- https://dev.mysql.com/doc/refman/8.4/en/innodb-architecture.html has some nice diagrams
- https://dev.mysql.com/doc/refman/8.4/en/innodb-on-disk-structures.html is also interesting
- https://dev.mysql.com/doc/refman/8.4/en/clone-plugin-remote.html is also interesting
- https://dev.mysql.com/doc/refman/8.4/en/clone-plugin-remote.html#clone-remote-prerequisite... is also interesting

I see no clear indication that:
- MySQL on-disk storage may differ depending on the CPU architecture (nor any description of what those physical on-disk differences look like or where they are).
- native cloning may be affected by this.

Do on-disk structures actually vary between architectures these days? That might have been true in the past between 32-bit and 64-bit. Is it still the case now? If so, where is it documented? Is it applicable in all cases? If not, should MySQL be able to figure out where the architectural differences are not important?

If this is no longer true, then consider allowing native cloning to work where there are no actual differences. That would make processes that rely on native cloning much more straightforward.

I believe that replication is architecturally safe. Or is that not true either? I'm curious now, and unsure.

- Native cloning is one of the fastest "over the network" methods I'm aware of to copy data between instances, and it can at least max out a 10GbE network card. Most other methods tend to be far slower.

- MySQL Shell does a parallel logical dump and load, so I guess it is not too bad, but I believe it's still slower than native cloning.

How to repeat:
See description.

Suggested fix:
Consider the following:

- improve the documentation to make this cross-architectural restriction more explicit. These days x86_64 may be the most popular platform, but aarch64 is gaining traction, so cross-architectural cloning may become more common.

- improve the documentation that covers MySQL on-disk formats and so on to highlight the architectural differences if they exist: what they are and which types of files they affect.

- where possible, fix native cloning to handle architectural differences. If you know how to fix this, perhaps it can be handled as the data is copied over? That would add flexibility.
[30 Oct 2024 22:38] Simon Mudd
The example provided earlier was between two servers running 8.4.3. I've also seen this on 8.0 and assume that 9.X behaves the same way, though I haven't verified this.
[31 Oct 2024 5:25] MySQL Verification Team
Hello Simon,

Thank you for the feature request!

regards,
Umesh
[31 Oct 2024 9:00] Simon Mudd
Note that this is a bug report, as the documentation is incomplete. Do I need to make a separate request for that?

[ It's a feature request to improve the native clone functionality. ]
[1 Jun 13:26] Simon Mudd
Related: bug#118323 to enhance native cloning to make it work cross-architecture (where possible).