MySQL Bugs: #103111: Improve documentation for native cloning and suggest configuration improvements

Bug #103111	Improve documentation for native cloning and suggest configuration improvements
Submitted:	25 Mar 2021 14:22	Modified:	26 Mar 2021 4:23
Reporter:	Simon Mudd (OCA)
Status:	Verified
Category:	MySQL Server: Documentation	Severity:	S3 (Non-critical)
Version:	8.0	OS:	Any
Assigned to:		CPU Architecture:	Any
Tags:	native_cloning

Description:
I have dedicated servers used for cloning and would like to ensure I use them as efficiently and quickly as possible.
Native cloning performs well compared with other cloning methods, so I am using it.

I'm currently using it with default settings.

However, neither the source nor the destination server is fully loaded, and the network is not saturated, so there is room for improvement.

I could not find good guidance on how best to adjust the settings.

https://dev.mysql.com/doc/refman/8.0/en/clone-plugin-options-variables.html describes the settings but gives no guidance on tuning them for performance.

I would like to see suggestions for improving cloning speed when other activity on the source server is considered secondary.

How to repeat:
Use native cloning and think about how to improve performance.
Then ask yourself which settings would be more appropriate to get better throughput.

Some sort of practical advice on settings might be good.

On a practical note, I increased the maximum concurrency (clone_max_concurrency) to 32, turned off auto-tuning (clone_autotune_concurrency), started cloning, and saw only 3 native_clone threads running. That seems odd. Perhaps a minimum-concurrency setting should be available so that cloning can start from a higher value?
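For reference, the experiment described above can be sketched as follows. This is a minimal sketch, not a tuning recommendation. It assumes the clone plugin is installed and, based on my reading of the reference manual, that these two variables are read on the recipient (the server that runs CLONE INSTANCE) rather than on the source. The LIKE pattern used to find clone threads is a guess at how they are named in performance_schema.threads and may need adjusting.

```sql
-- Assumption: run on the recipient before starting the clone.
-- With auto-tuning off, clone_max_concurrency is expected to be
-- the number of threads used, not just an upper bound.
SET GLOBAL clone_autotune_concurrency = OFF;
SET GLOBAL clone_max_concurrency = 32;

-- Confirm what the server actually has in effect.
SHOW GLOBAL VARIABLES LIKE 'clone%concurrency';

-- While the clone runs, count threads whose instrument name mentions
-- "clone" (the pattern is an assumption, not a documented name).
SELECT NAME, COUNT(*) AS threads
FROM performance_schema.threads
WHERE NAME LIKE '%clone%'
GROUP BY NAME;
```

If the thread count still does not match the configured value, comparing the variables on both the source and the recipient would show which side's setting is being used.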

The clone_buffer_size description is confusing: it talks about local and remote, but it seems to me that the buffer may be used on only the source or only the destination, and that increasing this value significantly might help. Clarification here would be good.

I wanted to change the configuration, so I tried to kill the native_clone threads on the source, but the destination did not notice. It looks like interrupting a clone operation is not handled: I think the source should tell the destination that it is closing the connection, rather than leaving the destination to detect this (slowly) via a potentially long network timeout.
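For comparison, the reference manual describes stopping a clone from the recipient side rather than by killing threads on the source. A minimal sketch, assuming the clone plugin is installed on the recipient and the performance_schema.clone_status table is available there; the process ID 1234 is a placeholder, not a real value:

```sql
-- Run on the recipient: find the process ID of the session
-- that is executing the clone operation.
SELECT ID, PID, STATE, BEGIN_TIME, SOURCE
FROM performance_schema.clone_status;

-- Stop the clone by killing that statement.
-- 1234 is a placeholder: substitute the PID returned above.
KILL QUERY 1234;
```

This does not address the behaviour reported here, where threads killed on the source are only noticed by the destination after a network timeout.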

Optional progress logging would be good, but the information currently provided is minimal:

On the source:
2021-03-25T14:06:25.575103Z 62 [Note] [MY-013458] [InnoDB] Stage progress: 4% completed.
2021-03-25T14:08:29.816975Z 62 [Note] [MY-013458] [InnoDB] Stage progress: 5% completed.
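With only these two lines to go on, the rate of progress has to be worked out by hand. A small sketch of that arithmetic, using the two timestamps above and assuming (optimistically) that progress within the stage is linear:

```sql
-- Seconds elapsed between the 4% and 5% log lines.
SELECT TIMESTAMPDIFF(
         MICROSECOND,
         '2021-03-25 14:06:25.575103',
         '2021-03-25 14:08:29.816975'
       ) / 1000000 AS seconds_per_percent;
```

The gap is roughly 124 seconds for one percentage point, which under the linearity assumption works out to about three and a half hours for the whole stage. The log itself gives no way to check that assumption.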

How about logging:
- number of (active) threads, since this can be changing
- total data copied so far
- number of files/tables copied so far
- indication of transfer rate so far
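Some of this information is already exposed on the recipient through the Performance Schema, though not in the error log and not on the source. A minimal sketch, assuming the clone plugin is installed on the recipient and that the size columns are in bytes and the speed columns in bytes per second, as I understand the reference manual:

```sql
-- Run on the recipient while the clone is in progress.
SELECT STAGE,
       STATE,
       THREADS,                                                -- concurrent threads in this stage
       ROUND(ESTIMATE / POW(1024, 3), 1)      AS estimate_gib, -- estimated data for the stage
       ROUND(DATA / POW(1024, 3), 1)          AS data_gib,     -- data transferred so far
       ROUND(NETWORK / POW(1024, 3), 1)       AS network_gib,  -- network bytes so far
       ROUND(DATA_SPEED / POW(1024, 2), 1)    AS data_mib_s,   -- current data rate
       ROUND(NETWORK_SPEED / POW(1024, 2), 1) AS network_mib_s -- current network rate
FROM performance_schema.clone_progress;
```

Polling this query periodically gives thread count, data copied, and transfer rate. It does not give a per-file or per-table count, and nothing equivalent is written to the log on the source.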

Is it possible to change some of the source settings while a clone operation is ongoing? This is not clear, but it might be useful, and with appropriate logging it would be easier to see the resulting changes in throughput.

Should we be looking at using most CPU cores to do the cloning if the server is dedicated to doing that?

The use of network compression may speed things up at the cost of CPU load, unless on-disk encryption is used (encrypted data compresses poorly). Some thoughts on "normal usage" and how much this can help would be interesting.

Remember that when copying a large dataset, say 1 TB or more, configuration differences can significantly improve clone times, so pointers based on your experience in different test scenarios would be interesting.
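For reference, a minimal sketch of the settings involved, assuming the clone plugin is installed and that these variables are set on the recipient before the clone starts. Whether compression helps depends on the data and on the spare CPU available, so this is a starting point for a test, not a recommendation.

```sql
-- Enable compression of data at the network layer for the clone.
SET GLOBAL clone_enable_compression = ON;

-- Check that no bandwidth cap is throttling the transfer
-- (0 is understood to mean unlimited for both variables).
SHOW GLOBAL VARIABLES LIKE 'clone_max%bandwidth';
```

Running the same clone once with compression ON and once with it OFF, and comparing the elapsed times, is the simplest way to see whether the CPU cost is worth it for a given dataset.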

Suggested fix:
Add some more details.

Hello Simon,

Thank you for the documentation enhancement request.

regards,
Umesh