Bug #110578 optimize for prefetch trx sys page
Submitted: 31 Mar 2023 12:43 Modified: 31 Mar 2023 13:09
Reporter: alex xing (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[31 Mar 2023 12:43] alex xing
Description:
The native logic is that trx->id is persisted in page_id_t(TRX_SYS_SPACE, TRX_SYS_PAGE_NO) for every 256 incremented.

This process is operated within trx_sys->mutex. If the page is not in buffpool, it needs to be read from disk, which extended locking time.

So we can anticipate this behavior and read the required page in advance, reducing the locking time.

How to repeat:
just read the code

Suggested fix:
prefetch page_id_t(TRX_SYS_SPACE, TRX_SYS_PAGE_NO) , just as the below patch
[31 Mar 2023 12:44] alex xing
a simple patch to describe the optimization

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: prefetch_optimize.patch (text/plain), 2.97 KiB.

[31 Mar 2023 13:09] MySQL Verification Team
Hi Mr. xing,

Thank you very much for your performance improvement request.

We agree with your analysis fully.

We also like the patch that you have provided for us. We thank you very much for it ...

This report is now verified.
[4 Apr 2023 9:54] Marcin Babij
Hello Alex.
Thank you for the report and patch.
We are trying to find a workload during which it would help.
Either there are some write transactions, and this page is hot and always in the head of LRU and will not be evicted.
Or the load is a massive amount of readonly queries with only a transaction or so per second that would be a write and a very small BufferPool, so it is fully evicted before 256 write transactions happen.
Even in the second case, the latency of one write transactions per 256 being just a little higher is insignificant.

On the other hand, every single transaction, and possibly in multiple places during each, we would be checking for the page presence in the BP, getting and holding S-latch on the PageHash, which will incur both CPU, cache sharing and latency pressure on all loads with higher amount of write transactions.

Please explain why and when this patch is beneficial.