Bug #110091 optimize for innodb recovery
Submitted: 16 Feb 2023 5:27    Modified: 20 Feb 2023 13:58
Reporter: alex xing (OCA)
Status: Verified    Impact on me: None
Category: MySQL Server: InnoDB storage engine    Severity: S5 (Performance)
Version: 8.0.32    OS: Any
Assigned to:    CPU Architecture: Any
Tags: Contribution

[16 Feb 2023 5:27] alex xing
Description:
For a page that is already in the buffer pool, there is no need to search the redo hash again during the apply phase of InnoDB recovery.
Current logic:
1. First check whether the page is in the buffer pool.
2. If it is in the buffer pool, search the redo hash (recv_get_rec) and apply synchronously.
3. If it is not in the buffer pool, issue a read-ahead and apply asynchronously.
My optimization point:
For step 2, we can save the recv_addr and pass it to the apply function (recv_recover_page_in_bufferpool), avoiding the second hash search.

How to repeat:
Just read the code.

Suggested fix:
We can save the recv_addr and pass it to the apply function (recv_recover_page_in_bufferpool), as in the patch below.
In fact, recv_recover_page_in_bufferpool and recv_recover_page_func have a lot in common, and much of the logic can be reused. The code I've given is not very elegant, but it illustrates the idea.
[16 Feb 2023 5:27] alex xing
A simple patch to describe the optimization.

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: optimize.patch (text/plain), 8.95 KiB.

[16 Feb 2023 5:53] MySQL Verification Team
Hello Alex Xing,

Thank you for the report and contribution.

regards,
umesh
[17 Feb 2023 9:00] Jakub Lopuszanski
Thank you for contributing to MySQL!

Can you share the results of any performance tests you have performed to measure the benefit of this optimisation?

I ask because I believe the case where the page is already in the buffer pool is very rare. In particular, our code coverage tools show this code is never executed during our test runs.
```
1121      350837 : static void recv_apply_log_rec(recv_addr_t *recv_addr) {
1122      350837 :   if (recv_addr->state == RECV_DISCARDED) {
1123        1821 :     ut_a(recv_sys->n_addrs > 0);
1124        1821 :     --recv_sys->n_addrs;
1125        1821 :     return;
1126             :   }
1127             : 
1128      349016 :   bool found;
1129      349016 :   const page_id_t page_id(recv_addr->space, recv_addr->page_no);
1130             : 
1131      349016 :   const page_size_t page_size =
1132      349016 :       fil_space_get_page_size(recv_addr->space, &found);
1133             : 
1134      698032 :   if (!found || recv_sys->missing_ids.find(recv_addr->space) !=
1135      349016 :                     recv_sys->missing_ids.end()) {
1136             :     /* Tablespace was discarded or dropped after changes were
1137             :     made to it. Or, we have ignored redo log for this tablespace
1138             :     earlier and somehow it has been found now. We can't apply
1139             :     this redo log out of order. */
1140             : 
1141           0 :     recv_addr->state = RECV_PROCESSED;
1142             : 
1143           0 :     ut_a(recv_sys->n_addrs > 0);
1144           0 :     --recv_sys->n_addrs;
1145             : 
1146             :     /* If the tablespace has been explicitly deleted, we
1147             :     can safely ignore it. */
1148             : 
1149           0 :     if (recv_sys->deleted.find(recv_addr->space) == recv_sys->deleted.end()) {
1150           0 :       recv_sys->missing_ids.insert(recv_addr->space);
1151             :     }
1152             : 
1153      349016 :   } else if (recv_addr->state == RECV_NOT_PROCESSED) {
1154       51470 :     mutex_exit(&recv_sys->mutex);
1155             : 
1156       51470 :     if (buf_page_peek(page_id)) {
1157           0 :       mtr_t mtr;
1158             : 
1159           0 :       mtr_start(&mtr);
1160             : 
1161           0 :       buf_block_t *block;
1162             : 
1163           0 :       block =
1164           0 :           buf_page_get(page_id, page_size, RW_X_LATCH, UT_LOCATION_HERE, &mtr);
1165             : 
1166           0 :       buf_block_dbg_add_level(block, SYNC_NO_ORDER_CHECK);
1167             : 
1168           0 :       recv_recover_page(false, block);
1169             : 
1170           0 :       mtr_commit(&mtr);
1171             : 
1172           0 :     } else {
1173       51470 :       recv_read_in_area(page_id);
1174             :     }
1175             : 
1176       51470 :     mutex_enter(&recv_sys->mutex);
1177             :   }
1178             : }
```
This intuitively makes sense to me (and I even wanted to remove this code some time ago).
The redo log records are processed in batches, and before each batch the BP is empty. 
Processing of a batch starts by iterating over the page_id -> records map, and trying to process each page_id.
If the page is not in the BP, we use `recv_read_in_area` to issue asynchronous reads for this page and for neighbouring pages which also have changes to be applied.
`recv_read_in_area` also adds a "placeholder" for each such page to the BP and changes the recv_addr->state for these pages to `RECV_BEING_READ`.
This means that even though `buf_page_peek()` would return true for such a neighbour, the `if (recv_addr->state == RECV_NOT_PROCESSED)` branch will not be taken for these neighbouring pages:
they will either still be `RECV_BEING_READ`, or, once they are read in, `buf_page_io_complete()` run by the io thread will call `recv_recover_page(true, (buf_block_t *)bpage)`, which fully recovers the page and moves its state to RECV_BEING_PROCESSED and finally RECV_PROCESSED.

As you can see, my current mental model is that this code cannot be reached at all, so if you can show a test which demonstrates that it can be executed, I'd be grateful.
Even more so, if you can show a case in which it becomes performance bottleneck.
[20 Feb 2023 13:58] alex xing
Thanks for your reply!
First, how do I download the code coverage tools you mentioned? I'd like to study them.
Second, if a page has been read ahead but redo apply hasn't started yet, the patch might help, although as you said this scenario is rare.
I haven't run crash recovery tests yet, but I have been working on redo replication lately, and there this optimization is really useful.