Bug #111755 potential risk of innodb recovery
Submitted: 14 Jul 2023 5:58    Modified: 14 Jul 2023 7:16
Reporter: alex xing (OCA)
Status: Verified
Impact on me: None
Category: MySQL Server: InnoDB storage engine    Severity: S3 (Non-critical)
Version: 8.0.33    OS: Any
Assigned to:    CPU Architecture: Any
Tags: Contribution

[14 Jul 2023 5:58] alex xing
Description:
I have found a potential risk that InnoDB recovery can run out of memory, which could cause recovery to fail.

I can't reproduce this problem with the 8.0.33 code, so let me analyze it theoretically.
Suppose a mysqld instance with innodb_buffer_pool_instances=1 is performing InnoDB recovery.

1. In recv_recovery_begin, we calculate max_mem; recv_n_pool_free_frames (256) pages per buffer pool instance are reserved here:
  ulint max_mem =
      UNIV_PAGE_SIZE * (buf_pool_get_n_pages() -
                        (recv_n_pool_free_frames * srv_buf_pool_instances));

2. When the memory occupied by redo exceeds max_memory, we start to apply:
    if (recv_heap_used() > max_memory) {
      recv_apply_hashed_log_recs(log, false);
    }

   Since 64k of redo is read at a time, assume it occupies 33 times that size once parsed into memory, i.e. 64k*33.
   With 16k pages that is (64/16)*33 = 132 frames, so the maximum number of pages that can be held in the buffer pool at this point is: 256 - 132 = 124.

3. The read-ahead logic during apply: up to n_stored (suppose 32) prereads are initiated; the AIO worker threads are woken up only while n_pend_reads >= recv_n_pool_free_frames/2 (128), or after all prereads have been issued:

buf_read_recv_pages() {
  for (ulint i = 0; i < n_stored; i++) {
    while (buf_pool->n_pend_reads >= recv_n_pool_free_frames / 2) {
      os_aio_simulated_wake_handler_threads();
      std::this_thread::sleep_for(std::chrono::milliseconds(10));

      count++;

      if (!(count % 1000)) {
        ib::error(ER_IB_MSG_145)
            << "Waited for " << count / 100 << " seconds for "
            << buf_pool->n_pend_reads << " pending reads";
      }
    }

    buf_read_page_low();
  }

  os_aio_simulated_wake_handler_threads();
}

4. 
4.1:
Suppose 93 pages are being applied (buf_pool->n_pend_reads==93), and applying them requires additional pages, so they try to get a free page:
buf_page_io_complete-->recv_recover_page-->recv_parse_or_apply_log_rec_body-->btr_parse_page_reorganize-->btr_page_reorganize_block-->btr_page_reorganize_low-->buf_block_alloc-->buf_LRU_get_free_block
buf_page_io_complete-->ibuf_merge_or_delete_for_page-->ibuf_bitmap_get_map_page-->buf_page_get_gen-->Buf_fetch<T>::single_page-->buf_read_page_low-->buf_page_init_for_read-->buf_LRU_get_free_block 

4.2:
At this point, the AIO worker threads go to sleep and the next batch of read-ahead begins. Suppose 31 prereads have been initiated and a problem occurs on the 32nd: at this time buf_pool->n_pend_reads==124 (93+31), which is still below the wake-up threshold of 128, so os_aio_simulated_wake_handler_threads is not executed, and there is no free page for buf_read_page_low.

From this point on, mysqld keeps trying to get a free page forever.

How to repeat:
I can't reproduce this problem with the 8.0.33 code, but if RECV_SCAN_SIZE is adjusted to 20M or larger, the problem can be repeated easily.

Suggested fix:
Execute os_aio_simulated_wake_handler_threads() in buf_LRU_get_free_block, as in the patch below.
[14 Jul 2023 6:04] alex xing
a simple patch to fix the bug

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: innodb_recovery.patch (text/plain), 723 bytes.

[14 Jul 2023 7:16] MySQL Verification Team
Hello alex xing,

Thank you very much for your patch contribution, we appreciate it!

regards,
Umesh