Bug #95122 No need to get log_sys->log_flush_order_mutex during InnoDB crash recovery
Submitted: 25 Apr 2019 7:03 Modified: 25 Apr 2019 13:10
Reporter: Fungo Wang (OCA) Email Updates:
Status: Verified Impact on me:
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.6, 5.7 OS:Any
Assigned to: CPU Architecture:Any
Tags: crash recovery, innodb, log_flush_order_mutex, mutex, oldest_modification

[25 Apr 2019 7:03] Fungo Wang
log_sys->mutex and log_sys->log_flush_order_mutex are normally used in this way:

  -> mutex_enter(log_sys->mutex)
  -> generate lsns
  -> mutex_enter(log_sys->log_flush_order_mutex)
  -> mutex_exit(log_sys->mutex)
  -> mark page dirty and add to flush list
  -> mutex_exit(log_sys->log_flush_order_mutex)

I think the purpose of log_sys->log_flush_order_mutex is to make sure
1. dirty pages are added to flush_list in the order of oldest_modification
2. 1 still can be guaranteed when log_sys->mutex is released during mtr_commit

This can guarantee the oldest dirty page always stay at the tail of flush_list and
contains the oldest_modification lsn, the new dirty pages are always added to the 
head of flush_list.

To guarantee these above, I think there is a still necessary implicit requirement, 
lsn is generated in monotonously increasing manner, it can never go back, and this is
true in normal case.

But in crash recovery scenario, the redo log records are stored in hash table by 
key (page_id, page_no), and then applied page by page. So in this way, we can 
not guarantee dirty pages are ordered by oldest_modification if we just add the
new dirty pages to the head of flush_list, because there could be new dirty pages 
with oldest_modification smaller than current biggest oldest_modification or event 
smallest oldest_modification. And flush_rbt is introduced to guarantee order in 
flush_list in crash recovery. 

So in crash recovery, log_sys->log_flush_order_mutex seems useless, it can not 
guarantee anything.

Check function recv_recover_page_func()
1757#ifndef UNIV_HOTBACKUP
1758        if (modification_to_page) {
1759                ut_a(block);
1761                log_flush_order_mutex_enter();
1762                buf_flush_recv_note_modification(block, start_lsn, end_lsn);
1763                log_flush_order_mutex_exit();
1764        }
1765#endif /* !UNIV_HOTBACKUP */

Maybe we can remove the log_flush_order_mutex_enter() here?

How to repeat:
Read the code.

Suggested fix:
Remove log_flush_order_mutex_enter() in recv_recover_page_func().
[25 Apr 2019 13:10] MySQL Verification Team

Thank you for your bug report and code analysis.

I fully agree with your conclusions.

Verified as reported.