Bug #95390 wake up log_closer background thread in log_wait_for_space_in_log_recent_closed
Submitted: 16 May 9:57 Modified: 16 May 13:20
Reporter: Fungo Wang Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:8.0.16 OS:Any
Assigned to: CPU Architecture:Any

[16 May 9:57] Fungo Wang
Description:
In the new redo log design in 8.0, there are several background threads for different purpose.

log_closer thread is one of them and used to advance dpa_lsn (log_buffer_dirty_pages_added_up_to_lsn()) in log.recent_closed link buffer.

log_closer thread wait on log.closer_event event if there is nothing to advance, but no body wake up this event.

mtr_commit() works as below:
mtr_t::Command::execute()
  - log_buffer_reserve
  - m_impl->m_log.for_each_block(write_log);
  - log_wait_for_space_in_log_recent_closed()
  - add_dirty_blocks_to_flush_list()
  - log_buffer_close()

Before mtr_commit() add dirty pages to flush list, it need to make sure there is enough space in recent_closed link buffer by calling log_wait_for_space_in_log_recent_closed(), it will wait until satisfied.

void log_wait_for_space_in_log_recent_closed(log_t &log, lsn_t lsn) {
  ut_a(log_lsn_validate(lsn));

  ut_ad(lsn >= log_buffer_dirty_pages_added_up_to_lsn(log));

  uint64_t wait_loops = 0;

  while (!log.recent_closed.has_space(lsn)) {
    ++wait_loops;
    os_thread_sleep(20);
  }

  if (unlikely(wait_loops != 0)) {
    MONITOR_INC_VALUE(MONITOR_LOG_ON_RECENT_CLOSED_WAIT_LOOPS, wait_loops);
  }
}

When mtr_commit is waiting in log_wait_for_space_in_log_recent_closed(), the log_closer thread could also be in waiting state, until timeout is reached.

So I think it's better to proactively wake up log_closer in log_wait_for_space_in_log_recent_closed(), and get performance improvement.

How to repeat:
I did a simple performance test, and observed improvement after waking up.

Wake up by using bellow change:

diff --git a/storage/innobase/log/log0buf.cc b/storage/innobase/log/log0buf.cc
index 371596f3055..e1935e97b82 100644
--- a/storage/innobase/log/log0buf.cc
+++ b/storage/innobase/log/log0buf.cc
@@ -938,6 +938,7 @@ void log_wait_for_space_in_log_recent_closed(log_t &log, lsn_t lsn) {
   uint64_t wait_loops = 0;

   while (!log.recent_closed.has_space(lsn)) {
+    os_event_set(log.closer_event);
     ++wait_loops;
     os_thread_sleep(20);
   }

The performance test is did using a relative small innodb_log_recent_closed_size (16384) configuration. To be able to change this sysvar, ENABLE_EXPERIMENT_SYSVARS should be enabled when buidling mysqld from source. 

I run with sysbench 1.0 using these parameters:
./src/sysbench oltp_insert --threads=64 --report-interval=1 --time=300 --tables=8 --table_size=200000 run

and mysqld with:
innodb_log_recent_closed_size = 16384

During performance test, enable recent_closed_wait innodb metric by `set global innodb_monitor_enable = 'log_on_recent_closed_wait_loops';`
and use `SELECT * FROM INFORMATION_SCHEMA.INNODB_METRICS where name = 'log_on_recent_closed_wait_loops'\G` to see the counter status.

Orginal 
===========

*************************** 1. row mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_METRICS where name = 'log_on_recent_closed_wait_loops'\G
*************************** 1. row ***************************
           NAME: log_on_recent_closed_wait_loops
      SUBSYSTEM: log
          COUNT: 6473910
      MAX_COUNT: 6473910
      MIN_COUNT: NULL
      AVG_COUNT: 107898.5
    COUNT_RESET: 6473910
MAX_COUNT_RESET: 6473910
MIN_COUNT_RESET: NULL
AVG_COUNT_RESET: NULL
   TIME_ENABLED: 2019-05-16 17:05:06
  TIME_DISABLED: NULL
   TIME_ELAPSED: 60
     TIME_RESET: NULL
         STATUS: enabled
           TYPE: counter
        COMMENT: Loops of waits in user threads on space in log.recent_closed
1 row in set (0.00 sec)

sysbench output
[ 102s ] thds: 64 tps: 25380.71 qps: 25380.71 (r/w/o: 0.00/25380.71/0.00) lat (ms,95%): 4.33 err/s: 0.00 reconn/s: 0.00
[ 103s ] thds: 64 tps: 25441.14 qps: 25441.14 (r/w/o: 0.00/25441.14/0.00) lat (ms,95%): 4.33 err/s: 0.00 reconn/s: 0.00
[ 104s ] thds: 64 tps: 25466.92 qps: 25466.92 (r/w/o: 0.00/25466.92/0.00) lat (ms,95%): 4.33 err/s: 0.00 reconn/s: 0.00
[ 105s ] thds: 64 tps: 25175.02 qps: 25176.02 (r/w/o: 0.00/25176.02/0.00) lat (ms,95%): 4.41 err/s: 0.00 reconn/s: 0.00
[ 106s ] thds: 64 tps: 25465.28 qps: 25464.28 (r/w/o: 0.00/25464.28/0.00) lat (ms,95%): 4.33 err/s: 0.00 reconn/s: 0.00
[ 107s ] thds: 64 tps: 25503.30 qps: 25503.30 (r/w/o: 0.00/25503.30/0.00) lat (ms,95%): 4.33 err/s: 0.00 reconn/s: 0.00
[ 108s ] thds: 64 tps: 25373.74 qps: 25373.74 (r/w/o: 0.00/25373.74/0.00) lat (ms,95%): 4.33 err/s: 0.00 reconn/s: 0.00
[ 109s ] thds: 64 tps: 25211.54 qps: 25211.54 (r/w/o: 0.00/25211.54/0.00) lat (ms,95%): 4.33 err/s: 0.00 reconn/s: 0.00

After wake up
=============
mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_METRICS where name = 'log_on_recent_closed_wait_loops'\G
*************************** 1. row ***************************
           NAME: log_on_recent_closed_wait_loops
      SUBSYSTEM: log
          COUNT: 1866566
      MAX_COUNT: 1866566
      MIN_COUNT: NULL
      AVG_COUNT: 31109.433333333334
    COUNT_RESET: 1866566
MAX_COUNT_RESET: 1866566
MIN_COUNT_RESET: NULL
AVG_COUNT_RESET: NULL
   TIME_ENABLED: 2019-05-16 17:16:09
  TIME_DISABLED: NULL
   TIME_ELAPSED: 60
     TIME_RESET: NULL
         STATUS: enabled
           TYPE: counter
        COMMENT: Loops of waits in user threads on space in log.recent_closed
1 row in set (0.00 sec)

sysbench output
[ 117s ] thds: 64 tps: 51308.19 qps: 51308.19 (r/w/o: 0.00/51308.19/0.00) lat (ms,95%): 2.48 err/s: 0.00 reconn/s: 0.00
[ 118s ] thds: 64 tps: 49021.06 qps: 49021.06 (r/w/o: 0.00/49021.06/0.00) lat (ms,95%): 2.52 err/s: 0.00 reconn/s: 0.00
[ 119s ] thds: 64 tps: 47968.44 qps: 47968.44 (r/w/o: 0.00/47968.44/0.00) lat (ms,95%): 2.61 err/s: 0.00 reconn/s: 0.00
[ 120s ] thds: 64 tps: 48050.19 qps: 48050.19 (r/w/o: 0.00/48050.19/0.00) lat (ms,95%): 2.61 err/s: 0.00 reconn/s: 0.00
[ 121s ] thds: 64 tps: 48635.23 qps: 48635.23 (r/w/o: 0.00/48635.23/0.00) lat (ms,95%): 2.52 err/s: 0.00 reconn/s: 0.00
[ 122s ] thds: 64 tps: 49118.43 qps: 49118.43 (r/w/o: 0.00/49118.43/0.00) lat (ms,95%): 2.52 err/s: 0.00 reconn/s: 0.00
[ 123s ] thds: 64 tps: 48318.13 qps: 48318.13 (r/w/o: 0.00/48318.13/0.00) lat (ms,95%): 2.57 err/s: 0.00 reconn/s: 0.00
[ 124s ] thds: 64 tps: 47472.79 qps: 47472.79 (r/w/o: 0.00/47472.79/0.00) lat (ms,95%): 2.81 err/s: 0.00 reconn/s: 0.00

We can see the wait counter is reduced and TPS is improved (almost doubled).

The reason to use a small innodb_log_recent_closed_size (16384), is the default value 2M is big enough, and I didn't observe any log_on_recent_closed_wait_loops in simple test.

Suggested fix:
Wake up log closer in log_wait_for_space_in_log_recent_closed()

diff --git a/storage/innobase/log/log0buf.cc b/storage/innobase/log/log0buf.cc
index 371596f3055..e1935e97b82 100644
--- a/storage/innobase/log/log0buf.cc
+++ b/storage/innobase/log/log0buf.cc
@@ -938,6 +938,7 @@ void log_wait_for_space_in_log_recent_closed(log_t &log, lsn_t lsn) {
   uint64_t wait_loops = 0;

   while (!log.recent_closed.has_space(lsn)) {
+    os_event_set(log.closer_event);
     ++wait_loops;
     os_thread_sleep(20);
   }
[16 May 12:17] Sinisa Milivojevic
HI,

Thank you for your bug report, dealing with the performance of the InnoDB SE.

I have analysed thoroughly and based on the code analysis alone, I must admit that I agree with you.

Verified as reported.
[16 May 13:20] Fungo Wang
Hi Sinisa,
Thanks for the verification.

After show this perf case to my colleague Weixiang, he suggest to add a check for whether log.closer_event is already set by others, instead of blindly waking up every time, to reduce the system spin lock cause by mutex.

diff --git a/storage/innobase/log/log0buf.cc b/storage/innobase/log/log0buf.cc
index 371596f3055..26b5c4fd85c 100644
--- a/storage/innobase/log/log0buf.cc
+++ b/storage/innobase/log/log0buf.cc
@@ -938,6 +938,9 @@ void log_wait_for_space_in_log_recent_closed(log_t &log, lsn_t lsn) {
   uint64_t wait_loops = 0;

   while (!log.recent_closed.has_space(lsn)) {
+    if (!os_event_is_set(log.closer_event)) {
+      os_event_set(log.closer_event);
+    }
     ++wait_loops;
     os_thread_sleep(20);
   }

The perf result exactly shows spin_lock is reduced, and log_wait_for_space_in_log_recent_closed has gone away from the trace.

With patch 1
=========

+  41.56%  mysqld  mysqld                [.] ut_delay(unsigned long)
-   3.57%  mysqld  [kernel.kallsyms]     [k] _spin_lock
   - _spin_lock
      - 43.79% futex_wait_setup
           futex_wait
           do_futex
           sys_futex
         - system_call_fastpath
            - 88.38% __lll_lock_wait
               + 44.06% Stage_manager::enroll_for(Stage_manager::StageID, THD*, mysql_mutex_t*)
               + 34.57% Gtid_state::update_gtids_impl(THD*, bool)
               + 6.26% log_wait_for_space_in_log_recent_closed(log_t&, unsigned long)
               + 6.03% MYSQL_BIN_LOG::change_stage(THD*, Stage_manager::StageID, THD*, mysql_mutex_t*, mysql_mutex_t*)
               + 3.85% sync_array_reserve_cell(sync_array_t*, void*, unsigned long, char const*, unsigned long)
               + 2.54% MYSQL_BIN_LOG::ordered_commit(THD*, bool, bool)
               + 1.15% MYSQL_BIN_LOG::assign_automatic_gtids_to_flush_group(THD*)
               + 0.62% memo_slot_release(mtr_memo_slot_t*)
               + 0.51% sync_array_wait_event(sync_array_t*, sync_cell_t*&)
            + 11.05% pthread_cond_wait@@GLIBC_2.3.2
            + 0.57% pthread_cond_timedwait@@GLIBC_2.3.2
      + 29.37% futex_wake
      + 10.26% task_rq_lock
      + 4.68% try_to_wake_up
      + 3.15% double_lock_balance
      + 2.38% futex_requeue
      + 1.54% start_this_handle
      + 1.29% thread_return
      + 0.56% mod_timer

With patch 2
=========

+  42.96%  mysqld  mysqld                [.] ut_delay(unsigned long)
-   3.25%  mysqld  [kernel.kallsyms]     [k] _spin_lock
   - _spin_lock
      - 38.74% futex_wait_setup
           futex_wait
           do_futex
           sys_futex
         - system_call_fastpath
            - 64.95% __lll_lock_wait
               + 42.58% Gtid_state::update_gtids_impl(THD*, bool)
               + 36.96% Stage_manager::enroll_for(Stage_manager::StageID, THD*, mysql_mutex_t*)
               + 7.68% MYSQL_BIN_LOG::change_stage(THD*, Stage_manager::StageID, THD*, mysql_mutex_t*, mysql_mutex_t*)
               + 6.66% MYSQL_BIN_LOG::ordered_commit(THD*, bool, bool)
               + 2.37% memo_slot_release(mtr_memo_slot_t*)
               + 1.56% sync_array_wait_event(sync_array_t*, sync_cell_t*&)
               + 1.10% MYSQL_BIN_LOG::assign_automatic_gtids_to_flush_group(THD*)
            + 34.36% pthread_cond_wait@@GLIBC_2.3.2
            + 0.69% pthread_cond_timedwait@@GLIBC_2.3.2
      + 32.44% futex_wake
      + 11.13% task_rq_lock
      + 5.29% try_to_wake_up
      + 2.87% double_lock_balance
      + 2.86% futex_requeue
      + 1.53% start_this_handle
      + 1.36% thread_return
      + 0.74% mod_timer
[16 May 13:25] Sinisa Milivojevic
Thank you for the additional code changes.