Bug #90890 CPU overhead for insert benchmark load increased by 2X in 8.0.11
Submitted: 16 May 15:49 Modified: 4 Sep 13:08
Reporter: Mark Callaghan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:8.0.11 OS:Any
Assigned to: CPU Architecture:Any

[16 May 15:49] Mark Callaghan
Description:

Compared to earlier releases InnoDB in MySQL 8.0.11 uses twice as much CPU to do the single-threaded insert benchmark. This is from a database small enough to fit in the InnoDB buffer pool.

From the results below, check out Mcpu/i for 8.0.11 vs previous releases.

ips = inserts/second during the load
Mcpu/i = normalized CPU/insert = (cpu / ips) * 1M
cpu = average for vmstat us+sy columns

ips     Mcpu/i  cpu     release
34364   1086    37.3    5.6.35  
30395   1175    35.7    5.7.17
27855   1255    34.9    8.0.3
23256   2480    57.7    8.0.11

How to repeat:
My test server uses Ubuntu 16.04 and is the core i5 NUC explained at http://smalldatum.blogspot.com/2017/05/small-servers-for-database-performance.html

The test is the load from the insert benchmark. The insert benchmark is explained at http://smalldatum.blogspot.com/2017/06/the-insert-benchmark.html

The load is done with 10m rows and a command line is:
iibench.py --db_name=ib --rows_per_report=100000 --db_host=127.0.0.1 --db_user=root --db_password=pw --engine=innodb --engine_options= --unique_checks=1 --bulk_load=0 --max_rows=10000000 --table_name=pi1 --setup --num_secondary_indexes=3 --data_length_min=10 --data_length_max=20 --rows_per_commit=100 --inserts_per_second=0 --query_threads=0 --seed=1526401985

iibench.py is at https://github.com/mdcallag/mytools/tree/master/bench/ibench

my.cnf:
[mysqld]
sql_mode=no_engine_substitution,no_unsigned_subtraction
default_tmp_storage_engine=MyISAM
tmpdir=/data/m/my
#query_cache_size=0
#query_cache_type=0
# slow_query_log_file=/data/m/my/slow.log
long_query_time=5
# skip_log_bin
log_bin=/data/m/my/binlogs/bl
max_binlog_size=128M
sync_binlog=0
server_id=77
binlog_format=ROW
datadir=/data/m/my/data

innodb_max_dirty_pages_pct=75
innodb_adaptive_flushing=1
innodb_file_per_table=1
innodb_flush_log_at_trx_commit=2
innodb_flush_method=O_DIRECT
innodb_doublewrite=1
innodb_page_size=16K
innodb_buffer_pool_size=12G

innodb_flush_neighbors=0
innodb_io_capacity=10000
innodb_io_capacity_max=20000
innodb_lru_scan_depth=5000
#loose_innodb_buffer_pool_instances=1

innodb_data_home_dir=/data/m/my/data
innodb_data_file_path=ibdata1:512M:autoextend
innodb_log_group_home_dir=/data/m/my/txlogs
innodb_log_file_size=1G
slow_query_log=1

innodb_purge_threads=1
performance_schema=1
ssl=0
default_authentication_plugin=mysql_native_password

character_set_server=latin1
collation_server=latin1_swedish_ci

[mysqld_safe]
malloc-lib=/usr/lib/x86_64-linux-gnu/libjemalloc.so
[16 May 15:50] Mark Callaghan
From perf, top-N functions from flat profile for 8.0.11

    26.15%  mysqld   mysqld               [.] log_writer
     4.90%  mysqld   mysqld               [.] log_advance_ready_for_write_lsn
     4.40%  mysqld   mysqld               [.] rec_get_offsets_func
     2.64%  mysqld   mysqld               [.] MYSQLparse
     2.32%  mysqld   mysqld               [.] rec_init_offsets
     1.91%  mysqld   mysqld               [.] ut_crc32_hw
     1.90%  mysqld   mysqld               [.] log_writer_write_buffer
     1.79%  mysqld   mysqld               [.] page_cur_search_with_match_bytes
     1.68%  mysqld   libc-2.23.so         [.] __memcpy_avx_unaligned
     1.68%  mysqld   mysqld               [.] btr_cur_search_to_nth_level
     1.51%  mysqld   mysqld               [.] cmp_whole_field
     1.48%  mysqld   mysqld               [.] mem_heap_create_block_func
     1.39%  mysqld   mysqld               [.] Sql_cmd_dml::execute
     1.35%  mysqld   mysqld               [.] page_cur_search_with_match
     1.23%  mysqld   mysqld               [.] ut_delay
     1.18%  mysqld   mysqld               [.] btr_search_check_free_space_in_heap
     1.17%  mysqld   mysqld               [.] buf_page_get_gen
     1.01%  mysqld   mysqld               [.] buf_page_hash_get_low

And the same from 8.0.3

     6.99%  mysqld   mysqld               [.] rec_get_offsets_func
     6.52%  mysqld   mysqld               [.] MYSQLparse
     4.10%  mysqld   mysqld               [.] page_cur_search_with_match
     3.57%  mysqld   mysqld               [.] page_cur_search_with_match_bytes
     3.52%  mysqld   mysqld               [.] rec_init_offsets
     3.05%  mysqld   mysqld               [.] buf_page_get_gen
     2.97%  mysqld   mysqld               [.] btr_cur_search_to_nth_level
     2.88%  mysqld   mysqld               [.] buf_page_hash_get_low
     2.69%  mysqld   libc-2.23.so         [.] __memcpy_avx_unaligned
     2.22%  mysqld   [unknown]            [.] 0000000000000000
     1.98%  mysqld   mysqld               [.] cmp_dtuple_rec_with_match_low
     1.87%  mysqld   mysqld               [.] ut_crc32_hw
     1.75%  mysqld   mysqld               [.] ut_delay
     1.62%  mysqld   libjemalloc.so.1     [.] malloc
     1.53%  mysqld   mysqld               [.] Persister::write_log
     1.49%  mysqld   mysqld               [.] lock_rec_get_first
     1.42%  mysqld   mysqld               [.] cmp_dtuple_rec_with_match_bytes
     1.40%  mysqld   libjemalloc.so.1     [.] free
     1.30%  mysqld   mysqld               [.] page_cur_insert_rec_write_log
     1.28%  mysqld   mysqld               [.] btr_search_guess_on_hash
     1.17%  mysqld   mysqld               [.] page_cur_insert_rec_low
[16 May 15:53] Mark Callaghan
Hierarchical for 8.0.11 doesn't add much value...

    27.13%  mysqld   mysqld               [.] log_writer
            |
            ---log_writer
               std::thread::_Impl<std::_Bind_simple<Runnable (void (*)(log_t*), log_t*)> >::_M_run
               0xb8c80

     5.72%  mysqld   mysqld               [.] log_advance_ready_for_write_lsn
            |
            ---log_advance_ready_for_write_lsn
               |
                --5.34%--log_writer
                          std::thread::_Impl<std::_Bind_simple<Runnable (void (*)(log_t*), log_t*)> >::_M_run
                          0xb8c80
[16 May 15:59] Mark Callaghan
Not a new problem but the single-threaded insert rate continues to drop from 5.6.35 to 5.7.17 to 8.0.3 to 8.0.11
[18 May 16:50] Mark Callaghan
The spin is 400us by default, as long as CPU utilization is between the low and high water marks. Using the default values it is easy to be between these bounds and then spin for low concurrency workloads. But for workloads with more concurrency it is unlikely to be between the bounds using the default values.

I assume one fix will be to learn the average response time for commit to know whether spinning is a good idea. The defaults are:
innodb_log_spin_cpu_abs_lwm     80
innodb_log_spin_cpu_pct_hwm     50
innodb_log_wait_for_flush_spin_hwm      400

Explained at
https://dev.mysql.com/doc/refman/8.0/en/innodb-parameters.html#sysvar_innodb_log_wait_for_...
[21 May 12:27] Sinisa Milivojevic
Hi,

Thank you very much for your detailed report. 

I have tried my own variant of INSERT benchmark and I have got similar results for 8.0.3 and 8.0.11.

Verified as reported.
[31 Aug 13:24] Pawel Olchawa
Posted by developer:
 
We split the problem and solved the CPU overhead as fix for this bug report.

The performance regression will be solved in a new separate bug report.
[4 Sep 13:08] Daniel Price
Posted by developer:
 
commit 419819b956c7751b967bfe5a5decb1090b10b381
Author: PaweĊ‚ Olchawa <pawel.olchawa@oracle.com>
Date:   Fri Aug 31 10:54:04 2018 +0200

    BUG#28062382 CPU OVERHEAD FOR INSERT BENCHMARK LOAD INCREASED BY 2X IN 8.0.11
    
    This patch introduces following mechanism:
    
    1. Whenever we request write or flush of redo during commit of
       transaction, we increment counter of requests.
    
    2. Log writer thread watches the counter and computes average time between
       consecutive requests to write or flush redo log.
    
    3. When the average time between requests to write or flush redo, is higher
       than 100us, we do not use spin-delay in log threads and we wait directly
       on the event with initial timeout equal to 10ms.
    
    The performance regression, which was also revealed in the bug report,
    will be fixed in a new seperate bug report. That's because that solution
    is more complex, we have it ready but prefer to review it carefully and
    push in next window.
[4 Sep 13:25] Daniel Price
Posted by developer:
 
Fixed as of the upcoming 8.0.14 release, and here's the changelog entry:

A counter was added for redo log write and flush requests that occur
during transaction commit. The counter is used by the log writer thread to
compute the average time between consecutive requests. When the average
time is higher than 100 microseconds, log writer threads do not use spin
delay and instead wait on request events with a 10 microsecond timeout
limit.