Bug #114331 MySQL get high insert latency in low concurrency
Submitted: 12 Mar 2024 21:53 Modified: 20 Sep 2024 8:18
Reporter: Zongzhi Chen (OCA) Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:8.0.*, 8.0.36 OS:Linux
Assigned to: CPU Architecture:x86
Tags: performance, REDO Log

[12 Mar 2024 21:53] Zongzhi Chen
Description:
Our user workload like sysbench oltp_insert.
in low concurrency workload there is only one user insert thread, the average insert latency is about 10% higher than high concurrency workload.

after research in-depth, I find the default strategy in redo log cause the high latency.

in the function log_wait_for_write and log_wait_for_flush

```
  if (srv_cpu_usage.utime_abs < srv_log_spin_cpu_abs_lwm || cpu >= hwm) {
    /* Don't spin because either cpu usage is too high or it's
    almost idle so no reason to bother. */
    max_spins = 0;
  }
```

If the cpu is almost idle, then user thread don't spin and direct go to pthread_cond_wait. Then it may need some time to wake up the user thread, this cause the high latency.

I can agree that if cpu usage is too high, then user thread should not spin to waste cpu. However, if the cpu is almost idle, why the user thread should not spin? This may cause simple one thread insert workload get much higher latency and lower QPS

running in low concurrency set threads = 1
sysbench oltp_insert --mysql-host=xxx.xxx.xxx.xx --mysql-port=2255 --mysql-password=xxxxxxxx --mysql-user=replicator --tables=1 --table_size=100 --threads=1 --time=12000 --report-interval=1 --rand-type=uniform --mysql-ignore-errors=all run

[ 2s ] thds: 1 tps: 10071.12 qps: 10071.12 (r/w/o: 0.00/10071.12/0.00) lat (ms,95%): 0.14 err/s: 0.00 reconn/s: 0.00
[ 3s ] thds: 1 tps: 10073.05 qps: 10073.05 (r/w/o: 0.00/10073.05/0.00) lat (ms,95%): 0.14 err/s: 0.00 reconn/s: 0.00
[ 4s ] thds: 1 tps: 10072.00 qps: 10072.00 (r/w/o: 0.00/10072.00/0.00) lat (ms,95%): 0.14 err/s: 0.00 reconn/s: 0.00
[ 5s ] thds: 1 tps: 10087.01 qps: 10087.01 (r/w/o: 0.00/10087.01/0.00) lat (ms,95%): 0.14 err/s: 0.00 reconn/s: 0.00
[ 6s ] thds: 1 tps: 10065.03 qps: 10065.03 (r/w/o: 0.00/10065.03/0.00) lat (ms,95%): 0.14 err/s: 0.00 reconn/s: 0.00
[ 7s ] thds: 1 tps: 10095.01 qps: 10095.01 (r/w/o: 0.00/10095.01/0.00) lat (ms,95%): 0.14 err/s: 0.00 reconn/s: 0.00
[ 8s ] thds: 1 tps: 10131.98 qps: 10131.98 (r/w/o: 0.00/10131.98/0.00) lat (ms,95%): 0.14 err/s: 0.00 reconn/s: 0.00
[ 9s ] thds: 1 tps: 10132.91 qps: 10132.91 (r/w/o: 0.00/10132.91/0.00) lat (ms,95%): 0.14 err/s: 0.00 reconn/s: 0.00

sysbench oltp_insert --mysql-host=xxx.xxx.xxx.xx --mysql-port=2255 --mysql-password=xxxxxxxx --mysql-user=replicator --tables=1 --table_size=100 --threads=2 --time=12000 --report-interval=1 --rand-type=uniform --mysql-ignore-errors=all run

[ 2s ] thds: 2 tps: 22485.40 qps: 22485.40 (r/w/o: 0.00/22485.40/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[ 3s ] thds: 2 tps: 22592.16 qps: 22592.16 (r/w/o: 0.00/22592.16/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[ 4s ] thds: 2 tps: 22625.03 qps: 22625.03 (r/w/o: 0.00/22625.03/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[ 5s ] thds: 2 tps: 22588.90 qps: 22588.90 (r/w/o: 0.00/22588.90/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[ 6s ] thds: 2 tps: 22149.07 qps: 22149.07 (r/w/o: 0.00/22149.07/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[ 7s ] thds: 2 tps: 22333.99 qps: 22333.99 (r/w/o: 0.00/22333.99/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00

The average latency is threads = 1 is 99us, in threads = 2 is 88us, it's about 10% regression..

How to repeat:
run sysbench command like below:

running in low concurrency set threads = 1
sysbench oltp_insert --mysql-host=xxx.xxx.xxx.xx --mysql-port=2255 --mysql-password=xxxxxxxx --mysql-user=replicator --tables=1 --table_size=100 --threads=1 --report-interval=1 --rand-type=uniform prepare

sysbench oltp_insert --mysql-host=xxx.xxx.xxx.xx --mysql-port=2255 --mysql-password=xxxxxxxx --mysql-user=replicator --tables=1 --table_size=100 --threads=1 --time=12000 --report-interval=1 --rand-type=uniform --mysql-ignore-errors=all run

running in low concurrency set threads = 2
sysbench oltp_insert --mysql-host=xxx.xxx.xxx.xx --mysql-port=2255 --mysql-password=xxxxxxxx --mysql-user=replicator --tables=1 --table_size=100 --threads=1 --report-interval=1 --rand-type=uniform prepare

sysbench oltp_insert --mysql-host=xxx.xxx.xxx.xx --mysql-port=2255 --mysql-password=xxxxxxxx --mysql-user=replicator --tables=1 --table_size=100 --threads=2 --time=12000 --report-interval=1 --rand-type=uniform --mysql-ignore-errors=all run

Suggested fix:
the user thread can spin if the cpu is almost idle.
set default srv_log_spin_cpu_abs_lwm => 0
[13 Mar 2024 8:32] Zongzhi Chen
update the benchmark result from sysbench test result

The 95% latency is threads = 1 is 140us, in threads = 2 is 120us, it's about 10% regression..
[14 Mar 2024 11:40] MySQL Verification Team
Hello Zongzhi Chen,

Thank you for the report and feedback.
Could you please share the MySQL Server config file(if not on default), OS details(RAM/CPU etc., file system) used for the tests? Thank you.

regards,
Umesh
[14 Mar 2024 21:22] Zongzhi Chen
The os is linux kernel 4.19, the filesystem is ext4.
And below is my server config file

# general
max_connections=4000
table_open_cache=8000
table_open_cache_instances=16
back_log=1500
default_password_lifetime=0
ssl=0
performance_schema=OFF
max_prepared_stmt_count=512000
character_set_server=latin1
collation_server=latin1_swedish_ci
transaction_isolation=REPEATABLE-READ

# files
innodb_file_per_table
innodb_log_file_size=1024M
innodb_log_files_in_group=3
innodb_open_files=4000

# buffers
innodb_buffer_pool_size=32000M
innodb_buffer_pool_instances=16
innodb_log_buffer_size=64M

# tune
innodb_doublewrite=1
innodb_thread_concurrency=0
innodb_flush_log_at_trx_commit=1
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10

join_buffer_size=32K
sort_buffer_size=32K
innodb_use_native_aio=1
innodb_stats_persistent=1
innodb_spin_wait_delay=6

innodb_max_purge_lag_delay=300000
innodb_max_purge_lag=0
#innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_checksum_algorithm=crc32
innodb_io_capacity=4000
innodb_io_capacity_max=20000
innodb_lru_scan_depth=9000
innodb_change_buffering=none
innodb_read_only=0
innodb_page_cleaners=4
innodb_undo_log_truncate=off

# perf special
innodb_adaptive_flushing=1
innodb_flush_neighbors=0
innodb_read_io_threads=16
innodb_write_io_threads=16
innodb_purge_threads=4
innodb_adaptive_hash_index=0

# monitoring
innodb_monitor_enable='%'

performance_schema = OFF
innodb_random_read_ahead = ON

#wxf added/modify
#tls_version=""
#innodb_log_recent_closed_size=32M
#innodb_log_recent_written_size=16M
innodb_buffer_pool_size=128G
innodb_buffer_pool_instances=8
loose_innodb_trx_hash_mutex_disabled=OFF
#loose_innodb_open_files=65535
#innodb_log_file_split_num=4

#innodb_buffer_pool_size=64M
#innodb_buffer_pool_instances=1
#performance_schema=on
core-file
log_error_verbosity=3
#loose_innodb_polar_log_file_extend_method=0
#ssl=1
performance_schema=OFF
innodb_undo_tablespaces=95

#rds_audit_log_enabled=ON

#innodb_buffer_pool_size=128M
#innodb_buffer_pool_instances=1
#innodb_undo_tablespaces=2
#innodb_read_io_threads=1
#innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_buffer_pool_size=512G
innodb_buffer_pool_instances=32
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_flush_log_at_trx_commit=0
loose_innodb_polar_blink_tree=OFF
loose_innodb_lock_sys_table_partition=64
loose_innodb_polar_copy_page_pool_size=16G
loose_innodb_polar_log_auto_purge=ON
#performance_schema=ON

#loose_innodb_polar_blink_tree=ON
performance_schema=OFF
#loose_query_cache_type=ON
#loose_innodb_csn=ON
#loose_innodb_csn_lockfree=ON
loose_innodb_lock_sys_rec_partition=128
loose_innodb_trx_sys_opt=ON
loose_innodb_log_recent_written_size=32M
loose_innodb_log_recent_closed_size=64M
loose_innodb_polar_bwtree_enable=ON
innodb_buffer_pool_size=32G
innodb_buffer_pool_instances=8

#innodb_buffer_pool_size=256M
#innodb_undo_tablespaces=2
loose_innodb_polar_bwtree_enable=OFF
loose_innodb_polar_blink_tree=ON
loose_innodb_polar_log_max_reserved_files_no_slave=8
loose_innodb_random_read_ahead=OFF
loose_innodb_read_ahead_threshold=0
loose_innodb_page_cleaners=16
innodb_flush_log_at_trx_commit=1

loose_table_open_cache=500000
loose_innodb_lru_scan_depth=1024

loose_innodb_buffer_pool_size=32G
loose_innodb_buffer_pool_instances=256
loose_innodb_page_cleaners=32
loose_innodb_write_io_threads=32

loose_innodb_lru_manager_threads=32

innodb_io_capacity=40000
innodb_io_capacity_max=80000
innodb_purge_threads=64

# loose_innodb_buffer_pool_size=32G
# loose_innodb_buffer_pool_instances=32
# loose_innodb_page_cleaners=32
# loose_innodb_write_io_threads=32
#
loose_wide_primary_fast_lookup=ON
loose_eq_range_use_index_stats=ON

# innodb_page_size=4k
loose_innodb_doublewrite=0

loose_performance_schema=ON
server-id=1

skip-log-bin
[19 Mar 2024 9:35] MySQL Verification Team
Hello chen zongzhi,

Thank you for the report and feedback.

regards,
Umesh
[19 Mar 2024 9:41] MySQL Verification Team
--
 hostnamectl
   Static hostname: support-cluster03
         Icon name: computer-vm
           Chassis: vm
        Machine ID: 9c6c496a290d4ffba5b26c3cd0c3724c
           Boot ID: 12fc6a96abde4caf9f2eb13809e48cc8
    Virtualization: kvm
  Operating System: Oracle Linux Server 7.9
       CPE OS Name: cpe:/o:oracle:linux:7:9:server
            Kernel: Linux 5.4.17-2102.206.1.el7uek.x86_64
      Architecture: x86-64

rm -rf /dev/shm/ushastry/14331/
bin/mysqld --defaults-file=my14331.cnf --initialize-insecure --basedir=$PWD --datadir=/dev/shm/ushastry/14331 --log-error-verbosity=3
bin/mysqld_safe --defaults-file=my14331.cnf --mysqld-version='' --basedir=$PWD --datadir=/dev/shm/ushastry/14331/ --core-file --socket=/tmp/mysql.sock  --port=3306 --log-error=/dev/shm/ushastry/14331/log.err --mysqlx=0 --log-error-verbosity=3  --secure-file-priv="" --local-infile=1  2>&1 &

 bin/mysql -uroot -S /tmp/mysql.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.36 MySQL Community Server - GPL

Copyright (c) 2000, 2024, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create database sbtest;
Query OK, 1 row affected (0.00 sec)

### thread 1

 bin/sysbench ./share/sysbench/oltp_insert.lua --mysql-host=127.0.0.1 --mysql-user=root --tables=1 --table_size=100 --threads=1 --report-interval=1 --rand-type=uniform prepare
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Creating table 'sbtest1'...
Inserting 100 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

 bin/sysbench ./share/sysbench/oltp_insert.lua --mysql-host=127.0.0.1 --mysql-user=root --tables=1 --table_size=100 --threads=1 --time=12000 --report-interval=1 --rand-type=uniform --mysql-ignore-errors=all run
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 1
Report intermediate results every 1 second(s)
Initializing random number generator from current time

Initializing worker threads...

Threads started!

[ 1s ] thds: 1 tps: 10294.74 qps: 10294.74 (r/w/o: 0.00/10294.74/0.00) lat (ms,95%): 0.11 err/s: 0.00 reconn/s: 0.00
[ 2s ] thds: 1 tps: 12432.53 qps: 12432.53 (r/w/o: 0.00/12432.53/0.00) lat (ms,95%): 0.10 err/s: 0.00 reconn/s: 0.00
[ 3s ] thds: 1 tps: 13348.02 qps: 13348.02 (r/w/o: 0.00/13348.02/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00
[ 4s ] thds: 1 tps: 13241.85 qps: 13241.85 (r/w/o: 0.00/13241.85/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00
[ 5s ] thds: 1 tps: 13422.93 qps: 13422.93 (r/w/o: 0.00/13422.93/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00
[ 6s ] thds: 1 tps: 12995.03 qps: 12995.03 (r/w/o: 0.00/12995.03/0.00) lat (ms,95%): 0.10 err/s: 0.00 reconn/s: 0.00
[ 7s ] thds: 1 tps: 13305.18 qps: 13305.18 (r/w/o: 0.00/13305.18/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00
[ 8s ] thds: 1 tps: 13115.74 qps: 13115.74 (r/w/o: 0.00/13115.74/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00
[ 9s ] thds: 1 tps: 13010.42 qps: 13010.42 (r/w/o: 0.00/13010.42/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 1 tps: 12980.74 qps: 12980.74 (r/w/o: 0.00/12980.74/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00
[ 11s ] thds: 1 tps: 12931.98 qps: 12931.98 (r/w/o: 0.00/12931.98/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00
[ 12s ] thds: 1 tps: 13006.29 qps: 13006.29 (r/w/o: 0.00/13006.29/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00
[ 13s ] thds: 1 tps: 12991.93 qps: 12991.93 (r/w/o: 0.00/12991.93/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00
[ 14s ] thds: 1 tps: 12938.04 qps: 12938.04 (r/w/o: 0.00/12938.04/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00
[ 15s ] thds: 1 tps: 12932.89 qps: 12932.89 (r/w/o: 0.00/12932.89/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00

### thread 2

 bin/sysbench ./share/sysbench/oltp_insert.lua --mysql-host=127.0.0.1 --mysql-user=root --tables=1 --table_size=100 --threads=1 --report-interval=1 --rand-type=uniform prepare
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Creating table 'sbtest1'...
Inserting 100 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

 bin/sysbench ./share/sysbench/oltp_insert.lua --mysql-host=127.0.0.1 --mysql-user=root --tables=1 --table_size=100  --threads=2 --time=12000 --report-interval=1 --rand-type=uniform --mysql-ignore-errors=all run
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 2
Report intermediate results every 1 second(s)
Initializing random number generator from current time

Initializing worker threads...

Threads started!

[ 1s ] thds: 2 tps: 18187.13 qps: 18187.13 (r/w/o: 0.00/18187.13/0.00) lat (ms,95%): 0.14 err/s: 0.00 reconn/s: 0.00
[ 2s ] thds: 2 tps: 21062.53 qps: 21062.53 (r/w/o: 0.00/21062.53/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[ 3s ] thds: 2 tps: 24334.58 qps: 24334.58 (r/w/o: 0.00/24334.58/0.00) lat (ms,95%): 0.10 err/s: 0.00 reconn/s: 0.00
[ 4s ] thds: 2 tps: 23482.48 qps: 23482.48 (r/w/o: 0.00/23482.48/0.00) lat (ms,95%): 0.11 err/s: 0.00 reconn/s: 0.00
[ 5s ] thds: 2 tps: 23432.05 qps: 23432.05 (r/w/o: 0.00/23432.05/0.00) lat (ms,95%): 0.11 err/s: 0.00 reconn/s: 0.00
[ 6s ] thds: 2 tps: 20356.19 qps: 20356.19 (r/w/o: 0.00/20356.19/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[ 7s ] thds: 2 tps: 23677.81 qps: 23677.81 (r/w/o: 0.00/23677.81/0.00) lat (ms,95%): 0.10 err/s: 0.00 reconn/s: 0.00
[ 8s ] thds: 2 tps: 23776.14 qps: 23776.14 (r/w/o: 0.00/23776.14/0.00) lat (ms,95%): 0.10 err/s: 0.00 reconn/s: 0.00
[ 9s ] thds: 2 tps: 19075.24 qps: 19075.24 (r/w/o: 0.00/19075.24/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 2 tps: 22349.89 qps: 22349.89 (r/w/o: 0.00/22349.89/0.00) lat (ms,95%): 0.11 err/s: 0.00 reconn/s: 0.00
[ 11s ] thds: 2 tps: 20580.07 qps: 20580.07 (r/w/o: 0.00/20580.07/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[ 12s ] thds: 2 tps: 22338.79 qps: 22338.79 (r/w/o: 0.00/22338.79/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[ 13s ] thds: 2 tps: 22318.07 qps: 22318.07 (r/w/o: 0.00/22318.07/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[ 14s ] thds: 2 tps: 20605.99 qps: 20605.99 (r/w/o: 0.00/20605.99/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[ 15s ] thds: 2 tps: 19520.43 qps: 19520.43 (r/w/o: 0.00/19520.43/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[28 May 2024 12:47] Jakub Lopuszanski
Posted by developer:
 
Hello, I feel a bit confused.

I am using following awk script to aggregate together all the per-second reports from single sysbench run into one total:
```
$ awk '/thds/ {thds=$5; tps=$7; txs+=tps; seconds++} END {printf("thds=%d duration=%ds tps=%d latency=%dus\n", thds, seconds, txs/seconds, 1000000*seconds*thds/txs)}'
```
this is because we can see huge variance from second to second as the test runs, so we need to look at a summary to get anything statistically significant.

The reporter saw that latency for threads=1 is larger than for threads=2. Indeed their sysbench report can be summarized as:
thds=1 duration=8s tps=10091 latency=99us
thds=2 duration=6s tps=22462 latency=89us
Matching the claims of the reporter:
> The average latency is threads = 1 is 99us, in threads = 2 is 88us, it's about 10% regression..

What I feel is missing from the report, though, is the number of CPUs on which mysqld was executing. 
This is relevant, because the formulas used in redo log writer take number of CPUs into account.

But then, Umesh Shastry, when trying to reproduce this issue on 19-Mar-2024 09:41:33, sees a different situation:
run on 8.0.36:
thds=1 duration=15s tps=12863 latency=77us
thds=2 duration=15s tps=21673 latency=92us
run on 8.0.11:
thds=1 duration=27s tps=13346 latency=74us
thds=2 duration=25s tps=24910 latency=80us

Also, first few seconds in runs for 8.0.11 by Umesh seem to be very slow, but even if I exclude them, I get:
thds=1 duration=20s tps=14172 latency=70us
thds=2 duration=21s tps=25509 latency=78us

To me it looks like we are unable to reproduce the issue.

I would like to know:
- what was the number of CPUs used by reporter
- what was the number of CPUs used by Umesh
- can we reproduce it or not?
[20 Aug 2024 8:18] MySQL Verification Team
Hello Zongzhi Chen,

Thank you for the report and feedback.
It seems that I made a mistake while verifying as developers concluded that my verification results were opposite to the what you have claimed in the report.  Could you please respond to developer's comment?
We are interested to know:

- What was the number of CPUs used by you?
-  If possible, could you please also share the numbers with the suggested fix you provided? Thank you.

regards,
Umesh
[21 Sep 2024 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".