Bug #114331 | MySQL get high insert latency in low concurrency | ||
---|---|---|---|
Submitted: | 12 Mar 2024 21:53 | Modified: | 20 Sep 2024 8:18 |
Reporter: | Zongzhi Chen (OCA) | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S5 (Performance) |
Version: | 8.0.*, 8.0.36 | OS: | Linux |
Assigned to: | CPU Architecture: | x86 | |
Tags: | performance, REDO Log |
[12 Mar 2024 21:53]
Zongzhi Chen
[13 Mar 2024 8:32]
Zongzhi Chen
update the benchmark result from sysbench test result The 95% latency is threads = 1 is 140us, in threads = 2 is 120us, it's about 10% regression..
[14 Mar 2024 11:40]
MySQL Verification Team
Hello Zongzhi Chen, Thank you for the report and feedback. Could you please share the MySQL Server config file(if not on default), OS details(RAM/CPU etc., file system) used for the tests? Thank you. regards, Umesh
[14 Mar 2024 21:22]
Zongzhi Chen
The os is linux kernel 4.19, the filesystem is ext4. And below is my server config file # general max_connections=4000 table_open_cache=8000 table_open_cache_instances=16 back_log=1500 default_password_lifetime=0 ssl=0 performance_schema=OFF max_prepared_stmt_count=512000 character_set_server=latin1 collation_server=latin1_swedish_ci transaction_isolation=REPEATABLE-READ # files innodb_file_per_table innodb_log_file_size=1024M innodb_log_files_in_group=3 innodb_open_files=4000 # buffers innodb_buffer_pool_size=32000M innodb_buffer_pool_instances=16 innodb_log_buffer_size=64M # tune innodb_doublewrite=1 innodb_thread_concurrency=0 innodb_flush_log_at_trx_commit=1 innodb_max_dirty_pages_pct=90 innodb_max_dirty_pages_pct_lwm=10 join_buffer_size=32K sort_buffer_size=32K innodb_use_native_aio=1 innodb_stats_persistent=1 innodb_spin_wait_delay=6 innodb_max_purge_lag_delay=300000 innodb_max_purge_lag=0 #innodb_flush_method=O_DIRECT_NO_FSYNC innodb_checksum_algorithm=crc32 innodb_io_capacity=4000 innodb_io_capacity_max=20000 innodb_lru_scan_depth=9000 innodb_change_buffering=none innodb_read_only=0 innodb_page_cleaners=4 innodb_undo_log_truncate=off # perf special innodb_adaptive_flushing=1 innodb_flush_neighbors=0 innodb_read_io_threads=16 innodb_write_io_threads=16 innodb_purge_threads=4 innodb_adaptive_hash_index=0 # monitoring innodb_monitor_enable='%' performance_schema = OFF innodb_random_read_ahead = ON #wxf added/modify #tls_version="" #innodb_log_recent_closed_size=32M #innodb_log_recent_written_size=16M innodb_buffer_pool_size=128G innodb_buffer_pool_instances=8 loose_innodb_trx_hash_mutex_disabled=OFF #loose_innodb_open_files=65535 #innodb_log_file_split_num=4 #innodb_buffer_pool_size=64M #innodb_buffer_pool_instances=1 #performance_schema=on core-file log_error_verbosity=3 #loose_innodb_polar_log_file_extend_method=0 #ssl=1 performance_schema=OFF innodb_undo_tablespaces=95 #rds_audit_log_enabled=ON #innodb_buffer_pool_size=128M #innodb_buffer_pool_instances=1 #innodb_undo_tablespaces=2 #innodb_read_io_threads=1 #innodb_flush_method=O_DIRECT_NO_FSYNC innodb_buffer_pool_size=512G innodb_buffer_pool_instances=32 innodb_flush_method=O_DIRECT_NO_FSYNC innodb_flush_log_at_trx_commit=0 loose_innodb_polar_blink_tree=OFF loose_innodb_lock_sys_table_partition=64 loose_innodb_polar_copy_page_pool_size=16G loose_innodb_polar_log_auto_purge=ON #performance_schema=ON #loose_innodb_polar_blink_tree=ON performance_schema=OFF #loose_query_cache_type=ON #loose_innodb_csn=ON #loose_innodb_csn_lockfree=ON loose_innodb_lock_sys_rec_partition=128 loose_innodb_trx_sys_opt=ON loose_innodb_log_recent_written_size=32M loose_innodb_log_recent_closed_size=64M loose_innodb_polar_bwtree_enable=ON innodb_buffer_pool_size=32G innodb_buffer_pool_instances=8 #innodb_buffer_pool_size=256M #innodb_undo_tablespaces=2 loose_innodb_polar_bwtree_enable=OFF loose_innodb_polar_blink_tree=ON loose_innodb_polar_log_max_reserved_files_no_slave=8 loose_innodb_random_read_ahead=OFF loose_innodb_read_ahead_threshold=0 loose_innodb_page_cleaners=16 innodb_flush_log_at_trx_commit=1 loose_table_open_cache=500000 loose_innodb_lru_scan_depth=1024 loose_innodb_buffer_pool_size=32G loose_innodb_buffer_pool_instances=256 loose_innodb_page_cleaners=32 loose_innodb_write_io_threads=32 loose_innodb_lru_manager_threads=32 innodb_io_capacity=40000 innodb_io_capacity_max=80000 innodb_purge_threads=64 # loose_innodb_buffer_pool_size=32G # loose_innodb_buffer_pool_instances=32 # loose_innodb_page_cleaners=32 # loose_innodb_write_io_threads=32 # loose_wide_primary_fast_lookup=ON loose_eq_range_use_index_stats=ON # innodb_page_size=4k loose_innodb_doublewrite=0 loose_performance_schema=ON server-id=1 skip-log-bin
[19 Mar 2024 9:35]
MySQL Verification Team
Hello chen zongzhi, Thank you for the report and feedback. regards, Umesh
[19 Mar 2024 9:41]
MySQL Verification Team
-- hostnamectl Static hostname: support-cluster03 Icon name: computer-vm Chassis: vm Machine ID: 9c6c496a290d4ffba5b26c3cd0c3724c Boot ID: 12fc6a96abde4caf9f2eb13809e48cc8 Virtualization: kvm Operating System: Oracle Linux Server 7.9 CPE OS Name: cpe:/o:oracle:linux:7:9:server Kernel: Linux 5.4.17-2102.206.1.el7uek.x86_64 Architecture: x86-64 rm -rf /dev/shm/ushastry/14331/ bin/mysqld --defaults-file=my14331.cnf --initialize-insecure --basedir=$PWD --datadir=/dev/shm/ushastry/14331 --log-error-verbosity=3 bin/mysqld_safe --defaults-file=my14331.cnf --mysqld-version='' --basedir=$PWD --datadir=/dev/shm/ushastry/14331/ --core-file --socket=/tmp/mysql.sock --port=3306 --log-error=/dev/shm/ushastry/14331/log.err --mysqlx=0 --log-error-verbosity=3 --secure-file-priv="" --local-infile=1 2>&1 & bin/mysql -uroot -S /tmp/mysql.sock Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 8 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> create database sbtest; Query OK, 1 row affected (0.00 sec) ### thread 1 bin/sysbench ./share/sysbench/oltp_insert.lua --mysql-host=127.0.0.1 --mysql-user=root --tables=1 --table_size=100 --threads=1 --report-interval=1 --rand-type=uniform prepare sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2) Creating table 'sbtest1'... Inserting 100 records into 'sbtest1' Creating a secondary index on 'sbtest1'... bin/sysbench ./share/sysbench/oltp_insert.lua --mysql-host=127.0.0.1 --mysql-user=root --tables=1 --table_size=100 --threads=1 --time=12000 --report-interval=1 --rand-type=uniform --mysql-ignore-errors=all run sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2) Running the test with following options: Number of threads: 1 Report intermediate results every 1 second(s) Initializing random number generator from current time Initializing worker threads... Threads started! [ 1s ] thds: 1 tps: 10294.74 qps: 10294.74 (r/w/o: 0.00/10294.74/0.00) lat (ms,95%): 0.11 err/s: 0.00 reconn/s: 0.00 [ 2s ] thds: 1 tps: 12432.53 qps: 12432.53 (r/w/o: 0.00/12432.53/0.00) lat (ms,95%): 0.10 err/s: 0.00 reconn/s: 0.00 [ 3s ] thds: 1 tps: 13348.02 qps: 13348.02 (r/w/o: 0.00/13348.02/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00 [ 4s ] thds: 1 tps: 13241.85 qps: 13241.85 (r/w/o: 0.00/13241.85/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00 [ 5s ] thds: 1 tps: 13422.93 qps: 13422.93 (r/w/o: 0.00/13422.93/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00 [ 6s ] thds: 1 tps: 12995.03 qps: 12995.03 (r/w/o: 0.00/12995.03/0.00) lat (ms,95%): 0.10 err/s: 0.00 reconn/s: 0.00 [ 7s ] thds: 1 tps: 13305.18 qps: 13305.18 (r/w/o: 0.00/13305.18/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00 [ 8s ] thds: 1 tps: 13115.74 qps: 13115.74 (r/w/o: 0.00/13115.74/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00 [ 9s ] thds: 1 tps: 13010.42 qps: 13010.42 (r/w/o: 0.00/13010.42/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00 [ 10s ] thds: 1 tps: 12980.74 qps: 12980.74 (r/w/o: 0.00/12980.74/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00 [ 11s ] thds: 1 tps: 12931.98 qps: 12931.98 (r/w/o: 0.00/12931.98/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00 [ 12s ] thds: 1 tps: 13006.29 qps: 13006.29 (r/w/o: 0.00/13006.29/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00 [ 13s ] thds: 1 tps: 12991.93 qps: 12991.93 (r/w/o: 0.00/12991.93/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00 [ 14s ] thds: 1 tps: 12938.04 qps: 12938.04 (r/w/o: 0.00/12938.04/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00 [ 15s ] thds: 1 tps: 12932.89 qps: 12932.89 (r/w/o: 0.00/12932.89/0.00) lat (ms,95%): 0.09 err/s: 0.00 reconn/s: 0.00 ### thread 2 bin/sysbench ./share/sysbench/oltp_insert.lua --mysql-host=127.0.0.1 --mysql-user=root --tables=1 --table_size=100 --threads=1 --report-interval=1 --rand-type=uniform prepare sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2) Creating table 'sbtest1'... Inserting 100 records into 'sbtest1' Creating a secondary index on 'sbtest1'... bin/sysbench ./share/sysbench/oltp_insert.lua --mysql-host=127.0.0.1 --mysql-user=root --tables=1 --table_size=100 --threads=2 --time=12000 --report-interval=1 --rand-type=uniform --mysql-ignore-errors=all run sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2) Running the test with following options: Number of threads: 2 Report intermediate results every 1 second(s) Initializing random number generator from current time Initializing worker threads... Threads started! [ 1s ] thds: 2 tps: 18187.13 qps: 18187.13 (r/w/o: 0.00/18187.13/0.00) lat (ms,95%): 0.14 err/s: 0.00 reconn/s: 0.00 [ 2s ] thds: 2 tps: 21062.53 qps: 21062.53 (r/w/o: 0.00/21062.53/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00 [ 3s ] thds: 2 tps: 24334.58 qps: 24334.58 (r/w/o: 0.00/24334.58/0.00) lat (ms,95%): 0.10 err/s: 0.00 reconn/s: 0.00 [ 4s ] thds: 2 tps: 23482.48 qps: 23482.48 (r/w/o: 0.00/23482.48/0.00) lat (ms,95%): 0.11 err/s: 0.00 reconn/s: 0.00 [ 5s ] thds: 2 tps: 23432.05 qps: 23432.05 (r/w/o: 0.00/23432.05/0.00) lat (ms,95%): 0.11 err/s: 0.00 reconn/s: 0.00 [ 6s ] thds: 2 tps: 20356.19 qps: 20356.19 (r/w/o: 0.00/20356.19/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00 [ 7s ] thds: 2 tps: 23677.81 qps: 23677.81 (r/w/o: 0.00/23677.81/0.00) lat (ms,95%): 0.10 err/s: 0.00 reconn/s: 0.00 [ 8s ] thds: 2 tps: 23776.14 qps: 23776.14 (r/w/o: 0.00/23776.14/0.00) lat (ms,95%): 0.10 err/s: 0.00 reconn/s: 0.00 [ 9s ] thds: 2 tps: 19075.24 qps: 19075.24 (r/w/o: 0.00/19075.24/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00 [ 10s ] thds: 2 tps: 22349.89 qps: 22349.89 (r/w/o: 0.00/22349.89/0.00) lat (ms,95%): 0.11 err/s: 0.00 reconn/s: 0.00 [ 11s ] thds: 2 tps: 20580.07 qps: 20580.07 (r/w/o: 0.00/20580.07/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00 [ 12s ] thds: 2 tps: 22338.79 qps: 22338.79 (r/w/o: 0.00/22338.79/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00 [ 13s ] thds: 2 tps: 22318.07 qps: 22318.07 (r/w/o: 0.00/22318.07/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00 [ 14s ] thds: 2 tps: 20605.99 qps: 20605.99 (r/w/o: 0.00/20605.99/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00 [ 15s ] thds: 2 tps: 19520.43 qps: 19520.43 (r/w/o: 0.00/19520.43/0.00) lat (ms,95%): 0.12 err/s: 0.00 reconn/s: 0.00
[28 May 2024 12:47]
Jakub Lopuszanski
Posted by developer: Hello, I feel a bit confused. I am using following awk script to aggregate together all the per-second reports from single sysbench run into one total: ``` $ awk '/thds/ {thds=$5; tps=$7; txs+=tps; seconds++} END {printf("thds=%d duration=%ds tps=%d latency=%dus\n", thds, seconds, txs/seconds, 1000000*seconds*thds/txs)}' ``` this is because we can see huge variance from second to second as the test runs, so we need to look at a summary to get anything statistically significant. The reporter saw that latency for threads=1 is larger than for threads=2. Indeed their sysbench report can be summarized as: thds=1 duration=8s tps=10091 latency=99us thds=2 duration=6s tps=22462 latency=89us Matching the claims of the reporter: > The average latency is threads = 1 is 99us, in threads = 2 is 88us, it's about 10% regression.. What I feel is missing from the report, though, is the number of CPUs on which mysqld was executing. This is relevant, because the formulas used in redo log writer take number of CPUs into account. But then, Umesh Shastry, when trying to reproduce this issue on 19-Mar-2024 09:41:33, sees a different situation: run on 8.0.36: thds=1 duration=15s tps=12863 latency=77us thds=2 duration=15s tps=21673 latency=92us run on 8.0.11: thds=1 duration=27s tps=13346 latency=74us thds=2 duration=25s tps=24910 latency=80us Also, first few seconds in runs for 8.0.11 by Umesh seem to be very slow, but even if I exclude them, I get: thds=1 duration=20s tps=14172 latency=70us thds=2 duration=21s tps=25509 latency=78us To me it looks like we are unable to reproduce the issue. I would like to know: - what was the number of CPUs used by reporter - what was the number of CPUs used by Umesh - can we reproduce it or not?
[20 Aug 2024 8:18]
MySQL Verification Team
Hello Zongzhi Chen, Thank you for the report and feedback. It seems that I made a mistake while verifying as developers concluded that my verification results were opposite to the what you have claimed in the report. Could you please respond to developer's comment? We are interested to know: - What was the number of CPUs used by you? - If possible, could you please also share the numbers with the suggested fix you provided? Thank you. regards, Umesh
[21 Sep 2024 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".