Description:
Hello, dear verification team:
Here I find a case that transactions may commit slow. If we have already writen many binlog files (e.g. 20GB binlog files), and we now change configuration `binlog_expire_logs_seconds` as 1 dynamically, when the next binlog rotation happens, the transaction which is doing purge will waste too much time on purging expired binlog files, which I think is an issue of performance and should be improved.
With 28GB binlog files on NVME disk to purge, a 6.317686 sec slow query log is produced.
Please reference to `How to repeat` for detail.
How to repeat:
(1) turn on `slow_query_log` and modify `long_query_time` as `0.5`
(2) produce many binlog files with sysbench, up to more than 20GB (I produced binlog files of 28GB size) with oltp.lua of sysbench (very small transactions)
[yxx_sysbench_1@localhost sysbench]$ ./sysbench --test=./db/oltp.lua --oltp-auto-inc=off --oltp-table-size=1000000 --oltp-tables-count=2 --mysql-table-engine=innodb --mysql-user=root --mysql-password='db10$ZTE' --mysql-port=6324 --mysql-host=192.168.10.25 --mysql-db=sbtest --max-requests=0 --max-time=38800 --num-threads=200 --report-interval=60 run
sysbench 0.5: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 200
Report intermediate results every 60 second(s)
Random number generator seed is 0 and will be ignored
Threads started!
[ 60s] threads: 200, tps: 9267.25, reads/s: 129779.24, writes/s: 37078.51, response time: 36.61ms (95%)
[ 120s] threads: 200, tps: 9561.10, reads/s: 133857.07, writes/s: 38241.61, response time: 36.44ms (95%)
[ 180s] threads: 200, tps: 9588.68, reads/s: 134238.90, writes/s: 38354.73, response time: 36.23ms (95%)
[ 240s] threads: 200, tps: 9534.50, reads/s: 133493.20, writes/s: 38144.37, response time: 36.36ms (95%)
[ 300s] threads: 200, tps: 9571.07, reads/s: 133987.23, writes/s: 38277.25, response time: 36.10ms (95%)
[ 360s] threads: 200, tps: 9646.77, reads/s: 135056.62, writes/s: 38585.16, response time: 36.03ms (95%)
[ 420s] threads: 200, tps: 9637.35, reads/s: 134922.53, writes/s: 38550.22, response time: 35.95ms (95%)
[ 480s] threads: 200, tps: 9606.92, reads/s: 134484.78, writes/s: 38427.08, response time: 36.14ms (95%)
[ 540s] threads: 200, tps: 9635.87, reads/s: 134918.59, writes/s: 38547.80, response time: 36.07ms (95%)
[ 600s] threads: 200, tps: 9665.53, reads/s: 135308.70, writes/s: 38657.46, response time: 35.85ms (95%)
[ 660s] threads: 200, tps: 9544.77, reads/s: 133626.30, writes/s: 38179.42, response time: 36.48ms (95%)
[ 720s] threads: 200, tps: 9616.50, reads/s: 134633.91, writes/s: 38465.24, response time: 35.95ms (95%)
[ 780s] threads: 200, tps: 9658.38, reads/s: 135207.50, writes/s: 38635.58, response time: 35.94ms (95%)
[ 840s] threads: 200, tps: 9635.87, reads/s: 134915.54, writes/s: 38543.85, response time: 36.12ms (95%)
[ 900s] threads: 200, tps: 9636.08, reads/s: 134898.17, writes/s: 38542.95, response time: 35.96ms (95%)
^C
Every 2.0s: du -sh ../binlog Fri Aug 13 17:49:03 2021
28G ../binlog
(3) modify `binlog_expire_logs_seconds` as `1` dynamically:
mysql> set @@global.binlog_expire_logs_seconds = 1;
Query OK, 0 rows affected (0.00 sec)
mysql> select NOW();
+---------------------+
| NOW() |
+---------------------+
| 2021-08-13 18:08:55 |
+---------------------+
1 row in set (0.00 sec)
(4) slow query log is produced at the next binlog rotation, which is very slow (6.317686 sec):
# Time: 2021-08-13T18:09:39.961767+08:00
# User@Host: root[root] @ [192.168.10.40] Id: 25
# Query_time: 6.317686 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
SET timestamp=1628849373;
COMMIT;
Suggested fix:
I think we can make a single rotation_thread to do binlog purge, instead of worker threads which are executing sql commands.