Bug #95934 innodb_flush_log_at_trx_commit = 0 get performance regression on 8 core machine
Submitted: 23 Jun 2019 17:13 Modified: 5 Jul 2019 8:17
Reporter: chen zongzhi (OCA) Email Updates:
Status: Analyzing Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:8.0.* OS:Any
Assigned to: CPU Architecture:Any

[23 Jun 2019 17:13] chen zongzhi
Description:
In MySQL8.0.16, innodb_flush_log_at_trx_commit = 1 has a better performance than innodb_flush_log_at_trx_commit = 0 on 8 core environment.  This is really strange.

This is the log from the sysbench oltp_write_only case

[ 1350s ] thds: 512 tps: 3097.59 qps: 18586.74 (r/w/o: 0.00/12391.56/6195.18) lat (ms,95%): 530.08 err/s: 0.00 reconn/s: 0.00
[ 1360s ] thds: 512 tps: 2692.71 qps: 16162.08 (r/w/o: 0.00/10776.75/5385.33) lat (ms,95%): 549.52 err/s: 0.00 reconn/s: 0.00
[ 1370s ] thds: 512 tps: 3038.29 qps: 18223.25 (r/w/o: 0.00/12146.57/6076.68) lat (ms,95%): 458.96 err/s: 0.00 reconn/s: 0.00
[ 1380s ] thds: 512 tps: 3254.50 qps: 19521.99 (r/w/o: 0.00/13012.99/6509.00) lat (ms,95%): 450.77 err/s: 0.00 reconn/s: 0.00
[ 1390s ] thds: 512 tps: 3054.39 qps: 18325.65 (r/w/o: 0.00/12216.87/6108.78) lat (ms,95%): 539.71 err/s: 0.00 reconn/s: 0.00
[ 1400s ] thds: 512 tps: 3115.11 qps: 18689.79 (r/w/o: 0.00/12459.56/6230.23) lat (ms,95%): 484.44 err/s: 0.00 reconn/s: 0.00
[ 1410s ] thds: 512 tps: 2930.10 qps: 17586.11 (r/w/o: 0.00/11725.91/5860.20) lat (ms,95%): 520.62 err/s: 0.10 reconn/s: 0.00
[ 1420s ] thds: 512 tps: 2869.80 qps: 17217.22 (r/w/o: 0.00/11477.51/5739.71) lat (ms,95%): 549.52 err/s: 0.00 reconn/s: 0.00

This is the time we set the innodb_flush_log_at_trx_commit from 0 to 1.

[ 1430s ] thds: 512 tps: 3991.46 qps: 24055.68 (r/w/o: 0.00/16073.35/7982.33) lat (ms,95%): 484.44 err/s: 0.00 reconn/s: 0.00
[ 1440s ] thds: 512 tps: 11285.96 qps: 67733.48 (r/w/o: 0.00/45161.05/22572.43) lat (ms,95%): 78.60 err/s: 0.00 reconn/s: 0.00
[ 1450s ] thds: 512 tps: 11734.96 qps: 70404.68 (r/w/o: 0.00/46935.15/23469.53) lat (ms,95%): 77.19 err/s: 0.00 reconn/s: 0.00
[ 1460s ] thds: 512 tps: 11563.32 qps: 69378.01 (r/w/o: 0.00/46251.17/23126.84) lat (ms,95%): 80.03 err/s: 0.00 reconn/s: 0.00

We find that when the system is busy, I find that the recent_write buffer, however the log_writer thread have no chance to get the cpu time. It can't consume the data in recent_write buffer, and even the user thread get the cpu time, it can't write the data to recent_writter buffer since it is full. It seem the system get into a "deadlock"...

How to repeat:
Running the oltp_write_only sysbench 

Suggested fix:
I would suggest to give the log_writer thread a high priority..
[24 Jun 2019 13:02] Sinisa Milivojevic
Hello Mr. Zongzhi,

Thank you for your bug report, which is actually pointing to the performance regression.

Can you let us know how exactly did you run sysbench .... meaning all the arguments that you have used. Separate for the `prepare` and `run`.  Also, I do hope that I would manage to observe the same phenomena on my 6-core i7.

Last, but not least, can you provide us with a configuration for your server ???
[24 Jun 2019 15:54] chen zongzhi
Yes, below is the sysbench command I used, and I always use the configure the same is dimitric in cpu-bound test case

build/bin/sysbench oltp_write_only --mysql-host=xxxxxxxxxx --mysql-port=2250 --mysql-password=xxxxxxxx --mysql-user=replicator --tables=48 --table_size=500000 --threads=512 --report-interval=10 --rand-type=uniform prepare
build/bin/sysbench oltp_write_only --mysql-host=xxxxxxxxxx --mysql-port=2250 --mysql-password=xxxxxxxx --mysql-user=replicator --tables=48 --table_size=500000 --threads=512  --max-time=1800 --report-interval=10 --rand-type=uniform run
[25 Jun 2019 16:11] Sinisa Milivojevic
Hi,

I have tested your sysbench performance measurement with flush at trx commit both set to 0 and then to 1.

I have gathered all statistics and the average number of transactions per second and queries per second is practically the same. Difference is less than 1 %. Hence, it is inconclusive .....

I used my machine with 6 core i9, with SSD, and as I wrote, I have got exactly the same performance with both values for that system variable.

Hence, seems that your machine is even more powerful than mine.

Therefore, it seems to me that your recommendation of increasing thread priority can not be recommended for all setups. Hence, this looks to me just as a tuning tip for the high end machine under very large load.

Would you agree with me or would you like to expand on your recommendation on the manner in which our server would self-tune itself ......
[2 Jul 2019 20:45] chen zongzhi
Actually, I am simulate 8 core machine by using taskset the mysql pid to 8 cpus, and running the sysbench. This is very general way to sell 8 core machine in cloud environment.
Could you try to running the sysbench with taskset or using cgroup to limit the cpu.
[3 Jul 2019 13:01] Sinisa Milivojevic
Hi,

I truly do not understand what is it that you want me to do ...

You want me to test MySQL in cloud .... I am not connected to any cloud .....

Also, as I wrote, I tested MySQL on my iMac with i9 with 6 cores with SSD ....... 

I consider this good enough.
[3 Jul 2019 13:45] XIAOFEI WU
I got the similar result with @chen zongzhi. @Sinisa Milivojevic, can you show me the config files and sysbench comands you used? Thank you!
[3 Jul 2019 13:48] chen zongzhi
you mean you running sysbench in your iMac in MacOS?  not in the linux?
can you running the sysbench in linux, I think most of the MySQL user running in linux..
[4 Jul 2019 12:31] Sinisa Milivojevic
Hi,

I have forwarded your request to the group that is dealing with this matter. They will schedule the testing and when done, they will report back.
[5 Jul 2019 8:17] chen zongzhi
Thank you.
Waiting for your reply