Bug #75570 | semi-sync replication performance degrades with a high number of threads | ||
---|---|---|---|
Submitted: | 21 Jan 2015 7:52 | Modified: | 27 Apr 2015 8:27 |
Reporter: | Rene' Cannao' | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 5.6.20 | OS: | Linux |
Assigned to: | CPU Architecture: | Any |
[21 Jan 2015 7:52]
Rene' Cannao'
[21 Jan 2015 7:54]
Rene' Cannao'
output of pt-pmp
Attachment: semisync_3000threads.tar.gz (application/x-gzip, text), 106.04 KiB.
[21 Jan 2015 16:28]
MySQL Verification Team
Hello Rene, Thank you for the report and test case. Could you please provide config files from both master/slave used during benchmarks? Thanks, Umesh
[21 Jan 2015 19:52]
Rene' Cannao'
master config file
Attachment: bug75570_master_config.txt (text/plain), 15.66 KiB.
[21 Jan 2015 19:52]
Rene' Cannao'
slave config file
Attachment: bug75570_slave_config.txt (text/plain), 15.66 KiB.
[21 Jan 2015 20:02]
Rene' Cannao'
Hi Umesh, config files attached. Few notes: - the servers are running Percona Server, but I don't think is relevant in this context ; - the way we hit this bug is not because we were running a high number of threads, but any sudden spike in the application that was leading more threads to be executed in parallel, was also causing mysqld to reduce throughput and in turn the application was opening more connections slowing down mysqld even more: in this way threads_running jumped from few hundreds to thousands in short time; - when such scenarios happen the system is busy doing context switches and throughput decrease drastically: as such, we believe that running spin loops is a better usage of CPU
[22 Jan 2015 4:23]
MySQL Verification Team
Thank you Rene for providing the requested details. Thanks, Umesh
[22 Jan 2015 8:36]
Yoshinori Matsunobu
I encountered similar issue before. Using 5.7 (Loss Less Semisync) improved throughput a lot in our case. You may also try setting sync_binlog=0 and innodb_flush_log_at_trx_commit=0|2 on master.
[6 Feb 2015 11:24]
Libing Song
Hi Rene, It seems the slave's configuration files you uploaded is not correct. In the file I saw: rpl_semi_sync_slave_enabled OFF that means semisync slave plugin is not enabled. Could you please check it?
[16 Feb 2015 13:39]
Rene' Cannao'
Hi Libing, I was running other tests on the same boxes after submitting the bug report, and indeed I had disabled semi-sync . Sorry for this. I am attaching new global variables for both master and slave. The only difference is rpl_semi_sync_slave_enabled on slave. I re-run the test with this change to confirm that the behavior is the same initially reported in the bug. Again, sorry for the wrong variables file
[16 Feb 2015 13:39]
Rene' Cannao'
master config file
Attachment: glovar_master.txt (text/plain), 13.94 KiB.
[16 Feb 2015 13:40]
Rene' Cannao'
slave config
Attachment: glovar_slave.txt (text/plain), 13.95 KiB.
[19 Feb 2015 12:58]
MySQL Verification Team
My apologies for the delay in getting back to you on this. Could you please share exact sysbench stats with semi-sync and with async? Also, could you please provide exact my.cnf from both master/slave(pls make it private after uploading) instead of show global variables? Will join my test details for your reference just in case I'm missing anything(Had to scale down test case to fit test environment( innodb buffer pool size adjusted to 150G instead of 283G, sysbench max time changed to 5min instead of 2hrs in the test case)). Regards, Umesh
[19 Feb 2015 12:59]
MySQL Verification Team
test results
Attachment: 75570_results.txt (text/plain), 6.76 KiB.
[19 Feb 2015 14:14]
MySQL Verification Team
Also, could you please try out official MySQL builds instead of Percona server build and share sysbench results with exact Semi-sync replication setup(conf file, and whatever steps followed later)? Thanks in advance! Regards, Umesh
[19 Feb 2015 15:19]
Rene' Cannao'
sysbench results
Attachment: 75570_sysbench.txt (text/plain), 10.31 KiB.
[19 Feb 2015 15:46]
Rene' Cannao'
Hi Umesh, I attached config files and sysbench results with details. Sorry for not attaching config files from the very beginning, but when I used to work at MySQL Support often config files and global variables were not identical, so I tend to ask and provide global variables and not config files. In fact, in our setup there is no reference to semi-sync in config files: semi-sync is only enabled when the server joins production. With regards to innodb_buffer_pool_size, this should not be relevant. I reproduced the issue with innodb_buffer_pool_size=16G (see attached config files), and I am confident I can reproduced it with an even smaller buffer pool. sysbench max time is not relevant either, and 5 minutes are more than enough. In fact the sysbench results I attached now ran for 5 minutes only. In my initial test I run if for longer time (2 hrs) because I was trying to understand if performance just degrade or if transactions/sec stalls completely in the long run. Something relevant that needs attention is rpl_semi_sync_master_timeout: we increase it to a very high value to not revert to asynchronous replication. With rpl_semi_sync_master_timeout=10000 (the default) performance are worse than without semi-sync, yet a lot better than with a very high value of rpl_semi_sync_master_timeout because threads reverts to asynchronous replication (so no semi-sync anymore). In your results I see that with semi-sync enabled you have a max response time of 1890.85ms , therefore in your test rpl_semi_sync_master_timeout doesn't seem to be an issue. I will try out official MySQL builds instead of Percona server build and share new sysbench results.
[20 Feb 2015 4:52]
MySQL Verification Team
Thank you Rene for the sysbench results and conf files. I agree that global variables etc is enough in most of the cases but the list you sent had other extra variables(Percona server) and had to filter those unavailable variables. We(including dev's) are working on this issue and would keep you posted on this. Also, please do share official MySQL build's results when you are done with sysbench tests.
[20 Feb 2015 11:25]
MySQL Verification Team
Hello Rene, This is just to update you that the bug is verified from our end but we would still be interested in seeing your benchmarks stats on our official MySQL builds. Please share the details once you are done with tests. Regards, Umesh
[9 Mar 2015 10:35]
Venkatesh Duggirala
Post from the Developer: ------------------------ Below attachment is the preview of the patch. Feel free to give it a spin and let us know if you find any issues in your environment. Regards, Venkatesh
[9 Mar 2015 10:37]
Venkatesh Duggirala
MySQL 5.6 Preview Patch
Attachment: bug75570-mysql-5.6-patch.diff (text/x-patch), 9.60 KiB.
[27 Apr 2015 8:27]
David Moss
Thank you for your feedback. This has been fixed in upcoming versions and the following was added to the changelog: When using semi-sync replication performance was degrading when the number of threads increased beyond a certain threshold. To improve performance, now only the thread which is committing is responsible for deleting the active transaction node. All other operations do not touch this active transaction list.
[23 Jun 2015 15:41]
Laurynas Biveinis
commit a6eed3a7fe61effa7f49cc3990adc44fba9c56a4 Author: Venkatesh Duggirala <venkatesh.duggirala@oracle.com> Date: Tue Mar 10 12:32:15 2015 +0530 Bug#20574628: SEMI-SYNC REPLICATION PERFORMANCE DEGRADES WITH A HIGH NUMBER OF THREADS Problem: when semi-sync replication is enabled and when the number of threads running is increasing (going beyond a threshold), performance of the system is going drastically down. Analysis: A transaction in semi-sync replication waits for the ACK inside ReplSemiSyncMaster::commitTrx on a condition variable COND_binlog_send_. In reportReplyBinlog function, when server receives an ACK, it sends condition broadcast to all the threads that are waiting on COND_binlog_send_. Each transaction thread waiting an ACK does the follow (simplified): - hold a lock LOCK_binlog_ - in a loop ( while (is_on()) ) : - compare the ACK position with its own position; - if the ACK is still behind, it waits on a condition variable COND_binlog_send_ - if the ACK is ahead or equal, it exits this loop and proceeds with the next step - unlock the lock LOCK_binlog_ The problem here with this design is that all the threads wake up and while it is likely that only few will return to the application and most of them will go back to wait on the same condition variable. This creates unnecessary context switches and a lot of LLC cache thrashing in multi-cpu systems which reduces the throughput of the system. Fix: The approach splits the *one* condvar used for all threads to wait into *several* condvars. Each of the condvars is associated with a position range. Thence only those that fall threads that are waiting for a given acknowledged range are awaken when the ACK comes back (as opposed to waking up all transactions, regardless whether the ACK affects them or not).
[23 Jun 2015 16:53]
Laurynas Biveinis
commit d17d5f47d9f8f11fb0677cbb7620c2d0bf7ca36c Author: Venkatesh Duggirala <venkatesh.duggirala@oracle.com> Date: Mon Mar 30 13:32:57 2015 +0530 Bug#20574628: SEMI-SYNC REPLICATION PERFORMANCE DEGRADES WITH A HIGH NUMBER OF THREADS Fixing a post push pb2 valgrind failure: ======================================== Analysis: In commitTrx(), when a thread is waiting for the signal on a condition variable (entry->cond) after releasing LOCK_binlog_, it can happen that this 'entry' node in active transaction list deleted by another parallel thread which is doing semi sync off (::disableMaster or ::switch_off or resetMaster). This leads to invalid memory read issue. Fix: Now with the new design, only the thread who is doing commit is responsible for deleting the active transaction node. All other operations (disableMaster, switch_off and resetMaster) will not delete nodes from the active transaction list. If disableMaster and resetMaster finds that there are no nodes in the list, it will delete active_tranxs_ node.
[23 Jun 2015 16:54]
Laurynas Biveinis
commit a0e941947efe90588d86b39b4d3ec347ee9c7597 Author: Venkatesh Duggirala <venkatesh.duggirala@oracle.com> Date: Tue Mar 31 11:18:30 2015 +0530 Bug#20574628: SEMI-SYNC REPLICATION PERFORMANCE DEGRADES WITH A HIGH NUMBER OF THREADS Fixing a post push pb2 valgrind failure: ======================================== Rpl_semi_sync::resetMaster does not have to do active_tranxs_ cleanup
[20 Feb 2016 17:43]
MySQL Verification Team
This was fixed in 5.6.25. https://dev.mysql.com/doc/relnotes/mysql/5.6/en/news-5-6-25.html "Replication: When using semisynchronous replication performance was degrading when the number of threads increased beyond a certain threshold. To improve performance, now only the thread which is committing is responsible for deleting the active transaction node. All other operations do not touch this active transaction list. (Bug #75570, Bug #20574628)"