Bug #69341 semi-sync replication is very slow with many clients
Submitted: 29 May 2013 0:55 Modified: 5 Jul 2013 10:13
Reporter: Yoshinori Matsunobu (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S5 (Performance)
Version:5.6.11 OS:Any
Assigned to: CPU Architecture:Any

[29 May 2013 0:55] Yoshinori Matsunobu
Description:
With semi-sync replication, I got very low throughput when running updates from multiple clients.

mysqlslap --host=<remote_master_host> --auto-generate-sql --number-of-queries=5000 --concurrency=1,10,20,50,100 --auto-generate-sql-load-type=write --engine=innodb

1 client: 1239 inserts/s
10 clients: 767 inserts/s
20 clients: 478 inserts/s
50 clients: 216 inserts/s
100 clients: 90.0 inserts/s

If disabling semisync (SET GLOBAL rpl_semi_sync_master_enabled = 0), I could get more than 10,000 inserts/s.

binlog sender (mysql_binlog_send()) calls ReplSemiSyncMaster::readSlaveReply() and it calls my_net_read() from slave. gdb said my_net_read() was called ~5000 times when updating (and auto-committing) 5000 times from single client. This is expected. When updating from 100 clients, my_net_read() was called 4371 times. I think this is too high, considering InnoDB supports group commit. I assume that mysql_binlog_send() calls before_send_event(), sending events, and calling after_send_event() per each transaction. before_send_event() and after_send_event() needs to get semisync mutex, so it conflicts with clients holding semisync mutex. I think this can be optimized, such as sending as many binlog events (transactions) as possible at once.

How to repeat:
Enable semi-sync replication, and run mysqlslap as described in "Description". Make sure to use InnoDB storage engine.
[29 May 2013 1:09] Yoshinori Matsunobu
semisync mutex is LOCK_binlog_ defined in semisync_master.h, and here is an example stack trace where mysql_binlog_send() was blocked.

#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007f5dd1aee407 in _L_lock_1143 () from /usr/local/fbcode/gcc-4.6.2-glibc-2.13/lib/libpthread.so.0
#2  0x00007f5dd1aee30a in __pthread_mutex_lock (mutex=0x7f5dd0744870) at pthread_mutex_lock.c:101
#3  0x00007f5dd0540ed9 in ReplSemiSyncMaster::updateSyncHeader (this=0x7f5dd0744820, packet=0x7f5b7b970000 "", log_file_name=0x7f5b7c3bbb89 "binary-5611.000003", log_file_pos=247858586, server_id=561111) at semisync_master.cc:912
#4  0x0000000000647183 in Binlog_transmit_delegate::before_send_event (this=0x128c840, thd=<optimized out>, flags=<optimized out>, packet=0x7f5b7c3bb150, log_file=0x7f5b7c3bbb80 "/binlogs/binary-5611.000003", log_pos=247858586) at sql/rpl_handler.cc:348
#5  0x00000000008406d3 in mysql_binlog_send (thd=0x7f5b72821000, log_ident=0x7f5b7c3bc6b0 "", pos=247858586, slave_gtid_executed=0x7f5b7c3bc8c0) at sql/rpl_master.cc:1434
[29 May 2013 11:58] Luis Soares
Hi Yoshinori! Thanks for reporting this and for providing detailed feedback.
[2 Jun 2013 14:14] Libing Song
Analysis
========
I tested mysql-5.6.10, mysql-5.6.11 and mysql-5.6.12(not GA yet).
I found it is very slow on the debug version of mysql-5.6.10 and mysql-5.6.11.
Not only semisync, but also async replication(even without binlog) is slow as
well.

I tested with 50000 inserts on 100 sessions. mysql-5.6.10 and 5.6.11 can handle
200 inserts/s. mysql-5.6.12 debug version has no performance problem. It can
handle 2083 inserts/s.

all release versions have no performance problem. They can handle almost
10000 inserts/s.

Just like Yoshinori said, the binlog group flush queue is very small, on
5.6.10 and 5.6.11 debug versions. Rpl_semi_sync_master_net_waits are about 49000.
But on 5.6.12 debug version, its value is about 12000.

5.6.11 release version has small binlog group flush queue too,
Rpl_semi_sync_master_net_waits is about 44000. But on 5.6.12 release version
it  is about 5000.

I finally found the patch for Bug#16720368 made the good performance on
mysql-5.6.12 debug version. It removed some debug code of innodb through undefining UNIV_SYNC_DEBUG. But the patch doesn't effect binlog flush queue
size. So I need some time to check which patch makes bigger binlog flush queue 
size.

It seems binlog flash queue size doesn't has obvious effect on the performance.
Because wait time is decided by replication time which is fixed.
[4 Jun 2013 4:21] Libing Song
test performance with mysqlslap

Attachment: my_test.sh (application/x-shellscript, text), 5.40 KiB.

[4 Jun 2013 4:25] Libing Song
Test Methods 
============ 
I have a small script to test the performance. It setup servers through a
test case. It is uploaded above.

run the script in mysql-test directory.
it has three options: simple, rpl, semisync.
you can run it like: sh my_test.sh semisync
You may encounter a problem that install_semisync.inc doesn't exist. It exists
in mysql-5.6.12. I will upload it too.

I get release build through "cmake . -DBUILD_CONFIG=mysql_release". 
I get debug build through "./BUILD/compile-pentium-debug-max-no-ndb
[4 Jun 2013 4:27] Libing Song
copy it into mysql-test/include

Attachment: install_semisync.inc (application/octet-stream, text), 1.08 KiB.

[4 Jun 2013 4:30] Libing Song
Hi Yoshinori,
As I could not repeat the bug on release versions. Could you please try to reproduce the bug by using the script?
[4 Jun 2013 5:11] Yoshinori Matsunobu
Libing,

I could repeat the problem by increasing rpl_semi_sync_master_timeout. Default is 10000 (10 seconds). I increased this value since I didn't want to semisync to be disabled. I could get better throughput with default rpl_semi_sync_master_timeout, but at that time semisync was disabled.

# diff -up include/install_semisync.inc.orig include/install_semisync.inc
--- include/install_semisync.inc.orig
+++ include/install_semisync.inc
@@ -17,6 +17,7 @@ if ($value == No such row)
     SET sql_log_bin = 0;
     eval INSTALL PLUGIN rpl_semi_sync_master SONAME '$SEMISYNC_MASTER_PLUGIN';
     SET GLOBAL rpl_semi_sync_master_enabled = 1;
+    SET GLOBAL rpl_semi_sync_master_timeout = 100000000;
     SET sql_log_bin = 1;
 }
 --enable_query_log
[4 Jun 2013 11:51] Libing Song
Hi Yoshinori,

Thanks for your valuable feedback. I verified the bug too.
[7 Jun 2013 0:57] Libing Song
patch for the bug

Attachment: libing-bug69341.diff (text/x-patch), 1.71 KiB.

[7 Jun 2013 0:59] Libing Song
Hi Yoshinori, All,
there is a tentative patch uploaded to this bug report that fixes the reported issue. Feel free to test it. We welcome your feedback on the patch. Thanks!
[5 Jul 2013 10:13] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

Fixed in 5.6+. Documented in the 5.6.13 and 5.7.2 changelogs as follows:

        When rpl_semi_sync_master_timeout was set to an extremely large
        value, semi-synchronous replication became very slow, especially
        when many sessions were working in parallel. It was discovered
        that the code to calculate this timeout was inside the wait loop
        itself, with the result that an increase in the value of
        rpl_semi_sync_master_timeout caused repeated iterations. This
        fix improves the method used to calculate wakeup times, and
        moves it outside of the wait loop, so that it is executed one
        time only.

Closed.
[7 Apr 2016 5:46] Umesh Shastry
Bug #80651 marked as duplicate of this