Description:
When a node joins a Group Replication cluster, the recovery phase can drastically degrade the performance of the whole cluster.
This seems to be caused by flow control.
How to repeat:
Steps to reproduce the issue:
* start a cluster with 4 nodes, using a single writer
* run sysbench for a write intensive workload, for example:
./sysbench --num-threads=16 --max-time=3600 --max-requests=0 --test=./lua/oltp_update_index.lua --report-interval=1 ... run
* after sysbench has been running for several minutes, run STOP GROUP_REPLICATION on one of the nodes (not the writer)
* wait several minutes, then run START GROUP_REPLICATION on the same node where GR was stopped
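While reproducing, the state of the rejoining member can be confirmed from any node via performance_schema (the throughput drop coincides with the member sitting in RECOVERING):

```sql
-- Run on any cluster member; the rejoined node should show
-- MEMBER_STATE = 'RECOVERING' for the duration of the slowdown.
SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE
FROM performance_schema.replication_group_members;
```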
Result:
Within a few seconds, the throughput reported by sysbench drops drastically and stays low until the node is back in sync.
Below is an example:
$ ./sysbench --num-threads=16 --max-time=300 --max-requests=0 --test=./lua/oltp_update_index.lua --mysql-user=sbtest --mysql-password=sbtest --mysql-host=10.1.2.42 --mysql-port=5717 --oltp-table-size=10000000 --oltp-tables-count=8 --report-interval=3 --oltp-auto-inc=off run
sysbench 1.0 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 16
Report intermediate results every 3 second(s)
Initializing random number generator from current time
Initializing worker threads...
Threads started!
[ 3s] threads: 16 tps: 11366.46 qps: 11366.46 (r/w/o: 0.00/11366.46/0.00) latency: 1.67ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 6s] threads: 16 tps: 11391.58 qps: 11391.58 (r/w/o: 0.00/11391.58/0.00) latency: 1.70ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 9s] threads: 16 tps: 10960.68 qps: 10960.68 (r/w/o: 0.00/10960.68/0.00) latency: 1.70ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 12s] threads: 16 tps: 11365.37 qps: 11365.37 (r/w/o: 0.00/11365.37/0.00) latency: 1.67ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 15s] threads: 16 tps: 11058.31 qps: 11058.31 (r/w/o: 0.00/11058.31/0.00) latency: 1.67ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 18s] threads: 16 tps: 11441.39 qps: 11441.39 (r/w/o: 0.00/11441.39/0.00) latency: 1.64ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 21s] threads: 16 tps: 11300.63 qps: 11300.63 (r/w/o: 0.00/11300.63/0.00) latency: 1.67ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 24s] threads: 16 tps: 11370.56 qps: 11370.56 (r/w/o: 0.00/11370.56/0.00) latency: 1.64ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 27s] threads: 16 tps: 11346.35 qps: 11346.35 (r/w/o: 0.00/11346.35/0.00) latency: 1.67ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 30s] threads: 16 tps: 11055.72 qps: 11055.72 (r/w/o: 0.00/11055.72/0.00) latency: 1.70ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 33s] threads: 16 tps: 11365.68 qps: 11365.68 (r/w/o: 0.00/11365.68/0.00) latency: 1.67ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 36s] threads: 16 tps: 10576.29 qps: 10576.29 (r/w/o: 0.00/10576.29/0.00) latency: 1.64ms (95%) errors/s: 0.00 reconnects/s: 0.00 <== here START GROUP_REPLICATION is executed
[ 39s] threads: 16 tps: 4815.00 qps: 4815.00 (r/w/o: 0.00/4815.00/0.00) latency: 2.39ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 42s] threads: 16 tps: 10383.06 qps: 10383.06 (r/w/o: 0.00/10383.06/0.00) latency: 1.82ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 45s] threads: 16 tps: 9340.62 qps: 9340.62 (r/w/o: 0.00/9340.62/0.00) latency: 1.82ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 48s] threads: 16 tps: 6893.66 qps: 6893.66 (r/w/o: 0.00/6893.66/0.00) latency: 1.86ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 51s] threads: 16 tps: 4464.33 qps: 4464.33 (r/w/o: 0.00/4464.33/0.00) latency: 1.86ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 54s] threads: 16 tps: 2296.67 qps: 2296.67 (r/w/o: 0.00/2296.67/0.00) latency: 1.93ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 57s] threads: 16 tps: 1673.35 qps: 1673.35 (r/w/o: 0.00/1673.35/0.00) latency: 1.89ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 60s] threads: 16 tps: 1229.99 qps: 1229.99 (r/w/o: 0.00/1229.99/0.00) latency: 2.03ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 63s] threads: 16 tps: 1125.00 qps: 1125.00 (r/w/o: 0.00/1125.00/0.00) latency: 2.00ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 66s] threads: 16 tps: 1125.00 qps: 1125.00 (r/w/o: 0.00/1125.00/0.00) latency: 2.07ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 69s] threads: 16 tps: 1125.00 qps: 1125.00 (r/w/o: 0.00/1125.00/0.00) latency: 2.07ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 72s] threads: 16 tps: 1124.99 qps: 1124.99 (r/w/o: 0.00/1124.99/0.00) latency: 2.66ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 75s] threads: 16 tps: 1125.00 qps: 1125.00 (r/w/o: 0.00/1125.00/0.00) latency: 2.52ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 78s] threads: 16 tps: 1125.00 qps: 1125.00 (r/w/o: 0.00/1125.00/0.00) latency: 2.00ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 81s] threads: 16 tps: 1125.00 qps: 1125.00 (r/w/o: 0.00/1125.00/0.00) latency: 2.03ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 84s] threads: 16 tps: 1125.01 qps: 1125.01 (r/w/o: 0.00/1125.01/0.00) latency: 2.11ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 87s] threads: 16 tps: 1124.99 qps: 1124.99 (r/w/o: 0.00/1124.99/0.00) latency: 2.00ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 90s] threads: 16 tps: 1125.01 qps: 1125.01 (r/w/o: 0.00/1125.01/0.00) latency: 2.18ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 93s] threads: 16 tps: 1124.99 qps: 1124.99 (r/w/o: 0.00/1124.99/0.00) latency: 2.35ms (95%) errors/s: 0.00 reconnects/s: 0.00
[ 96s] threads: 16 tps: 1125.00 qps: 1125.00 (r/w/o: 0.00/1125.00/0.00) latency: 2.11ms (95%) errors/s: 0.00 reconnects/s: 0.00
Note:
Running the following on the writer node prevents the performance drop:
SET GLOBAL group_replication_flow_control_mode=disabled;
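For reference, the flow control settings in effect during the test can be inspected with:

```sql
-- Check the current flow control configuration on the writer.
SHOW VARIABLES LIKE 'group_replication_flow_control%';
```

Disabling flow control entirely is only a workaround; it also removes throttling for genuinely overloaded ONLINE members.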
Suggested fix:
A node that is in MEMBER_STATE=RECOVERING should not be evaluated by the flow control algorithm.
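To make the proposal concrete, here is a minimal sketch, in Python rather than the plugin's actual C++, of a quota calculation that skips RECOVERING members. The function name, member representation, and quota formula are all hypothetical; only the idea (exclude RECOVERING members from throttling decisions) comes from the suggested fix.

```python
# Illustrative sketch, NOT MySQL source code: a flow control quota
# that only considers ONLINE members' applier queues, so a member
# catching up in RECOVERING state cannot throttle the writer.

def flow_control_quota(members, threshold):
    """Return a write quota (txns/period) or None for 'no throttling'.

    members: list of dicts with 'state' and 'queue_size' keys.
    threshold: applier queue size above which throttling starts.
    """
    # RECOVERING members are expected to lag while they catch up,
    # so they are excluded from the throttling decision.
    eligible = [m for m in members if m["state"] == "ONLINE"]
    if not eligible:
        return None  # no basis for throttling
    slowest = max(m["queue_size"] for m in eligible)
    if slowest <= threshold:
        return None  # all ONLINE members are keeping up
    # Shrink the quota proportionally to how far the slowest ONLINE
    # member's queue exceeds the threshold (hypothetical policy).
    return max(1, int(threshold / slowest * 1000))

members = [
    {"state": "ONLINE", "queue_size": 10},
    {"state": "ONLINE", "queue_size": 20},
    {"state": "RECOVERING", "queue_size": 50000},  # huge backlog, ignored
]
print(flow_control_quota(members, threshold=25000))  # -> None (no throttling)
```

With the current behavior, the RECOVERING member's 50000-entry backlog would drag the quota down for the whole cluster, which matches the throughput collapse seen in the sysbench output above.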