Description:
When running a large transaction (single insert > 250MB) on an InnoDB group replication cluster with 3 nodes (virtual maschines on KVM) the replication to secondary nodes is stalled if hitting network bandwidth limits (here 1 Gbit/s).
The large transaction is pending infinite for completion, while primary and secondary nodes show consistent high network bandwidth.
It seems the secondary nodes are trying to replicate the problematic transaction over and over again.
Further transactions are waiting for the pending large transactions to complete.
The large transaction and high bandwidth usage can only be cancelled by stopping the primary node.
While the replication problem exists, the group replication plugin sometimes reports connection problems between the nodes in /var/log/mysql/error.log:
2021-08-17T10:04:07.271338Z 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address server-1:3306 has become unreachable.'
2021-08-17T10:04:09.627989Z 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address server-1:3306 is reachable again.'
2021-08-17T10:04:10.118516Z 0 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Shutting down an outgoing connection. This happens because something might be wrong on a bi-directional connection to node server-1:33061. Please check the connection status to this member'
2021-08-17T10:04:10.119153Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Failure reading from fd=-1 n=18446744073709551615'
2021-08-17T10:04:47.375698Z 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address server-1:3306 has become unreachable.'
2021-08-17T10:04:48.257415Z 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address server-1:3306 is reachable again.'
2021-08-17T10:04:55.623104Z 0 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Shutting down an outgoing connection. This happens because something might be wrong on a bi-directional connection to node server-1:33061. Please check the connection status to this member'
All nodes remain in the cluster as they are only unreachable for a short time (based on state from performance_schema.replication_group_members).
Following adjustments did not solve the problem:
* Increasing group_replication_set_write_concurrency to maximum
* Increasing group_replication_poll_spin_loops to maximum
* Disabling compression by group_replication_compression_threshold=0
* Reducing group_replication_communication_max_message_size to 1 MB
* Increasing group_replication_member_expel_timeout to 300 seconds
Connecting the mysql nodes with a 10 Gbit/s connection or running the virtual machines with mysql on the same KVM host solved the replication problem.
But adding software-based network bandwidth limits for the mysql nodes leads to the same problem, so I expect the problem could also occour on a 10 Gbit network if bandwidth problems exist there.
Based on https://dev.mysql.com/doc/refman/8.0/en/group-replication-limitations.html is there a requirement that a complete transaction has to be replicated within the noted 5-second window or does this only apply for single message fragments?
"If an individual transaction results in message contents which are large enough that the message cannot be copied between group members over the network within a 5-second window, members can be suspected of having failed, and then expelled, just because they are busy processing the transaction."
Is there an option to ensure stable replication in case of bandwidth limititation issues?
How to repeat:
Set up a new InnoDB group replication cluster with 2 or 3 nodes.
Create a database and table for testing:
$ mysql -uroot -proot -e 'create database grtest'
$ mysql -uroot -proot grtest -e 'create table test (id bigint auto_increment, data longblob, primary key (id))'
Generate a small transaction and verify inserted data:
$ mysql -uroot -proot grtest < <(echo "insert into test (data) values ('$(dd if=/dev/urandom iflag=fullblock bs=1M count=1 | base64 | tr -d '\n')')")
$ mysql -B -uroot -proot grtest -e 'select id, length(data) from test'
id length(data)
1 1398104
Connect the mysql nodes with a 1 GBit connection, alternatively add software-based bandwidth limitations, examples:
* using OS tools on mysql nodes: tc qdisc add dev $nic root tbf rate 100mbit limit 10mb burst 20kb
* using virsh on a KVM host: domiftune $domain $domain_interface --live --outbound 10000,10000,10000
Create a large transaction of 400MB:
$ echo "insert into test (data) values ('$(dd if=/dev/urandom iflag=fullblock bs=1M count=300 | base64 | tr -d '\n')');" > sql
$ mysql -uroot -proot grtest < sql
Import will never complete unless bandwidth limitations/issues are removed.
Remove network bandwidth limitation and wait for pending transaction to complete or insert again:
$ echo "insert into test (data) values ('$(dd if=/dev/urandom iflag=fullblock bs=1M count=300 | base64 | tr -d '\n')');" > sql
$ mysql -uroot -proot grtest < sql
$ mysql -B -uroot -proot grtest -e 'select id, length(data) from test'
id length(data)
1 1398104
2 419430400