Description:
MySQL Router currently doesn't check for Secondary replication lag when deployed with an InnoDB ReplicaSet. Router will route requests from the read port to a stale replica if it is available. That's unfortunate, as it means clients should expect to read arbitrarily stale data, sometimes hours old.
Router currently provides a few routing strategies, none of which takes replication lag into account: https://dev.mysql.com/doc/mysql-router/8.0/en/mysql-router-conf-options.html#option_mysqlr...
MySQL Shell's ReplicaSet status() command will show replication lag if it's present, so Shell is aware of the lag.
There also doesn't seem to be an option to manually fence off a Secondary that is available but delayed, short of blocking network access between the node and the Router. Since Router is deployed with a ReplicaSet, the metadata-cache://<cluster> destination provides the list of nodes, so editing the Router configuration by hand is not an option, as it would be for a basic routing scenario.
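For reference, the read-port route generated by bootstrap looks roughly like this (exact section and cluster names depend on the deployment); every available routing_strategy value picks destinations without looking at replication state:
[routing:test_ro]
bind_address=0.0.0.0
bind_port=6447
destinations=metadata-cache://test/?role=SECONDARY
routing_strategy=round-robin-with-fallback
protocol=classic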
How to repeat:
Set up a basic replicaset environment.
dba.deploySandboxInstance(3410)
dba.deploySandboxInstance(3420)
dba.deploySandboxInstance(3430)
\c root@localhost:3410
rs = dba.createReplicaSet('test')
rs.addInstance('root@localhost:3420')
rs.addInstance('root@localhost:3430')
Bootstrap and start a router.
$ mysqlrouter --bootstrap root@localhost:3410 --directory router3410
$ ./router3410/start.sh
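Create the schema and account that the sysbench commands below expect (sbuser / sysbench); for example through the Router RW port (6446 by default):
$ mysql -h127.0.0.1 -P6446 -uroot -p
CREATE DATABASE sysbench;
CREATE USER 'sbuser'@'%' IDENTIFIED BY 'password';
-- depending on how sysbench was built, mysql_native_password may be required instead
GRANT ALL ON sysbench.* TO 'sbuser'@'%';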
Create a heavy enough write workload on the Primary, through the Router RW port (6446 by default).
$ sysbench /usr/share/sysbench/oltp_read_write.lua --db-driver=mysql --auto_inc=off \
--tables=4 --table_size=100000 --threads=4 --report-interval=1 \
--mysql-host=127.0.0.1 --mysql-port=6446 --mysql-user=sbuser \
--mysql-password='password' --mysql-db=sysbench \
prepare
$ sysbench /usr/share/sysbench/oltp_read_write.lua --db-driver=mysql --db-ps-mode=disable \
--skip_trx=on --mysql-user=sbuser --mysql-password='password' --mysql-db=sysbench \
--mysql-host=127.0.0.1 --mysql-port=6446 --mysql-ignore-errors=all --tables=4 \
--table_size=100000 --report-interval=1 --threads=4 --time=0 --events=0 --rate=20 \
run | grep tps
Observe the Secondaries through the Router RO port (6447 by default).
$ cat query-sleep.sh
#!/bin/bash
while (true); do
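# query the RO port (6447) twice per iteration so that round-robin routing samples both Secondaries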
echo -n "$(date +%F\ %H:%M:%S.%N): " && mysql -h127.0.0.1 -P6447 -uroot -BNe "SELECT @@report_port, @@GLOBAL.gtid_executed;"
echo -n "$(date +%F\ %H:%M:%S.%N): " && mysql -h127.0.0.1 -P6447 -uroot -BNe "SELECT @@report_port, @@GLOBAL.gtid_executed;"
sleep 5
done
$ ./query-sleep.sh
...
2020-06-15 09:02:55.051426655: 3430 750eee2e-aef3-11ea-9e94-90b11c553129:1-181778
2020-06-15 09:02:55.084802793: 3410 750eee2e-aef3-11ea-9e94-90b11c553129:1-181778
...
Shut down one of the Secondaries.
Wait for a few minutes.
Start that Secondary back up.
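For sandbox instances the stop/start can be done from MySQL Shell (it may prompt for the root password); in the run below the lagging Secondary is the one on port 3410:
dba.stopSandboxInstance(3410)
// wait a few minutes while the write workload keeps running
dba.startSandboxInstance(3410)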
Observe clients hitting a now-stale Secondary.
$ ./query-sleep.sh
...
2020-06-15 09:14:42.710042718: 3430 750eee2e-aef3-11ea-9e94-90b11c553129:1-226173
2020-06-15 09:14:42.735640228: 3430 750eee2e-aef3-11ea-9e94-90b11c553129:1-226173
2020-06-15 09:14:47.765533176: 3410 750eee2e-aef3-11ea-9e94-90b11c553129:1-183625 <<<
2020-06-15 09:14:47.815480555: 3430 750eee2e-aef3-11ea-9e94-90b11c553129:1-226574
2020-06-15 09:14:52.852841012: 3410 750eee2e-aef3-11ea-9e94-90b11c553129:1-186512 <<<
2020-06-15 09:14:52.884792873: 3430 750eee2e-aef3-11ea-9e94-90b11c553129:1-226955
...
Check ReplicaSet status() output to further confirm the node is lagging.
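For example, from MySQL Shell connected to any ONLINE member (the extended option may be needed to see the applier queue details):
\c root@localhost:3420
rs = dba.getReplicaSet()
rs.status({extended: 1})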
"127.0.0.1:3410": {
"address": "127.0.0.1:3410",
"instanceRole": "SECONDARY",
"mode": "R/O",
"replication": {
"applierQueuedTransactionSetSize": 27925,
"applierStatus": "APPLYING",
"applierThreadState": "waiting for handler commit",
"receiverStatus": "ON",
"receiverThreadState": "Waiting for master to send event",
"replicationLag": "00:07:27.713819"
},
"status": "ONLINE"
}
Suggested fix:
Add a new routing strategy which would allow treating a node lagging by more than N seconds as unreachable.
and/or
Provide a manual way to prevent Router from routing connections to a specific node when metadata-cache is used as the node list provider. This fence could then be used to prevent reads from stale nodes.
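As a sketch of the first suggestion, the read-port routing section could take a lag threshold; the max_replication_lag option below is hypothetical and does not exist in Router today:
[routing:test_ro]
bind_port=6447
destinations=metadata-cache://test/?role=SECONDARY
routing_strategy=round-robin-with-fallback
# hypothetical option: treat a Secondary lagging by more than 30 seconds as unreachable
max_replication_lag=30
The manual fence from the second suggestion could likewise be a per-instance attribute stored in the metadata and honored by Router's metadata cache, so it would apply to every Router and survive restarts.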