Description:
Hi,
running FLUSH BINARY LOGS while a session is "Waiting for semi-sync ACK from slave" does not return. And while FLUSH BINARY LOGS is blocked, running SHOW MASTER STATUS also blocks. This could result in a pile-up which will read max connections. See "How to repeat" for details.
Note that killing a blocked FLUSH BINARY LOGS or a blocked SHOW MASTER STATUS does not free a session.
The workaround for this is to disable semi-sync.
I am classifying this as S2 (serious) as this can result in reaching max connections, which will make it very hard to apply the work-around.
Many thanks for looking into that,
Jean-François Gagné
How to repeat:
Note that the instructions below are for 8.0.25, but the same will happen for 5.7.34.
# Create a replication sandbox:
dbdeployer deploy replication mysql_8.0.25 --gtid --semi-sync
# Make the semi-sync timeout very large:
./m <<< "set global rpl_semi_sync_master_timeout = $((24*60*60*100))"
# Create a test schema and table:
./m <<< "create database test_jfg; create table test_jfg.t(id int not null auto_increment primary key, v int)"
# Stop replication on replicas for commit to block:
sql="stop slave"; ./s1 <<< "$sql"; ./s2 <<< "$sql"
# Run a transaction that will block (done in the background to not block the shell):
./m <<< "insert into test_jfg.t(v) values(0)" &
# Show that the transaction is blocked (the two phantom replica connections are because of Bug#103189):
./m <<< "show processlist"
Id User Host db Command Time State Info
5 event_scheduler localhost NULL Daemon 69 Waiting on empty queue NULL
13 rsandbox localhost:38690 NULL Binlog Dump GTID 54 Master has sent all binlog to slave; waiting for more updates NULL
14 rsandbox localhost:38692 NULL Binlog Dump GTID 54 Master has sent all binlog to slave; waiting for more updates NULL
17 msandbox localhost NULL Query 5 Waiting for semi-sync ACK from slave insert into test_jfg.t(v) values(0)
18 msandbox localhost NULL Query 0 init show processlist
# Here, SHOW MASTER STATUS works:
./m <<< "show master status"
File Position Binlog_Do_DB Binlog_Ignore_DB Executed_Gtid_Set
mysql-bin.000001 9041 00021526-1111-1111-1111-111111111111:1-35
# But FLUSH BINARY LOGS blocks (done in the background to not block the shell):
./m <<< "flush binary logs" &
# And we confirm that things are blocked by listing sessions:
./m <<< "show processlist"
Id User Host db Command Time State Info
5 event_scheduler localhost NULL Daemon 212 Waiting on empty queue NULL
17 msandbox localhost NULL Query 148 Waiting for semi-sync ACK from slave insert into test_jfg.t(v) values(0)
21 msandbox localhost NULL Query 45 starting flush binary logs
22 msandbox localhost NULL Query 0 init show processlist
# And now, SHOW MASTER STATUS also blocks (done in the background to not block the shell):
./m <<< "show master status" &
# And we confirm that things are blocked by listing sessions:
./m <<< "show processlist"
Id User Host db Command Time State Info
5 event_scheduler localhost NULL Daemon 259 Waiting on empty queue NULL
17 msandbox localhost NULL Query 195 Waiting for semi-sync ACK from slave insert into test_jfg.t(v) values(0)
21 msandbox localhost NULL Query 92 starting flush binary logs
23 msandbox localhost NULL Query 30 starting show master status
24 msandbox localhost NULL Query 0 init show processlist
# Killing the FLUSH BINARY LOGS only releases the mysql client, not the session:
./m <<< "kill 21"; sleep 5; ./m <<< "show processlist"
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
[2]- Exit 1 ./m <<< "flush binary logs"
Id User Host db Command Time State Info
5 event_scheduler localhost NULL Daemon 342 Waiting on empty queue NULL
17 msandbox localhost NULL Query 278 Waiting for semi-sync ACK from slave insert into test_jfg.t(v) values(0)
21 msandbox localhost NULL Killed 175 starting flush binary logs
23 msandbox localhost NULL Query 113 starting show master status
26 msandbox localhost NULL Query 0 init show processlist
# Same for SHOW MASTER STATUS:
./m <<< "kill 23"; sleep 5; ./m <<< "show processlist"
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
[3]+ Exit 1 ./m <<< "show master status"
Id User Host db Command Time State Info
5 event_scheduler localhost NULL Daemon 387 Waiting on empty queue NULL
17 msandbox localhost NULL Query 323 Waiting for semi-sync ACK from slave insert into test_jfg.t(v) values(0)
21 msandbox localhost NULL Killed 220 starting flush binary logs
23 msandbox localhost NULL Killed 158 starting show master status
28 msandbox localhost NULL Query 0 init show processlist
# A way to unblock things is to disable semi-sync:
./m <<< "set global rpl_semi_sync_master_enabled = off"; sleep 5; ./m <<< "show processlist"
[1]+ Done ./m <<< "insert into test_jfg.t(v) values(0)"
Id User Host db Command Time State Info
5 event_scheduler localhost NULL Daemon 456 Waiting on empty queue NULL
30 msandbox localhost NULL Query 0 init show processlist