Bug #104012 FLUSH BINARY LOGS blocks when a client is waiting for semi-sync ACK.
Submitted: 14 Jun 2021 22:38 Modified: 16 Jun 2021 7:50
Reporter: Jean-François Gagné Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:8.0.25, 5.7.34 OS:Any
Assigned to: CPU Architecture:Any

[14 Jun 2021 22:38] Jean-François Gagné
Description:
Hi,

running FLUSH BINARY LOGS while a session is "Waiting for semi-sync ACK from slave" does not return.  And while FLUSH BINARY LOGS is blocked, running SHOW MASTER STATUS also blocks.  This could result in a pile-up which will read max connections.  See "How to repeat" for details.

Note that killing a blocked FLUSH BINARY LOGS or a blocked SHOW MASTER STATUS does not free a session.

The workaround for this is to disable semi-sync.

I am classifying this as S2 (serious) as this can result in reaching max connections, which will make it very hard to apply the work-around.

Many thanks for looking into that,

Jean-François Gagné

How to repeat:
Note that the instructions below are for 8.0.25, but the same will happen for 5.7.34.

# Create a replication sandbox:
dbdeployer deploy replication mysql_8.0.25 --gtid --semi-sync

# Make the semi-sync timeout very large:
./m <<< "set global rpl_semi_sync_master_timeout = $((24*60*60*100))"

# Create a test schema and table:
./m <<< "create database test_jfg; create table test_jfg.t(id int not null auto_increment primary key, v int)"

# Stop replication on replicas for commit to block:
sql="stop slave"; ./s1 <<< "$sql"; ./s2 <<< "$sql"

# Run a transaction that will block (done in the background to not block the shell):
./m <<< "insert into test_jfg.t(v) values(0)" &

# Show that the transaction is blocked (the two phantom replica connections are because of Bug#103189):
./m <<< "show processlist"
Id      User    Host    db      Command Time    State   Info
5       event_scheduler localhost       NULL    Daemon  69      Waiting on empty queue  NULL
13      rsandbox        localhost:38690 NULL    Binlog Dump GTID        54      Master has sent all binlog to slave; waiting for more updates   NULL
14      rsandbox        localhost:38692 NULL    Binlog Dump GTID        54      Master has sent all binlog to slave; waiting for more updates   NULL
17      msandbox        localhost       NULL    Query   5       Waiting for semi-sync ACK from slave    insert into test_jfg.t(v) values(0)
18      msandbox        localhost       NULL    Query   0       init    show processlist

# Here, SHOW MASTER STATUS works:
./m <<< "show master status"
File    Position        Binlog_Do_DB    Binlog_Ignore_DB        Executed_Gtid_Set
mysql-bin.000001        9041                    00021526-1111-1111-1111-111111111111:1-35

# But FLUSH BINARY LOGS blocks (done in the background to not block the shell):
./m <<< "flush binary logs" &

# And we confirm that things are blocked by listing sessions:
./m <<< "show processlist"
Id      User    Host    db      Command Time    State   Info
5       event_scheduler localhost       NULL    Daemon  212     Waiting on empty queue  NULL
17      msandbox        localhost       NULL    Query   148     Waiting for semi-sync ACK from slave    insert into test_jfg.t(v) values(0)
21      msandbox        localhost       NULL    Query   45      starting        flush binary logs
22      msandbox        localhost       NULL    Query   0       init    show processlist

# And now, SHOW MASTER STATUS also blocks (done in the background to not block the shell):
./m <<< "show master status" &

# And we confirm that things are blocked by listing sessions:
./m <<< "show processlist"
Id      User    Host    db      Command Time    State   Info
5       event_scheduler localhost       NULL    Daemon  259     Waiting on empty queue  NULL
17      msandbox        localhost       NULL    Query   195     Waiting for semi-sync ACK from slave    insert into test_jfg.t(v) values(0)
21      msandbox        localhost       NULL    Query   92      starting        flush binary logs
23      msandbox        localhost       NULL    Query   30      starting        show master status
24      msandbox        localhost       NULL    Query   0       init    show processlist

# Killing the FLUSH BINARY LOGS only releases the mysql client, not the session:
./m <<< "kill 21"; sleep 5; ./m <<< "show processlist"
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
[2]-  Exit 1                  ./m <<< "flush binary logs"
Id      User    Host    db      Command Time    State   Info
5       event_scheduler localhost       NULL    Daemon  342     Waiting on empty queue  NULL
17      msandbox        localhost       NULL    Query   278     Waiting for semi-sync ACK from slave    insert into test_jfg.t(v) values(0)
21      msandbox        localhost       NULL    Killed  175     starting        flush binary logs
23      msandbox        localhost       NULL    Query   113     starting        show master status
26      msandbox        localhost       NULL    Query   0       init    show processlist

# Same for SHOW MASTER STATUS:
./m <<< "kill 23"; sleep 5; ./m <<< "show processlist"
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
[3]+  Exit 1                  ./m <<< "show master status"
Id      User    Host    db      Command Time    State   Info
5       event_scheduler localhost       NULL    Daemon  387     Waiting on empty queue  NULL
17      msandbox        localhost       NULL    Query   323     Waiting for semi-sync ACK from slave    insert into test_jfg.t(v) values(0)
21      msandbox        localhost       NULL    Killed  220     starting        flush binary logs
23      msandbox        localhost       NULL    Killed  158     starting        show master status
28      msandbox        localhost       NULL    Query   0       init    show processlist

# A way to unblock things is to disable semi-sync:
./m <<< "set global rpl_semi_sync_master_enabled = off"; sleep 5; ./m <<< "show processlist"
[1]+  Done                    ./m <<< "insert into test_jfg.t(v) values(0)"
Id      User    Host    db      Command Time    State   Info
5       event_scheduler localhost       NULL    Daemon  456     Waiting on empty queue  NULL
30      msandbox        localhost       NULL    Query   0       init    show processlist
[15 Jun 2021 13:50] Jean-François Gagné
Related: Bug#104015 is made worse because of this bug.
[16 Jun 2021 7:50] MySQL Verification Team
Hello Jean-François,

Thank you for the report and test case.
Verified as described with 8.0.25 build.

regards,
Umesh