Bug #113598 Please consider adding GTID in "Timeout waiting for reply".
Submitted: 9 Jan 20:33 Modified: 10 Jan 6:01
Reporter: Jean-François Gagné Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S4 (Feature request)
Version:8.2.0, 8.0.35 OS:Any
Assigned to: CPU Architecture:Any

[9 Jan 20:33] Jean-François Gagné
Description:
Hi,

when semi-sync times-out, a line like below can be found in the error logs.  As you can see, it mentions the file and position of the transaction that timed-out, but not the GTID.  Having the GTID would be useful.  One use-case is to be able to check if this GTID reached a replica after a failover.

2024-01-09T20:08:15.669362Z 16 [Warning] [MY-011153] [Repl] Timeout waiting for reply of binlog (file: mysql-bin.000001, pos: 8528), semi-sync up to file , position 4.

How to repeat is done with 8.2.0, but it also applies to 8.0.35.

Note that I see that this change is not super straightforward, as the place in the code where this log is produced ([1] and [2]) does not have direct access to the GTID of the transaction, but I think this should be overcome to have more useful logs.

[1]: https://github.com/mysql/mysql-server/blob/mysql-8.2.0/plugin/semisync/semisync_source.cc#...

[2]: https://github.com/mysql/mysql-server/blob/mysql-8.2.0/share/messages_to_error_log.txt#L35...

Many thanks for looking into this,

Jean-Françcois Gagné

How to repeat:
dbdeployer deploy replication mysql_8.2.0 --gtid --semi-sync

./m <<< "set global rpl_semi_sync_master_timeout = 1000"

./s1 <<< "STOP REPLICA"
./s2 <<< "STOP REPLICA"

# In below, I am using SHOW MASTER STATUS
#   because SHOW BINARY LOG STATUS is not available in 8.0.35
#   and I want the test to also work in 8.0.35.
date; time ./m <<< "
  show master status\G
  create database test_jfg; 
  show master status\G
"

Tue Jan  9 20:08:14 UTC 2024
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 8331
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 00019201-1111-1111-1111-111111111111:1-33
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 8528
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 00019201-1111-1111-1111-111111111111:1-34

real    0m1.047s
user    0m0.015s
sys     0m0.000s

grep "Timeout waiting for reply of binlog" master/data/msandbox.err

2024-01-09T20:08:15.669362Z 16 [Warning] [MY-011153] [Repl] Timeout waiting for reply of binlog (file: mysql-bin.000001, pos: 8528), semi-sync up to file , position 4.

Suggested fix:
I would suggest the log line to look like below:

2024-01-09T20:08:15.669362Z 16 [Warning] [MY-011153] [Repl] Timeout waiting for reply of binlog (file: mysql-bin.000001, pos: 8528, gtid: 00019201-1111-1111-1111-111111111111:34), semi-sync up to file , position 4.

If GTIDs are not enabled, we could put ANONYMOUS instead of the GTID:

2024-01-09T20:08:15.669362Z 16 [Warning] [MY-011153] [Repl] Timeout waiting for reply of binlog (file: mysql-bin.000001, pos: 8528, gtid: ANONYMOUS), semi-sync up to file , position 4.

To allow a smooth transition (avoid breaking change), this could be controlled with a new global variable (suggested name rpl_semi_sync_master_timeout_log_gtid) with possible values ON or OFF.  The default would be OFF in 8.0.x, and ON in the next major release.  Additionally, this variable could be deprecated in the next major release.
[9 Jan 20:34] Jean-François Gagné
Changing from "S3 (Non-critical)" to "S4 (Feature request)"; sorry for the mistake.
[10 Jan 6:01] MySQL Verification Team
Hello Jean-François,

Thank you for the feature request!

regards,
Umesh