Bug #104491 Contribution by Facebook: Fix race between binlog sender heartbeat timeout ...
Submitted: 30 Jul 2021 22:12 Modified: 18 Jul 2022 16:19
Reporter: FBContrib Admin
Status: Closed
Category: MySQL Server: Replication Severity: S3 (Non-critical)
Version: 8.0.20 OS: Any
Assigned to: CPU Architecture: Any

[30 Jul 2021 22:12] FBContrib Admin
Description:
Background information provided by Facebook:
Abstract:

This change reintroduces previously removed code that handles a race between
binlog updates and the binlog sender thread when it runs with heartbeats
enabled. The race occurs when the sender's wait for updates exits with a
timeout and misses the update signal: the sender does not check whether the
binlog was updated and loops back to wait again.
Commit that removed the signal count check:
https://github.com/mysql/mysql-server/commit/ced9292eb87b061fb7b8ac2190f01e9dec18f3d7 
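
For illustration only, here is a minimal sketch of the signal count idea, not the
contributed patch and not the server's actual code. The class UpdateSignal and its
members notify_update, signal_count and wait_for_update are hypothetical names. The
sender remembers the counter value it last saw and compares it again after the
timed wait, so an update is noticed even if the wakeup itself was missed.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the binlog update notification state.
class UpdateSignal {
 public:
  // Writer side: record the update under the lock, then wake any waiter.
  void notify_update() {
    std::lock_guard<std::mutex> guard(mutex_);
    ++signal_count_;
    cond_.notify_all();
  }

  // Current counter value; the sender stores this before it waits.
  std::uint64_t signal_count() {
    std::lock_guard<std::mutex> guard(mutex_);
    return signal_count_;
  }

  // Sender side: wait for at most one heartbeat period.
  // Returns true if the counter moved past seen_count (an update happened),
  // false if the period elapsed with no update (send a heartbeat, wait again).
  // The counter is compared when the wait starts and again when it ends, so
  // an update whose wakeup was missed is still detected.
  bool wait_for_update(std::uint64_t seen_count,
                       std::chrono::milliseconds heartbeat_period) {
    std::unique_lock<std::mutex> lock(mutex_);
    return cond_.wait_for(lock, heartbeat_period, [this, seen_count] {
      return signal_count_ != seen_count;
    });
  }

 private:
  std::mutex mutex_;
  std::condition_variable cond_;
  std::uint64_t signal_count_ = 0;
};
```

In this sketch a sender loop would call signal_count() once, then repeatedly call
wait_for_update() with that value, sending a heartbeat each time it returns false
and reading new events when it returns true.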

Use case:
Fixes a bug where the primary server stops sending binlog entries to the secondary.

Repo: https://github.com/mysql/mysql-server 

Patch on top of 8.0.20: https://github.com/mysql/mysql-server/commit/7d10c82196c 

How to repeat:
See description

Suggested fix:
See contribution code attached
[30 Jul 2021 22:12] FBContrib Admin
Fix race between binlog sender heartbeat timeout and signal 
(*) This code is contributed under the Facebook agreement

Contribution: fb_patch_254.txt (text/plain), 6.14 KiB.

[18 Jul 2022 16:19] Jon Stephens
Documented fix as follows in the MySQL 8.0.31 changelog:

    When the binary log sender thread waited for updates with
    heartbeats enabled, it sometimes missed update signals, so that
    changes were not replicated until the next signal was issued and
    noticed by the thread.

    Our thanks to Facebook for the contribution.

Closed.