Bug #101948 heartbeat is not compatible with local info
Submitted: 10 Dec 2020 8:00
Modified: 31 Dec 2020 12:08
Reporter: Yan Huang (OCA)
Status: Verified
Category: MySQL Server: Replication
Severity: S3 (Non-critical)
Version: 5.7.29, 5.7.32
OS: Any
Assigned to:
CPU Architecture: Any

[10 Dec 2020 8:00] Yan Huang
Description:
The binlog event header contains a next_position field, which is a 4-byte integer.

If a binlog file grows larger than 4G, next_position overflows.

The replication heartbeat then fails with the error "heartbeat is not compatible with local info" when the slave finds that the heartbeat's next_position is smaller than its current log position.
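
To make the overflow concrete, here is a minimal sketch of what happens to a position once it has to fit in 4 bytes. It is a simplified model and not the server's code; it assumes the field is treated as an unsigned 32-bit value, and the helper name and the sample offset are made up for illustration.

```python
# Simplified model of a 4-byte position field; not the server's actual code.
UINT32_MASK = 0xFFFFFFFF  # largest value an unsigned 4-byte field can hold


def truncate_to_4_bytes(position):
    """Return the part of a position that survives in a 4-byte field."""
    return position & UINT32_MASK


# Hypothetical real offset: 1000 bytes past the 4 GiB mark.
real_position = 4 * 1024**3 + 1000
stored_position = truncate_to_4_bytes(real_position)

print(real_position)    # 4294968296
print(stored_position)  # 1000
```

The stored value wraps around to a small number, which is why it can end up below the position the slave already holds.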

How to repeat:
1. On the master, prepare two large transactions (each producing more than 4G of binlog) and commit them at the same time, making sure both end up in the same binlog file.
2. If the first transaction's GTID is A:4 and the second's is A:5, run the following commands on the slave:

- stop slave; reset slave;
- reset master;
- set global gtid_purged = "A:1-4";
- change master to MASTER_HEARTBEAT_PERIOD = 0.01;
- change master to MASTER_AUTO_POSITION = 1;
- start slave io_thread;

This makes the slave start replicating from the position just after the first large transaction, which is beyond 4G. The master sends a heartbeat for that position, which causes an error in the slave IO thread.
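
The failure in this scenario can be sketched as follows. This is a hypothetical model of the comparison described in this report, not the actual IO thread code: it assumes the slave keeps its own log position in a wider integer while the heartbeat only carries the 4-byte field, and the function name and positions are invented for the example.

```python
UINT32_MASK = 0xFFFFFFFF  # range of the 4-byte field in the event header


def heartbeat_is_compatible(local_log_pos, heartbeat_real_pos):
    """Hypothetical check: the heartbeat must not point behind the local position."""
    # Only the low 32 bits of the master's position fit in the header field.
    heartbeat_field = heartbeat_real_pos & UINT32_MASK
    return heartbeat_field >= local_log_pos


# Assumed position right after the first large transaction (past 4 GiB).
pos_after_first_trx = 5 * 1024**3

print(heartbeat_is_compatible(pos_after_first_trx, pos_after_first_trx))  # False
print(heartbeat_is_compatible(1024, 1024))                                # True
```

Even though the master and the slave agree on the real position, the truncated field looks like it is behind, so the check fails.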

Suggested fix:
Add a next_position_high field to the extra_headers of the v4 event header and store the high bits of the position there.
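
A rough sketch of the idea, to show how a low/high split would preserve the full position. The byte layout (two little-endian 4-byte unsigned fields back to back) and all function names are assumptions made for illustration only; the real header layout would have to be decided in the server.

```python
import struct


def split_position(position):
    """Split a 64-bit position into (low, high) 32-bit halves."""
    low = position & 0xFFFFFFFF   # would stay in the existing next_position field
    high = position >> 32         # would go into the proposed next_position_high
    return low, high


def join_position(low, high):
    """Rebuild the full position from the two 32-bit halves."""
    return (high << 32) | low


def pack_position_fields(position):
    """Encode both halves as two little-endian unsigned 4-byte integers."""
    low, high = split_position(position)
    return struct.pack("<II", low, high)


def unpack_position_fields(raw):
    """Decode the two 4-byte fields and return the full position."""
    low, high = struct.unpack("<II", raw)
    return join_position(low, high)


position = 5 * 1024**3  # hypothetical offset past 4 GiB
raw = pack_position_fields(position)
print(unpack_position_fields(raw) == position)  # True
```

A reader that does not know about the high field would still see the same low 4 bytes as today, so the split keeps the existing field unchanged.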
[31 Dec 2020 12:08] MySQL Verification Team
Hello Yan Huang,

Thank you for the report and feedback.
Verified as described on 5.7.32 build.

Thanks,
Umesh
[31 Dec 2020 12:11] MySQL Verification Team
Related - Bug #101955