Bug #92346 Slave stops with error after processing log file with large number of entries
Submitted: 9 Sep 2018 16:57 Modified: 11 Sep 2018 16:04
Reporter: Thomas Smith
Status: Duplicate
Category: MySQL Server: Replication Severity: S2 (Serious)
Version: 5.7.22 OS: CentOS (6.5)
Assigned to: MySQL Verification Team CPU Architecture: Any

[9 Sep 2018 16:57] Thomas Smith
Description:
The slave stops processing with the error "Relay log read failure - Could not parse relay log event entry".  That error is most often caused by the master or slave running out of disk space, but not in this case.  We've observed this 3 times in similar circumstances.

The current log position is shown as 2069186750.  However, that is not a valid position in the named log.  In fact, the log *was* completely processed.  The final position is 10659121342.

Note the large value: over 10 GB.  Our logs are normally rotated well before this, but a large transaction can legitimately cause logs to exceed their normal limit.  I suspect there is an integer overflow in the slave processing, perhaps related to #77818.
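
The two positions quoted above are consistent with that suspicion: keeping only the low 32 bits of the final position gives exactly the position the slave reports. A minimal Python check follows; the variable names are illustrative, and it only shows the arithmetic, not where in the server a truncation might happen.

```python
# Positions taken from the report above.
final_position = 10659121342      # actual end of the log, in bytes
reported_position = 2069186750    # position shown by the stopped slave

# Keep only the low 32 bits, as an unsigned 32-bit field would.
truncated = final_position & 0xFFFFFFFF

print(truncated)                        # 2069186750
print(truncated == reported_position)   # True
```

In other words, 10659121342 - 2 * 4294967296 = 2069186750.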

The recovery is to manually set the slave to the first position in the next log and restart.
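
A rough sketch of that recovery in Python, which only builds the statements to run on the slave. The log name mysql-bin.000041 and both helper names are hypothetical. It assumes MySQL 5.7 statement syntax and that position 4 is the first event position in a binary log file (the first 4 bytes are the file header).

```python
def next_log_name(current: str) -> str:
    """Return the log file name that follows `current` (.000041 -> .000042)."""
    base, _, seq = current.rpartition(".")
    # Keep the zero-padded width of the original sequence number.
    return f"{base}.{int(seq) + 1:0{len(seq)}d}"


def recovery_statements(current_master_log: str) -> list:
    """Build the statements that point the slave at the start of the next log."""
    next_log = next_log_name(current_master_log)
    return [
        "STOP SLAVE;",
        # Assumption: position 4 is the first event offset after the file header.
        f"CHANGE MASTER TO MASTER_LOG_FILE='{next_log}', MASTER_LOG_POS=4;",
        "START SLAVE;",
    ]


for statement in recovery_statements("mysql-bin.000041"):  # hypothetical name
    print(statement)
```

Skipping ahead like this is only reasonable here because the failing log had in fact been processed completely.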

How to repeat:
Generate a log file that grows past 4 GB, for example through a single very large transaction.

Suggested fix:
Find and fix the integer overflow.
[10 Sep 2018 11:12] MySQL Verification Team
Take a look at this:

https://bugs.mysql.com/bug.php?id=55231
(COM_BINLOG_DUMP needs to accept 64-bit positions else slaves can break)
[10 Sep 2018 11:54] MySQL Verification Team
As my colleague already pointed out, this is a duplicate of 
https://bugs.mysql.com/bug.php?id=55231
[10 Sep 2018 12:26] MySQL Verification Team
The workaround, as explained in https://bugs.mysql.com/bug.php?id=55231:

[quote]
workaround is to set a lower max_binlog_cache_size, around this size:
(4*1024*1024*1024) - max_binlog_size
maybe some overhead bytes can be deducted too.
[/quote]
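
For reference, the arithmetic in the quoted workaround can be worked through as below. This is only a sketch: the max_binlog_size value used is the server default of 1 GB, which is an assumption and not something stated in this report.

```python
FOUR_GB = 4 * 1024 * 1024 * 1024   # 4294967296, the 32-bit position limit

# Assumption: max_binlog_size left at the MySQL default of 1 GB.
max_binlog_size = 1024 * 1024 * 1024

suggested_max_binlog_cache_size = FOUR_GB - max_binlog_size
print(suggested_max_binlog_cache_size)  # 3221225472
```

Substitute the actual value reported by SHOW VARIABLES LIKE 'max_binlog_size' on the server in question.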
[10 Sep 2018 13:45] Thomas Smith
Thanks guys, I'll try the workaround.  But since the referenced open issue was last commented on in 2011, I won't hold my breath for an actual fix.
[10 Sep 2018 17:35] Thomas Smith
The documentation actually says "The maximum recommended value is 4GB; this is due to the fact that MySQL currently cannot work with binary log positions greater than 4GB."  Is that accurate, or should it say "...4GB - max_binlog_size" as you are recommending?
[11 Sep 2018 16:04] Thomas Smith
The suggested workaround fails.  It causes the master to reject the (single) query with the error:

Multi-statement transaction required more than 'max_binlog_cache_size' bytes of storage; increase this mysqld variable and try again