Bug #72488 Binlog corruption on sync failure
Submitted: 30 Apr 2014 10:09 Modified: 10 Dec 2017 10:06
Reporter: Daniël van Eeden (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.6.16 OS:Any
Assigned to: CPU Architecture:Any
Tags: binlog, corruption, nfs, sync

[30 Apr 2014 10:09] Daniël van Eeden
Description:
A master with RBR and sync_binlogs=1 on Linux (RHEL6).

There is some issue with the NFS filesystem on which only the binlogs (and binlog index) are located.

Then replication breaks:
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'bogus data in log event; the first event 'server1-bin.000182' at 865668285, the last event read from '/binlogs/server1-bin.000182' at 865668285, the last byte read from '/binlogs/server1-bin.000182' at 865668304.'

There is nothing being logged in the errorlog.

One of the applications was able to give more info:
Mysqli statement execute error : Can't sync file '/binlogs/server1-bin.000182' to disk (Errcode: 5 - Input/output error)

Looking at the sourcecode this seems to be the EE_SYNC error.

$ mysqlbinlog --base64-output=decode-rows --start-position=865668285 server1-bin.000182
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
ERROR: Error in Log_event::read_log_event(): 'Event too small', data_len: 0, event_type: 0
ERROR: Could not read entry at offset 865668285: Error in log format or read error.
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

$ hexdump -s 865668285 server1-bin.000182 | head -3
33990cbd 0000 0000 0000 0000 0000 0000 0000 0000
*
33b104fd 0000 0000 0000 0000 0000 0000 ac8f 5360

The size of the binlog file is 930664247.

How to repeat:
Have binlogs on a different filesystem and set sync_binlog=1.

Somehow get the sync operation to fail with error 5.

Then check the binlog contents and replication.

Suggested fix:
1. Make sure these kinds of errors are logged in the error log.
2. Do not continue operation until we have succesfully synced the binlog.
3. Retry the sync and the directly rotate the binlogs.
[30 Apr 2014 10:10] Daniël van Eeden
Added tags and raised the severity
[30 Apr 2014 11:39] Daniël van Eeden
Might be related to Bug #68953
[21 Aug 2014 18:05] Daniël van Eeden
Since 5.6.20 there is binlog_impossible_mode which improves this a lot.

http://dev.mysql.com/doc/refman/5.6/en/replication-options-binary-log.html#sysvar_binlog_i...
[10 Dec 2017 10:06] Daniël van Eeden
It is probably difficult to figure out if MySQL or the storage caused the corruption. However with binlog_impossible_mode the situation has improved quite a bit. So closing this for now.