Description:
Hi,
I have suddenly started hitting this issue on my SSD-based (Virident SLC SSDs) MySQL master-slave sets.
Replication on the slave breaks with the following error:
"Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave."
The MySQL error log says:
Nov 16 19:26:39 db5 mysqld: 111116 19:26:39 [ERROR] Error in Log_event::read_log_event(): 'Event too big', data_len: 218136064, event_type: 105
Nov 16 19:26:39 db5 mysqld: 111116 19:26:39 [ERROR] Error reading relay log event: slave SQL thread aborted because of I/O error
Nov 16 19:26:39 db5 mysqld: 111116 19:26:39 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: 1594
Nov 16 19:26:39 db5 mysqld: 111116 19:26:39 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'db4-bin.000525' position 941611524
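For reference, the data_len and event_type values in the first error line are, as far as I understand, read from the fixed-size header at the start of each event. Below is a minimal Python sketch of how such a header decodes, assuming the v4 binlog format (19-byte little-endian common header) used since MySQL 5.0. The names parse_event_header and looks_plausible, and the max_type_code bound, are hypothetical and for illustration only; this is not the server's code.

```python
import struct

# Common event header, v4 binlog format, little-endian:
# timestamp (4), type code (1), server id (4),
# event length (4), next position (4), flags (2) = 19 bytes
HEADER = struct.Struct("<IBIIIH")


def parse_event_header(buf):
    """Decode the 19-byte common header found at the start of buf."""
    if len(buf) < HEADER.size:
        raise ValueError("need at least %d bytes" % HEADER.size)
    ts, type_code, server_id, event_len, next_pos, flags = HEADER.unpack_from(buf)
    return {
        "timestamp": ts,
        "type_code": type_code,
        "server_id": server_id,
        "event_length": event_len,
        "next_position": next_pos,
        "flags": flags,
    }


def looks_plausible(header, max_allowed_packet, max_type_code=27):
    """Rough sanity check on a decoded header.

    max_type_code is an assumed upper bound for 5.1-era event types;
    adjust it for the server version in use.
    """
    if not 1 <= header["type_code"] <= max_type_code:
        return False
    # An event can never be shorter than its own header, and the
    # 'Event too big' error is assumed to come from a size limit
    # tied to max_allowed_packet.
    return HEADER.size <= header["event_length"] <= max_allowed_packet


# The values reported in the error log above, packed into a fake header.
reported = struct.pack("<IBIIIH", 0, 105, 1, 218136064, 0, 0)
print(looks_plausible(parse_event_header(reported), 16 * 1024 * 1024))
```

A type code of 105 is far outside the event types that 5.1 defines, as far as I know, so the bytes the SQL thread decoded at that point do not look like a real event header.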
Running mysqlbinlog on both the master's binary log and the slave's relay log shows that the logs are NOT corrupt, yet replication treats them as corrupt and does not advance. If I do a skip_slave, I am presented with a new error:
I have taken snapshots of the master and recreated the slaves multiple times. Replication breaks at random places, never at the same binlog position twice, and the failure keeps recurring.
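One way to cross-check the mysqlbinlog result independently is to walk a log file header by header and report the first offset where the event chain stops making sense. This is only a sketch under the same assumption as before (v4 format: 4-byte magic followed by events that each start with a 19-byte little-endian header). first_bad_offset and its max_event_len parameter are hypothetical names, not part of any MySQL tool.

```python
import struct

MAGIC = b"\xfebin"                 # first 4 bytes of a binlog/relay log file
HEADER = struct.Struct("<IBIIIH")  # 19-byte common event header


def first_bad_offset(data, max_event_len):
    """Return the offset of the first implausible event, or None if none found.

    data is the whole log as a bytes-like object; max_event_len is the
    largest event size the caller is willing to accept.
    """
    if data[:4] != MAGIC:
        return 0
    pos = 4
    while pos < len(data):
        if pos + HEADER.size > len(data):
            return pos  # truncated header
        _ts, _type, _sid, length, _next, _flags = HEADER.unpack_from(data, pos)
        if length < HEADER.size or length > max_event_len:
            return pos  # length field cannot be right
        if pos + length > len(data):
            return pos  # event runs past the end of the file
        pos += length   # jump to the next event header
    return None
```

If this walk succeeds over a relay log copied from the slave while the SQL thread still fails on it, the on-disk contents and what the server read at that moment presumably differed. For very large logs, passing an mmap of the file instead of reading it fully into memory should work as well, since only slicing, len() and unpack_from are used.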
Both master and slave run the same OS, MySQL version, and configuration.
----
mysql Ver 14.14 Distrib 5.1.49, for debian-linux-gnu (x86_64) using readline 5.2
Linux erp-wsr-db3 2.6.26-2-amd64 #1 SMP Wed Sep 21 03:36:44 UTC 2011 x86_64 GNU/Linux
How to repeat:
No obvious process to repeat.