MySQL Bugs: #63286: Error in Log_event::read_log

Bug #63286	Error in Log_event::read_log_event(): 'Event too big',
Submitted:	16 Nov 2011 14:18	Modified:	19 Jan 2012 19:22
Reporter:	Pankaj Kaushal
Status:	No Feedback
Category:	MySQL Server: Replication	Severity:	S2 (Serious)
Version:	5.1.49-3	OS:	Linux (Debian Squeeze (6.0))
Assigned to:		CPU Architecture:	Any
Tags:	5.1.49-3, breaks, Debian, linux, MySQL, replication

Description:
Hi,

I have suddenly started hitting this issue on my SSD-based (Virident SLC SSDs) MySQL master-slave sets.

Replication on the slave breaks with the following error:

"Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave."

The MySQL error log says:

Nov 16 19:26:39 db5 mysqld: 111116 19:26:39 [ERROR] Error in Log_event::read_log_event(): 'Event too big', data_len: 218136064, event_type: 105
Nov 16 19:26:39 db5 mysqld: 111116 19:26:39 [ERROR] Error reading relay log event: slave SQL thread aborted because of I/O error
Nov 16 19:26:39 db5 mysqld: 111116 19:26:39 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: 1594
Nov 16 19:26:39 db5 mysqld: 111116 19:26:39 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'db4-bin.000525' position 941611524

Running mysqlbinlog on both the master's binary logs and the slave's relay logs shows that the logs are NOT corrupt, but replication treats them as corrupt and does not proceed. If I do a skip_slave I am presented with a new error:

I have taken snapshots of the master and recreated multiple slaves several times. Replication breaks at random places and never at the same binlog position. The issue recurs randomly.

Both master and slave run the same OS, MySQL version, and configuration.

----
mysql Ver 14.14 Distrib 5.1.49, for debian-linux-gnu (x86_64) using readline 5.2
Linux erp-wsr-db3 2.6.26-2-amd64 #1 SMP Wed Sep 21 03:36:44 UTC 2011 x86_64 GNU/Linux

How to repeat:
No obvious process to repeat.

Thank you for the report.

Which value of max_allowed_packet do you use on the master, and which on the slave?
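
To make the comparison behind this question concrete, here is a minimal sketch that pulls data_len and event_type out of the 'Event too big' line quoted in the report and compares data_len with a max_allowed_packet value. The helper names are hypothetical, and the 1 MiB limit is only an assumed placeholder, since the reporter never supplied the real setting. The actual value can be read on each server with SHOW VARIABLES LIKE 'max_allowed_packet'. The server's own size check may take other settings into account, so treat this as a rough first look rather than a reproduction of the server logic.

```python
import re

# Matches the 'Event too big' message as it appears in the error log above.
EVENT_TOO_BIG = re.compile(
    r"'Event too big', data_len: (?P<data_len>\d+), event_type: (?P<event_type>\d+)"
)


def parse_event_too_big(line):
    """Return (data_len, event_type) from an error-log line, or None if absent."""
    match = EVENT_TOO_BIG.search(line)
    if match is None:
        return None
    return int(match.group("data_len")), int(match.group("event_type"))


def exceeds_packet_limit(data_len, max_allowed_packet):
    """True when the reported event length is larger than the packet limit."""
    return data_len > max_allowed_packet


# Log line copied from the report.
line = (
    "Nov 16 19:26:39 db5 mysqld: 111116 19:26:39 [ERROR] Error in "
    "Log_event::read_log_event(): 'Event too big', data_len: 218136064, "
    "event_type: 105"
)

# Assumption: 1 MiB. Replace with the value actually configured on the slave.
ASSUMED_MAX_ALLOWED_PACKET = 1024 * 1024

data_len, event_type = parse_event_too_big(line)
print("data_len:", data_len, "bytes")
print("event_type:", event_type)
print("exceeds assumed limit:", exceeds_packet_limit(data_len, ASSUMED_MAX_ALLOWED_PACKET))
```

With the assumed limit, the reported data_len of 218136064 bytes (roughly 208 MiB) is far above it. The same check with the real values from master and slave would show whether the limit alone could explain the error.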

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".