Bug #25737 Please add checksum to binlog events
Submitted: 20 Jan 2007 18:41 Modified: 9 Feb 2011 21:42
Reporter: Baron Schwartz (Basic Quality Contributor)
Status: Closed
Category:MySQL Server: Replication Severity:S4 (Feature request)
Version: OS:Any
Assigned to: Andrei Elkin
Tags: bfsm_2007_10_18, qc, replication checksum
Triage: Triaged: D5 (Feature request)

[20 Jan 2007 18:41] Baron Schwartz
Description:
I would like the binlog to include checksums for each event so binlog corruption can be detected better on the slaves.

How to repeat:
Feature request.
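A minimal sketch of the idea in this request, assuming a hypothetical framing in which a 4-byte CRC32 is appended to each event's bytes. The helper names and the layout are illustrative only and are not the server's binlog format.

```python
import struct
import zlib


def add_checksum(event: bytes) -> bytes:
    """Append a little-endian CRC32 of the event bytes (hypothetical framing)."""
    return event + struct.pack("<I", zlib.crc32(event))


def verify_checksum(framed: bytes) -> bytes:
    """Return the event payload, or raise ValueError if the CRC32 does not match."""
    if len(framed) < 4:
        raise ValueError("framed event too short to hold a checksum")
    event, stored = framed[:-4], struct.unpack("<I", framed[-4:])[0]
    if zlib.crc32(event) != stored:
        raise ValueError("binlog event checksum mismatch")
    return event


framed = add_checksum(b"UPDATE t SET a = 1 WHERE id = 7")
assert verify_checksum(framed) == b"UPDATE t SET a = 1 WHERE id = 7"
```

A slave could run the same check when it receives an event and again when it reads the event back from disk, so damage is caught wherever it was introduced.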
[7 Feb 2007 16:30] Valerii Kravchuk
Thank you for a reasonable feature request.
[7 Feb 2007 19:00] Jeremy Cole
I would love to see this done -- in fact I suggested it myself perhaps 2 years ago at least. :)  I don't think it should be that hard...
[17 May 2007 8:10] Richard George
This is near-critical for us, see #21623.
[22 Aug 2007 21:46] James Day
Try an SSL connection to get that extra layer of integrity checking.
[23 Aug 2007 12:45] Baron Schwartz
That only ensures the bits don't get garbled on the wire.  It gives no assurances to the Slave SQL thread that it is reading unmangled data.
[4 Sep 2007 0:04] James Day
Agreed but corruption on the wire seems to be by far the most common cause, so it's worth mentioning it. No law against disk/RAM issues though - those happen too.

I asked the replication team for this in person two years ago at one of our developer meetings and will be doing the same in a couple of weeks.
[20 Sep 2007 12:35] Mark Callaghan
I want this too. We had a hardware problem that flipped a bit in some ASCII characters, so the result was readable but wrong. TCP checksums passed but the queries failed, MySQL was blamed, and debugging was required until the HW problem was found.
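To illustrate the kind of damage described here, the sketch below flips a single bit in an ASCII statement. The statement and the `flip_bit` helper are made up for the example and are not data from the incident. The result is still readable text, and a CRC32 taken over the original bytes no longer matches.

```python
import zlib


def flip_bit(data: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of data with one bit inverted."""
    mutated = bytearray(data)
    mutated[byte_index] ^= 1 << bit
    return bytes(mutated)


original = b"UPDATE t SET name = 'cat'"
# Flip the lowest bit of 'c' (0x63), which turns it into 'b' (0x62).
damaged = flip_bit(original, original.index(b"cat"), 0)

print(damaged.decode("ascii"))  # UPDATE t SET name = 'bat'
assert zlib.crc32(original) != zlib.crc32(damaged)  # CRC32 catches any single-bit flip
```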
[20 Sep 2007 23:38] James Day
The replication team agrees that this is desirable for both the IO (to catch it fast) and SQL (to catch memory/disk issues) threads. It's also been demonstrated, including by Mark's example, that the 16-bit TCP checksum is not sufficiently sensitive, so it'll need to be larger or better.

Retrying after corrupt binary log events, and logging the surrounding events in case it's a bug rather than corruption, also agreed.

No timetable for when to do this at present, still too early in the process for that.
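One reason the checksum used by TCP is considered weak: it is a 16-bit ones'-complement sum of 16-bit words (described in RFC 1071), so it cannot notice words that have been reordered. The sketch below, with an illustrative `internet_checksum` helper, shows such a collision and shows that CRC32 distinguishes the same two inputs. It is a demonstration of the general weakness, not a reconstruction of any incident reported in this thread.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement sum over 16-bit words, in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


a = b"ABCD"
b = b"CDAB"  # same 16-bit words, different order

assert internet_checksum(a) == internet_checksum(b)  # reordering goes unnoticed
assert zlib.crc32(a) != zlib.crc32(b)                # CRC32 tells them apart
```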
[14 Oct 2007 1:47] James Day
See bug #29813 "replication errors on a unstable network" for another report of corruption on unstable VPN connections.
[14 Oct 2007 2:03] James Day
The visible worklog item for this is at http://forge.mysql.com/worklog/task.php?id=2540 , which started in April 2005.
[28 Oct 2007 19:24] James Day
Checksums for replication events are currently on the server roadmap for the version after 6.0, with a current target of Q1 2009 release, subject to change.
[8 Aug 2008 21:07] Jeremy Zawodny
Allow me to add a vote for this as well.

We got bit by this on two servers yesterday because of a network glitch.  A checksum could have found the error and re-requested the problematic event.
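A rough sketch of the re-request behaviour suggested here, assuming a hypothetical `fetch` callable that returns an event together with the CRC32 the master computed for it. The function names, the retry limit, and the simulated master are all assumptions for illustration.

```python
import zlib
from typing import Callable, Tuple

# position -> (event bytes, CRC32 sent along with the event)
Fetch = Callable[[int], Tuple[bytes, int]]


def fetch_verified(fetch: Fetch, position: int, max_attempts: int = 3) -> bytes:
    """Request the event at position again until its CRC32 matches, or give up."""
    for attempt in range(1, max_attempts + 1):
        event, crc = fetch(position)
        if zlib.crc32(event) == crc:
            return event
        print(f"checksum mismatch at position {position}, attempt {attempt}")
    raise IOError(f"event at position {position} still corrupt after {max_attempts} attempts")


# Simulated master whose first reply is damaged in transit.
good = b"INSERT INTO t VALUES (1)"
replies = iter([(b"INSERT INTO t VALUES (9)", zlib.crc32(good)), (good, zlib.crc32(good))])

assert fetch_verified(lambda pos: next(replies), 4) == good
```

Giving up after a bounded number of attempts keeps a persistently corrupt event, which may point to a bug or a damaged file on the master, from causing an endless retry loop.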
[23 Sep 2008 2:52] Gonzalo Carvajal
Yes! Please, I need this feature too.
I have 4 servers: 1 master and 3 slaves. They are not on a LAN; they are in different cities. I configured my.ini on the slaves to replicate just a few tables, not the whole database, and I very often have replication crashes because every modification on the master is sent to the slaves, not only those for the tables I want. The most common error is that SQL statements (so far, they're all for tables other than the ones I want to replicate) are not CRC-verified and arrive at the slaves with syntax errors (wrong data).

Please excuse my English; I hope you can understand.
I'm working with 5.0.45-community-nt-log, MySQL Community Edition (GPL)
[23 Sep 2008 15:57] Sven Sandberg
Gonzalo, you are probably experiencing BUG#26489. You may try to upgrade to version 5.0.56 where that was fixed.
[12 Mar 2009 2:14] Baron Schwartz
Looks like Google has done this as part of another patch.

http://mysqlha.blogspot.com/2009/03/global-transaction-ids-are-hot.html
[10 Feb 2010 10:55] Sven Sandberg
This task is tracked by WL#2540, not by this bug
[8 Feb 2011 15:03] Andrei Elkin
The feature request was addressed by wl#2540 replication event checksum.
[8 Feb 2011 15:15] Mark Callaghan
What release has or will have this feature? While I get the restrictions against stating what will be in a future release, I thought the standard policy for closing a bug or a feature request is to document the release in which it has been fixed. And yes, Catch-22 was a favorite book of mine.
[8 Feb 2011 15:27] Luis Soares
Mark, the efforts for this task were/are actually tracked in WL#2540, which 
has been merged into a development tree (mysql-trunk).

We had mistakenly set this bug status to CLOSED, but corrected its 
state to DOCUMENTING. Once WL#2540 gets documented and released, 
this bug/FR shall be closed as well.
[8 Feb 2011 15:33] Matt Lord
So is this in 5.6.0?  Will it be in 5.6.1?  Some later version?
[9 Mar 2011 8:48] Shane Bester
see http://forge.mysql.com/worklog/task.php?id=2540 for the worklog.
[26 Apr 2011 6:29] James Day
This feature is added in the MySQL 5.6.2 Development Milestone Release. This is a preview of technology that may be in the next major MySQL version, not a current production-rated release. In general these are intended to be of release candidate quality.

Don't expect it to be added to MySQL 5.5 or any earlier production release.

For other replication features added to 5.6.2 and now announced see http://blogs.oracle.com/mysql/2010/11/mysql_55_whats_new_in_replication.html . Presence of a feature in a Development Milestone Release is not a guarantee that it will be present in a future production-rated release.