Bug #70922 MySQL binlog error causing slave replication to exit
Submitted: 15 Nov 2013 15:21 Modified: 9 Jan 2015 7:18
Reporter: Michael Hazen
Status: Closed
Category: MySQL Server: Replication Severity: S1 (Critical)
Version: 5.6.14 OS: Linux (el6.x86_64)
Assigned to: Kyle Joiner CPU Architecture: Any

[15 Nov 2013 15:21] Michael Hazen
Description:
We have a 4-node replication setup using GTID and row-based replication. Two of the nodes are in master-master replication. Each of the remaining two nodes has one of the master nodes as its master.

In our test case, devhpg8ems01 and devhpg8ems02 are the master nodes. qahpg8ems01 is a slave of devhpg8ems01, and qahpg8ems02 is a slave of devhpg8ems02. Even though devhpg8ems01 and devhpg8ems02 are both masters, only one server receives writes at any one time. In this case, only devhpg8ems01 has active external writes to the database.
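For readers following along, the topology described above can be written down as a small Python sketch. The host names come from the report; the dictionary name REPLICATES_FROM and the helper direct_replicas are hypothetical names introduced only for illustration.

```python
# Replica -> master map for the four hosts named in the report.
# REPLICATES_FROM and direct_replicas are illustrative names, not MySQL APIs.
REPLICATES_FROM = {
    "devhpg8ems01": "devhpg8ems02",  # master-master pair
    "devhpg8ems02": "devhpg8ems01",  # master-master pair
    "qahpg8ems01": "devhpg8ems01",   # plain slave of the first master
    "qahpg8ems02": "devhpg8ems02",   # plain slave of the second master
}


def direct_replicas(master):
    """Return the hosts that read the binlog of `master` directly, sorted by name."""
    return sorted(r for r, m in REPLICATES_FROM.items() if m == master)


if __name__ == "__main__":
    # Hosts that read the binlog of the node taking external writes.
    print(direct_replicas("devhpg8ems01"))  # ['devhpg8ems02', 'qahpg8ems01']
```

The two hosts returned for devhpg8ems01 are the ones that consume its binary log directly, so they are the ones that would notice a problem in that log first.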

Once the 4-node replication is set up, we run into problems when the master's binlog has an error, which causes the slave thread on the slaves to exit. In this case, devhpg8ems01 has the binlog error, and we receive MySQL error code 1236 on devhpg8ems02 and qahpg8ems01. This happens consistently in our setup. Sometimes it takes less than 2 hours, sometimes as long as 12 hours.
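As a rough illustration of how this failure could be detected on a slave, the sketch below inspects one row of SHOW SLAVE STATUS output, passed in as a column-name to value mapping. The columns Last_IO_Errno and Slave_IO_Running exist in MySQL 5.6; the function name io_thread_hit_1236 and the sample rows are hypothetical, and how the row is fetched (client library, connection details) is left out on purpose.

```python
# Error 1236 is ER_MASTER_FATAL_ERROR_READING_BINLOG in MySQL.
ER_MASTER_FATAL_ERROR_READING_BINLOG = 1236


def io_thread_hit_1236(status_row):
    """Return True if the slave I/O thread has stopped with error 1236.

    status_row: one row of SHOW SLAVE STATUS as a dict of column name -> value.
    Values may arrive as strings or integers depending on the client library.
    """
    errno = int(status_row.get("Last_IO_Errno") or 0)
    io_running = status_row.get("Slave_IO_Running") == "Yes"
    return errno == ER_MASTER_FATAL_ERROR_READING_BINLOG and not io_running


if __name__ == "__main__":
    # Hypothetical rows for illustration only; not taken from the reporter's servers.
    broken = {"Slave_IO_Running": "No", "Last_IO_Errno": "1236"}
    healthy = {"Slave_IO_Running": "Yes", "Last_IO_Errno": 0}
    print(io_thread_hit_1236(broken))   # True
    print(io_thread_hit_1236(healthy))  # False
```

Running a check like this periodically on each slave would show how long after setup the error first appears.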

Before we ran into this issue, we had the master-master setup by itself, and replication ran without any problems. So this issue seems to be specific to our 4-node replication setup.

 

How to repeat:
Setting up a 4-node replication as above will cause the problem to occur.
[15 Nov 2013 18:45] Michael Hazen
As this prevents replication from functioning at all in the particular 4-node replication topology we're using, I'm raising the severity to Critical.
[19 Nov 2013 19:32] Sveta Smirnova
Thank you for the report.

We need your binary log file to start working on this, but ftp.oracle.com seems to be inaccessible at the moment. I have written to our service developers about this issue and will come back once I know where you can download this file. I am sorry for the inconvenience.
[20 Nov 2013 16:41] Michael Hazen
We have an SR open for this problem: SR 3-8124825911. I've got MySQL support actively looking at this issue.
[4 Dec 2013 19:25] Shane Bester
Already filed internally as:
Bug 17842137 - ANOTHER BINLOG CORRUPTION CASE...
[4 Dec 2013 19:28] Shane Bester
PHP test case.

Attachment: bug70922.php (application/octet-stream, text), 1.51 KiB.

[9 Jan 2015 7:18] Erlend Dahl
Fixed under the heading of Bug#17842137 in 5.6.16 and 5.7.4.