| Bug #70669 | Slave can't continue replication after master's crash recovery | | |
|---|---|---|---|
| Submitted: | 20 Oct 2013 4:25 | Modified: | 27 Feb 2014 13:16 |
| Reporter: | Yoshinori Matsunobu (OCA) | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
| Version: | 5.6.14 | OS: | Any |
| Assigned to: | | CPU Architecture: | Any |
[20 Oct 2013 4:25]
Yoshinori Matsunobu
[21 Oct 2013 8:32]
MySQL Verification Team
Hello Yoshinori, Thank you for the bug report. Verified as described. Thanks, Umesh
[27 Feb 2014 13:16]
Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.
Fixed in 5.6+. Documented fix in the 5.6.17 and 5.7.4 changelogs as follows:
Binary log events could be sent to slaves before they were flushed
to disk on the master, even when sync_binlog was set to 1. This
could lead to either of the following two issues when
the master was restarted following a crash of the operating
system:
· Replication cannot continue because one or more slaves are
requesting to replicate events that do not exist on the master.
· Data exists on one or more slaves, but not on the master.
Such problems are expected with less durable settings (sync_binlog
not equal to 1), but should not happen when sync_binlog is 1.
To fix this issue, a lock (LOCK_log) is now held during
synchronization and is released only after the binary events are
actually written to disk.
Closed.
If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at
http://dev.mysql.com/doc/en/installing-source.html
[29 Mar 2014 8:25]
Laurynas Biveinis
5.6$ bzr log -r 5838
------------------------------------------------------------
revno: 5838
committer: Libing Song <libing.song@oracle.com>
branch nick: mysql-5.6
timestamp: Tue 2014-02-25 09:39:34 +0800
message:
BUG#17632285 SLAVE CAN'T CONTINUE REPLICATION AFTER MASTER'S
CRASH RECOVERY
Binary events might be sent to slaves before they are flushed
to disk on master, even sync_binlog is set to 1. It can cause
two problems if the master restarts after an OS crash.
* Replication cannot continue because the slaves are
requesting to replication the events don't exist on master.
* Data exists on slaves, but not exists on the master.
The problems are expected on less durable settings(
sync_binlog != 1), but it should not happen on durable
setting(sync_binlog = 1).
Since 5.6 binlog group commit implementation, binlog write
and sync have been protected by separate mutexes. So dump
threads can read the binary events simultaneously or even
before it is synced to disk.
To fixing the problem on durable setting, LOCK_log is hold
in sync stage and it is released after the binary events are
synced to disk.
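
The race described in the commit message can be illustrated with a toy model. This is only a sketch, not the server's code: `BinlogModel`, `write`, `sync`, `dump_read` and `simulate_crash_before_sync` are hypothetical names invented for illustration, and a single `threading.Lock` stands in for LOCK_log. It assumes the dump thread needs that lock before it may read events, and it models an OS crash as "the sync step never runs".

```python
import threading


class BinlogModel:
    """Toy binary log: `events` are written, only the first `synced` are on disk."""

    def __init__(self, hold_lock_during_sync):
        self.lock_log = threading.Lock()  # stand-in for LOCK_log
        self.events = []                  # written, possibly not yet durable
        self.synced = 0                   # count of events durable on disk
        self.hold_lock_during_sync = hold_lock_during_sync

    def write(self, event):
        """Write stage: append the event under the lock."""
        self.lock_log.acquire()
        self.events.append(event)
        if not self.hold_lock_during_sync:
            # Pre-fix behaviour: the lock is dropped before the sync stage.
            self.lock_log.release()

    def sync(self):
        """Sync stage: make everything written so far durable."""
        self.synced = len(self.events)
        if self.hold_lock_during_sync:
            # Post-fix behaviour: the lock is released only after the sync.
            self.lock_log.release()

    def dump_read(self):
        """Dump thread: return readable events, or None if it would have to wait."""
        if not self.lock_log.acquire(blocking=False):
            return None
        try:
            return list(self.events)
        finally:
            self.lock_log.release()


def simulate_crash_before_sync(hold_lock_during_sync):
    """Write one event, let the dump thread run, then 'crash' before sync."""
    log = BinlogModel(hold_lock_during_sync)
    log.write("event-1")
    sent_to_slave = log.dump_read() or []
    # OS crash here: sync() never runs, so only synced events survive.
    on_master_after_recovery = log.events[:log.synced]
    return sent_to_slave, on_master_after_recovery


if __name__ == "__main__":
    print("old:  ", simulate_crash_before_sync(False))
    print("fixed:", simulate_crash_before_sync(True))
```

In the pre-fix mode the slave has received an event that the master no longer has after recovery, which corresponds to both symptoms listed above. In the post-fix mode the dump thread cannot read the event until the sync stage has completed.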
