Bug #27808 | Infinite looping in circular replication | ||
---|---|---|---|
Submitted: | 13 Apr 2007 13:02 | Modified: | 22 Oct 2008 6:44 |
Reporter: | Lars Thalmann | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
Version: | 5.1 | OS: | Any |
Assigned to: | Assigned Account | CPU Architecture: | Any |
[13 Apr 2007 13:02]
Lars Thalmann
[16 Apr 2007 21:37]
Lars Thalmann
This is how one would issue the statement:

Case 1:
-------
When server B fails in A->B->C->A, one would:
1. Wait for C to process its entire relay log. By then, C has received as much information from B as possible.
2. Execute on server C: CHANGE MASTER TO SERVER_ID_FILTER=B,C (where B and C are the numbers representing the servers)
3. Execute on server C: CHANGE MASTER TO MASTER_HOST=A
Now we have a circle again, but smaller.

Case 2:
-------
- Replication A->C is set up on C by doing:
  1. CHANGE MASTER TO MASTER_HOST=A, SERVER_ID_FILTER=C,D
  2. START SLAVE
- Replication D->B is set up on B by doing:
  1. CHANGE MASTER TO MASTER_HOST=D, SERVER_ID_FILTER=A,B
  2. START SLAVE
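To make Case 1 concrete, here is a minimal Python sketch that follows a single event around the ring. It is only a model, not server code: the function name `hops_until_dropped`, the numeric ids (A=1, B=2, C=3), and the assumption that every replica writes applied events to its own binary log are all illustrative choices, not taken from the report.

```python
from typing import Dict, FrozenSet, Optional


def hops_until_dropped(
    origin: int,
    start: int,
    master_of: Dict[int, int],
    ignored: Dict[int, FrozenSet[int]],
    max_hops: int = 20,
) -> Optional[int]:
    """Follow one event around a replication ring (hypothetical model).

    origin    -- server id stamped on the event
    start     -- server whose binary log currently holds the event
    master_of -- master_of[r] is the server that replica r reads from
    ignored   -- per-replica set of server ids whose events are dropped

    Returns how many servers applied the event before some replica
    dropped it, or None if it was still circulating after max_hops.
    """
    # Invert the topology: which replica reads from each server?
    replica_of = {master: replica for replica, master in master_of.items()}
    holder = start
    for hops in range(max_hops):
        replica = replica_of[holder]
        # A replica drops events it originated itself, and events whose
        # origin appears in its ignore list.
        if origin == replica or origin in ignored.get(replica, frozenset()):
            return hops
        holder = replica  # the replica applied the event and passes it on
    return None


A, B, C = 1, 2, 3  # assumed server ids

# Healthy ring A->B->C->A: B's event is applied by C and A, then stopped by B.
full_ring = {B: A, C: B, A: C}
healthy = hops_until_dropped(B, B, full_ring, {})

# B has failed and C now reads from A. Without a filter nobody stops B's event.
small_ring = {C: A, A: C}
unfiltered = hops_until_dropped(B, C, small_ring, {})

# With the filter from step 2 set on C, the event is dropped when it returns to C.
filtered = hops_until_dropped(B, C, small_ring, {C: frozenset({B, C})})
```

In this model `healthy` is 2, `unfiltered` is `None` (the event never terminates), and `filtered` is 1, which matches the reasoning behind steps 2 and 3 of Case 1.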
[4 Sep 2007 12:27]
Lars Thalmann
See also BUG#25998.
[5 May 2008 12:52]
Andrei Elkin
The patch is on Bug #25998 page.
[16 Jul 2008 20:13]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/49889

2717 Andrei Elkin 2008-07-16
Bug #25998 problems about circle replication
Bug #27808 Infinite looping in circular replication

When one of the servers is withdrawn from a circular multi-master replication group, events generated by the removed server could become unstoppable (bug#25998). That is because the originator had been the terminator of its own event flow.

The other possibility for an unstoppable event is cluster replication (bug#27808). In that case an event generated by a member of a cluster was replicated to another member, where it was accepted and executed. By that time the effects of the event had already been propagated across the cluster via the cluster communications. To avoid double-applying, a replication event generated by a co-member of the cluster should not be accepted.

Both variations of the unstoppable replication event can be fixed by introducing a new option for CHANGE MASTER:

IGNORE_SERVER_IDS= (sid_1, sid_2, ... )

Setting the option to the empty list resets it.

Fixed by implementing the feature.

Properties of the feature:
a. An error is reported if the id of an ignored server is that of the slave itself and the slave was started with --replicate-same-server-id.
b. A following CHANGE MASTER ... IGNORE_SERVER_IDS= (list) overrides the existing IGNORE_SERVER_IDS list; an empty list argument clears the current ignored list.
c. CHANGE MASTER without IGNORE_SERVER_IDS preserves the existing list.
d. The ignored server ids are preserved after RESET SLAVE.
e. SHOW SLAVE STATUS is extended with a new line listing the ignored servers.
f. master.info gets a new line with the list of ignored servers.
g. Unlike the handling of --replicate-same-server-id, the SQL thread is not concerned with the ignored server ids, because the relay log is supposed to contain only events that cannot be unstoppable. To guarantee that, for example in circular replication with a failing server, the DBA must change master using the new option.
h. Rotate and FD (Format Description) events that originate from the current master are still relay-logged even if that master is in the ignored list; this does not create any termination issue.
i. The list of ignored servers is kept sorted so that the filtering algorithm runs as fast as possible.

Two new lines are added to the SHOW SLAVE STATUS output: the list of ignored servers and the current master server id (yet another feature for the user!).

Use cases for this feature can be found on the bug report page.
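Properties b, c and i can be illustrated with a small Python sketch. It is a hedged model only: the class `IgnoreList`, its methods, and `relay_log_filter` are hypothetical names, the binary search is one plausible way to exploit a sorted list rather than the patch's actual code, and properties a and h (the error case and the Rotate/FD exception) are not modeled.

```python
from bisect import bisect_left
from typing import Iterable, List, Optional


class IgnoreList:
    """Hypothetical model of the ignored-server-id list kept by a slave."""

    def __init__(self) -> None:
        self._ids: List[int] = []

    def change_master(self, ignore_server_ids: Optional[Iterable[int]] = None) -> None:
        # None models CHANGE MASTER without IGNORE_SERVER_IDS:
        # the existing list is preserved (property c).
        if ignore_server_ids is None:
            return
        # A given list replaces the old one, and an empty list clears it
        # (property b). The ids are stored sorted (property i).
        self._ids = sorted(set(ignore_server_ids))

    def is_ignored(self, server_id: int) -> bool:
        # Binary search over the sorted list.
        i = bisect_left(self._ids, server_id)
        return i < len(self._ids) and self._ids[i] == server_id


def relay_log_filter(origin_ids: Iterable[int], ignore: IgnoreList) -> List[int]:
    """Return the origin ids of events the I/O thread would keep.

    Filtering happens before the relay log is written, so the SQL
    thread never sees the dropped events (property g).
    """
    return [sid for sid in origin_ids if not ignore.is_ignored(sid)]


ignore = IgnoreList()
ignore.change_master([3, 2, 2])            # set the list; duplicates collapse
kept_after_set = relay_log_filter([1, 2, 3, 4], ignore)

ignore.change_master()                     # no option given: list preserved
kept_after_plain = relay_log_filter([2, 5], ignore)

ignore.change_master([])                   # empty list: filter cleared
kept_after_reset = relay_log_filter([2, 3], ignore)
```

Here `kept_after_set` is `[1, 4]`, `kept_after_plain` is `[5]`, and `kept_after_reset` is `[2, 3]`. Filtering in the I/O path rather than in the SQL thread is why the relay log must already be clean before the option is relied on, as property g notes.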
[17 Jul 2008 19:12]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/49968

2673 Andrei Elkin 2008-07-17
Bug #25998 problems about circle replication
Bug #27808 Infinite looping in circular replication
[18 Jul 2008 7:33]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/50006

2673 Andrei Elkin 2008-07-17
Bug #25998 problems about circle replication
Bug #27808 Infinite looping in circular replication
[22 Oct 2008 6:42]
Lars Thalmann
Re-opening this bug. A bug should only be set to "duplicate" if there is a reference to the bug it is a duplicate of.
[22 Oct 2008 6:44]
Lars Thalmann
Duplicate of BUG#25998.
[30 Jan 2009 13:27]
Bugs System
Pushed into 6.0.10-alpha (revid:luis.soares@sun.com-20090129165607-wiskabxm948yx463) (version source revid:luis.soares@sun.com-20090129163120-e2ntks4wgpqde6zt) (merge vers: 6.0.10-alpha) (pib:6)