Bug #13861 START SLAVE UNTIL may stop 1 evnt too late if log-slave-updates and circul repl
Submitted: 8 Oct 2005 13:21 Modified: 29 Mar 19:10
Reporter: Guilhem Bichot
Status: Closed
Category: Server: Replication  Severity: S3 (Non-critical)
Version: 4.1, 5.0  OS: Linux (linux)
Assigned to: Sergey Vojtovich  Target Version: 5.1.24
Triage: D2 (Serious)

[8 Oct 2005 13:21] Guilhem Bichot
Description:
This happens only if a MySQL replication slave runs with --log-slave-updates and is part
of a circular replication setup. Call that slave S and its master M.
If S is asked to START SLAVE UNTIL MASTER_LOG_POS=x, where x is the position of an event
in M's binlog that originated from S, then S stops one event too late. This is because
the slave I/O thread ignores the event originating from S, so that event is not present
in the relay log.

How to repeat:
Set up circular replication M<->S, with --log-slave-updates on both.
Do STOP SLAVE on S.
Issue one update on S and then one update on M. In M's binlog you should see the updates
in this order.
Then do
START SLAVE UNTIL MASTER_LOG_FILE=..., MASTER_LOG_POS=x;
on S, where x is the end position of S's update in M's binlog.
Check that when S's SQL thread stops, M's update has already been executed.
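
The scenario above can be modelled outside the server. This is a minimal sketch in Python and not the server code: the Event class, the server ids, the positions 100/200/300 and the function names are all made up for illustration. It assumes the SQL thread only looks at the end position of the last executed event before picking up the next one.

```python
from dataclasses import dataclass

M_ID, S_ID = 1, 2  # hypothetical server ids for master M and slave S


@dataclass
class Event:
    server_id: int  # server the event originated from
    end_pos: int    # end position of the event in M's binlog
    length: int     # event length in bytes


def relay_log(binlog, own_id):
    # Model of the slave I/O thread: events that originated from
    # the slave itself are ignored and never reach the relay log.
    return [e for e in binlog if e.server_id != own_id]


def run_until_old(relay, start_pos, until_pos):
    # Model of the check as described in this report: compare the end
    # position of the previously executed event with the UNTIL position.
    pos = start_pos
    executed = []
    for ev in relay:
        if pos >= until_pos:
            break
        executed.append(ev)
        pos = ev.end_pos
    return executed


# M's binlog: first S's update (100..200), then M's update (200..300).
binlog = [Event(S_ID, 200, 100), Event(M_ID, 300, 100)]
relay = relay_log(binlog, S_ID)

# UNTIL MASTER_LOG_POS=200, i.e. the end of S's update.
executed = run_until_old(relay, start_pos=100, until_pos=200)
```

In this model the last known position stays at 100 (the start of the skipped event), so the check passes and M's update is executed even though it starts at the UNTIL position.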
[16 Feb 14:04] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/42424

ChangeSet@1.2575, 2008-02-16 17:02:34+04:00, svoj@mysql.com +5 -0
  BUG#13861 - START SLAVE UNTIL may stop 1 evnt too late if
              log-slave-updates and circul repl
  
  Slave SQL thread may execute one extra event when there are events
  skipped by slave I/O thread (e.g. originated by the same server),
  even though the UNTIL condition requested that it not do so.
  
  This happens because we check the end position of the previously
  executed event. This is fine when no events were skipped by the slave
  I/O thread, as the end position of the previous event equals the start
  position of the event to be executed. Otherwise this position equals
  the start position of the skipped event.
  
  This is fixed by:
  - reading the event to be executed before checking whether the UNTIL
    condition is satisfied.
  - checking the start position of the event to be executed. As we do not
    store the start position anywhere, it is calculated by subtracting
    the event length from the end position.
  - if there are no events on the event queue that meet the UNTIL
    condition, we stop immediately, as in this case we do not want to
    wait for the next event.
[22 Feb 16:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/42840

ChangeSet@1.2575, 2008-02-22 19:07:07+04:00, svoj@mysql.com +6 -0
  BUG#13861 - START SLAVE UNTIL may stop 1 evnt too late if
              log-slave-updates and circul repl
  
  Slave SQL thread may execute one extra event when there are events
  skipped by slave I/O thread (e.g. originated by the same server),
  even though the UNTIL condition requested that it not do so.
  
  This happens because we compare with the end position of the previously
  executed event. This is fine when no events were skipped by the slave
  I/O thread, as the end position of the previous event equals the start
  position of the event to be executed. Otherwise this position equals
  the start position of the skipped event.
  
  This is fixed by:
  - reading the event to be executed before checking whether the UNTIL
    condition is satisfied.
  - comparing against the start position of the event to be executed.
    Since the start position is not available, we compute it by
    subtracting the event length from the end position (which is
    available).
  - if, at the time the slave SQL thread starts, there are no events on
    the event queue that meet the UNTIL condition, we stop immediately,
    as in this case we do not want to wait for the next event.
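
The first two points of the fix can be illustrated with a small model. This is a sketch in Python under stated assumptions, not the patch itself: the Event class, the positions and the function name run_until_fixed are hypothetical, and waiting for new events (the third point) is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Event:
    server_id: int  # server the event originated from
    end_pos: int    # end position of the event in the master's binlog
    length: int     # event length in bytes


def run_until_fixed(relay, until_pos):
    # Read the next event first, then test its start position,
    # computed as end position minus event length.
    executed = []
    for ev in relay:
        start_pos = ev.end_pos - ev.length
        if start_pos >= until_pos:
            break  # UNTIL condition met: do not execute this event
        executed.append(ev)
    return executed


# Relay log after the I/O thread skipped an event spanning 100..200:
# only the master's own update (200..300) is present.
with_skip = run_until_fixed([Event(1, 300, 100)], until_pos=200)

# Relay log with no skipped events: 100..200 followed by 200..300.
no_skip = run_until_fixed(
    [Event(1, 200, 100), Event(1, 300, 100)], until_pos=200
)
```

With the skipped event, nothing is executed because the only queued event starts at the UNTIL position. Without skipped events the behaviour is unchanged: the first event runs and the second does not.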
[27 Feb 16:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/43069

ChangeSet@1.2531, 2008-02-27 19:24:00+04:00, svoj@mysql.com +6 -0
  BUG#13861 - START SLAVE UNTIL may stop 1 evnt too late if
              log-slave-updates and circul repl
  
  After merge fixes.
[27 Feb 18:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/43093

ChangeSet@1.2531, 2008-02-27 21:46:06+04:00, svoj@mysql.com +6 -0
  BUG#13861 - START SLAVE UNTIL may stop 1 evnt too late if
              log-slave-updates and circul repl
  
  After merge fixes.
[28 Feb 12:48] Andrei Elkin
Approved after seeing the patch for 5.1.
[14 Mar 14:33] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/44000

ChangeSet@1.2595, 2008-03-14 17:17:03+04:00, svoj@mysql.com +2 -0
  BUG#13861 - START SLAVE UNTIL may stop 1 evnt too late if
              log-slave-updates and circul repl
  
  This is a test case fix for BUG#13861.
[27 Mar 12:21] Bugs System
Pushed into 5.1.24-rc
[27 Mar 12:21] Bugs System
Pushed into 5.0.60
[27 Mar 18:53] Bugs System
Pushed into 6.0.5-alpha
[29 Mar 19:10] Jon Stephens
Documented bugfix in the 5.0.60, 5.1.24, and 6.0.5 changelogs as follows:

        START SLAVE UNTIL MASTER_LOG_POS=*position* issued on a slave that was
        using --log-slave-updates and that was involved in circular replication
        would cause the slave to run and stop one event later than that
        specified by the value of *position*.
[29 Mar 20:40] Jon Stephens
Also documented in changelog for 5.1.23-ndb-6.3.11.