Bug #52166 Seconds_Behind_Master spikes after long idle period
Submitted: 18 Mar 2010 7:35 Modified: 15 Nov 2010 14:14
Reporter: Mikiya Okuno Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.1,5.5 OS:Any
Assigned to: Luis Soares CPU Architecture:Any

[18 Mar 2010 7:35] Mikiya Okuno
Description:
Seconds_Behind_Master in SHOW SLAVE STATUS command is calculated like below:

      long time_diff= ((long)(time(0) - mi->rli.last_master_timestamp)
                       - mi->clock_diff_with_master);

Unless last_master_timestamp is zero, time_diff is displayed as Seconds_Behind_Master. The problem is last_master_timestamp is a timestamp for the event which has been executed previously. last_master_timestamp value is preserved even after long idle period when there is no updates on the tables.

This results in a sudden spike, a huge value in other words, on Seconds_Behind_Master if SHOW SLAVE STATUS command is executed when a very first event is executed after long idle period.

How to repeat:
Setup a replication with SBR.

Create a test table on the master like below:

mysql> CREATE TABLE t (a int);

Don't update the master for long time.

Insert a row like below:

mysql> INSERT INTO t VALUES(SLEEP(10));

Observe SHOW SLAVE STATUS output at the very moment when the statement is executed on the slave. You can see a large Seconds_Behind_Master value.

Suggested fix:
On sql/slave.cc, initialize rli->last_master_timestamp with time(0) instead of save_timestamp value.

From:
        rli->last_master_timestamp= save_timestamp;
To:
        rli->last_master_timestamp= time(0);

While waiting for a new event, rli->last_master_timestamp is set to zero. Right after get a new event, (time(0) - rli->last_master_timestamp) should be zero as well.

Due to this change, we can remove save_timestamp variable.
[18 Mar 2010 8:44] Andrei Elkin
Confirming logics of  next_event() indeed allows SSS to display a 
Seconds_behind as
time(0) - the-last-processed-event-last_master_timestamp for duration of
processing a long query.
So if in between of the reported

   INSERT INTO t VALUES(SLEEP(10))

current event and an event right before it there was 100 seconds,
the 100 seconds of Seconds_behind would be displayed for 10 seconds of
processing of the INSERT.

Resetting last_master_timestamp at start of reading from relay log
was a reasonable technique in past when master could not send heartbeats.
With sending a HB event master ensures that it does not have any real 
event so that only when slave receives the HB it can rightfully reset last_master_timestamp.
[19 Mar 2010 4:36] MySQL Verification Team
a patch for mysql 5.1

Attachment: bug52166.patch (application/octet-stream, text), 689 bytes.

[15 Apr 2010 16:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/105760

3183 Luis Soares	2010-04-15
      BUG#52166: Seconds_Behind_Master spikes after long idle period
      
      Introduction: seconds behind master is calculated by checking the
      time difference between the current time on the slave (time(0))
      and the creation timestamp, at the master, of the last executed
      event by the SQL thread. The last event timestamp is saved by the
      SQL thread in rli->last_master_timestamp. Different timezones for
      master and slave are taken into account when showing this
      delta. The rli->last_master_timestamp is updated when the SQL
      thread finishes the event execution at the slave.
      
      Problem: when the SQL thread has no events to replay in the relay
      log, it waits. Right before waiting, it saves the last
      event (lets call it e1) timestamp (rli->last_master_timestamp)
      set at the master (in save_last_master_timestamp), and sets
      rli->last_master_timestamp to zero. This makes 'SHOW SLAVE
      STATUS' show zero while the SQL thread is waiting. However, when
      the SQL thread awakes, most likely because of the fact that a new
      event was enqueued in the relay log, it restores the
      save_last_master_timestamp to rli->last_master_timestamp. Then it
      goes on to process the new event (lets name it e2). Given this
      and the fact that the SQL thread only updates the
      rli->last_master_timestamp at the end of execution, there is a
      time window that the user can notice a sudden spike on
      Seconds_Behind_Master, especially if e2 arrives a long time after
      e1 finished. The spike is actually enforced - value increases -
      during e2 execution time, which means that if e2 execution
      latency increases, then the likelihood that the user sees this
      also increases.
      
      Solution: We fix this by not saving and restoring the
      last_master_timestamp when the SQL thread goes to sleep. Instead,
      we set the rli->last_master_timestamp to zero, when the SQL
      thread starts waiting, and keep it to zero, when the SQL thread
      awakes. A complementary change in this patch, is the one that
      sets rli->last_master_timestamp before and not after event
      execution. Both of these changes make the spike go away, and keep
      showing zero, until the event starts being processed, as opposed
      to showing zero until the event finishes its execution.
     @ mysql-test/suite/manual/t/rpl_replication_delay.test
        Removed this unmaintained test case.
     @ mysql-test/suite/rpl/t/seconds_behind_master.test
        Added some new tests. Collected remains from the removed 
        rpl_replication_delay, refactored and added in this file as well.
     @ sql/log_event.cc
        Reworked the fault injection to meet the new test case needs.
        Refactored the stmt_done interface (does not need 'when' anymore.
     @ sql/log_event_old.cc
        Accomodating changes in the stmt_done interface.
     @ sql/rpl_rli.cc
        Updated stmt_done implementation.
     @ sql/rpl_rli.h
        Changed stmt_done interface, removing the not needed 
        event_creation_time.
     @ sql/slave.cc
        Two changes: 
          (i) removed the save and restore of rli->last_master_timestamp, 
              when the SQL thread goes to sleeps. 
         (ii) Updating the rli->last_master_timestamp at the beginning of
              the event execution.
[30 Apr 2010 11:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107068

3184 Luis Soares	2010-04-30
      BUG#52166: Seconds_Behind_Master spikes after long idle period
      
      Addressing review comments:
        - Added comment to the assertion introduced in log_event.cc
        - Taking into consideration exec_time when setting
          rli->last_master_timestamp
        - Improved test case comemnts
        - Better time boundaries on the time based assertions
        - Improved robustness of the test case
        - Added source include/show_rpl_debug_info.inc when test dies
          on purpose (failed assertion).
        - Minor test simplifications.
     @ sql/log_event.cc
        Sets exec_time(0) in Log_event::Log_event constructor. Events
        that later will not set it, when deserializing, will have it set
        to zero.
[4 May 2010 10:36] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107286

3185 Luis Soares	2010-05-04
      BUG#52166: Seconds_Behind_Master spikes after long idle period
      
      Addressing review comments:
        - renamed the test case: 
          seconds_behind_master -> rpl_seconds_behind_master;
        - more improvements on the comments;
        - removed sleep from the last insert which is no longer needed.
      
      Also, changed remark about the test requiring statement based
      replication.
[30 May 2010 23:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109564

3185 Luis Soares	2010-05-31 [merge]
      BUG#52166: Seconds_Behind_Master spikes after long idle period
      
      Automerged original patch and incremental patches from bug 
      report into latest mysql-5.1-rep+2.
[30 May 2010 23:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109565

3141 Luis Soares	2010-05-31 [merge]
      BUG#52166: Seconds_Behind_Master spikes after long idle period
      
      Automerge from mysql-5.1-rep+2 into mysql-5.1-rep+3.
[30 May 2010 23:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109566

3142 Luis Soares	2010-05-31 [merge]
      BUG#52166: Seconds_Behind_Master spikes after long idle period
      
      Automerge from mysql-5.1-rep+2 into mysql-5.1-rep+3.
[30 May 2010 23:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109567

2981 Luis Soares	2010-05-31 [merge]
      BUG#52166: Seconds_Behind_Master spikes after long idle period
        
      Manual merge from mysql-5.1-rep+2 into mysql-next-mr-rpl-merge.
      
      Conflicts
      =========
        Text conflict in sql/slave.cc
[3 Jun 2010 23:32] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/110178

3187 Luis Soares	2010-06-04
      BUG#52166: Seconds_Behind_Master spikes after long idle period
      
      Post-push fix.
      
      Fixing valgrind warnings: clock_diff_with_master was not 
      initialized (and this was even the case before patch for BUG#51266).
      
      We fix this by setting clock_diff_with_master to zero on Master_info
      contstructor.
[26 Jun 2010 7:40] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100626073921-t0e6q2c9nkhuemnj) (version source revid:alik@sun.com-20100626073921-t0e6q2c9nkhuemnj) (pib:16)
[4 Aug 2010 8:08] Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alik@ibmvm-20100804080001-bny5271e65xo34ig) (version source revid:alik@sun.com-20100626073921-t0e6q2c9nkhuemnj) (merge vers: 5.6.99-m4) (pib:18)
[4 Aug 2010 8:24] Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alik@ibmvm-20100804081533-c1d3rbipo9e8rt1s) (version source revid:alik@sun.com-20100626073921-t0e6q2c9nkhuemnj) (merge vers: 5.6.99-m4) (pib:18)
[6 Aug 2010 14:27] Jon Stephens
Does not appear in any 5.6 release, thus no 5.6 changelog entry required.

Set NM status, waiting for merges to 5.1 and 5.5.
[16 Aug 2010 6:39] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[2 Sep 2010 11:35] Luis Soares
See also: BUG#56363 and BUG#53167.
[3 Sep 2010 14:29] Jon Stephens
Discussed with Luís on IRC -- no new merges expected, closing.
[13 Nov 2010 16:26] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[15 Nov 2010 12:55] Jon Stephens
Set back to Closed. See previous comments.
[14 Jan 2011 23:48] James Day
There are no plans to fix this in 5.1. If practical and if this bothers you, we suggest that you consider upgrading to MySQL 5.5.

An alternative option with 5.1 is to have some process that does periodic updates to a row in a table that's used just to give the timestamp of the last change on the master. Those updates will cause replication to notice the change and you can check the row you're updating also if you want to. Events are one possible way to arrange this.