Bug #2826 Seconds behind master weirdness
Submitted: 16 Feb 2004 8:28 Modified: 10 Jan 2005 9:46
Reporter: Donny Simonton Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:4.1.1 OS:Linux (Linux)
Assigned to: Guilhem Bichot CPU Architecture:Any

[16 Feb 2004 8:28] Donny Simonton
Description:
I've noticed this a few times, so I just thought I would pass it along.  Sometimes when I do a show slave status \G; from the mysql command line, I get stuff like this:

mysql> show slave status \G;
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: 192.168.10.195
                Master_User: slaveuser
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: parking-master-bin.000083
        Read_Master_Log_Pos: 536165643
             Relay_Log_File: parking-slave-relay-bin.000060
              Relay_Log_Pos: 536166001
      Relay_Master_Log_File: parking-master-bin.000083
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB: 
        Replicate_Ignore_DB: 
         Replicate_Do_Table: 
     Replicate_Ignore_Table: 
    Replicate_Wild_Do_Table: 
Replicate_Wild_Ignore_Table: 
                 Last_Errno: 0
                 Last_Error: 
               Skip_Counter: 0
        Exec_Master_Log_Pos: 536165643
            Relay_Log_Space: 536166001
            Until_Condition: None
             Until_Log_File: 
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File: 
         Master_SSL_CA_Path: 
            Master_SSL_Cert: 
          Master_SSL_Cipher: 
             Master_SSL_Key: 
      Seconds_Behind_Master: 18446744073709551615
1 row in set (0.00 sec)

ERROR: 
No query specified

As you can see the Seconds behind master is Seconds_Behind_Master: 18446744073709551615.  But then I run it again a second later and it's fine like below.

mysql> show slave status \G;
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: 192.168.10.195
                Master_User: slaveuser
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: parking-master-bin.000083
        Read_Master_Log_Pos: 536177839
             Relay_Log_File: parking-slave-relay-bin.000060
              Relay_Log_Pos: 536172177
      Relay_Master_Log_File: parking-master-bin.000083
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB: 
        Replicate_Ignore_DB: 
         Replicate_Do_Table: 
     Replicate_Ignore_Table: 
    Replicate_Wild_Do_Table: 
Replicate_Wild_Ignore_Table: 
                 Last_Errno: 0
                 Last_Error: 
               Skip_Counter: 0
        Exec_Master_Log_Pos: 536171819
            Relay_Log_Space: 536178197
            Until_Condition: None
             Until_Log_File: 
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File: 
         Master_SSL_CA_Path: 
            Master_SSL_Cert: 
          Master_SSL_Cipher: 
             Master_SSL_Key: 
      Seconds_Behind_Master: 0
1 row in set (0.00 sec)

ERROR: 
No query specified

How to repeat:
Run show slave status \G a few times.
[16 Feb 2004 9:02] Guilhem Bichot
Hi!

Thanks for reporting this, it looks indeed like a solid confirmation of suspicions we already had (possibly you may also experience strange values in the 'Time' column in 'SHOW PROCESSLIST' for the slave SQL thread).
I have to investigate this (once more... <sigh>).
[1 Mar 2004 7:42] Guilhem Bichot
Hi,
After checking code again, I could not find any problem. Here's how Seconds_behind_master is computed.
* When the slave connects to the master (when the slave I/O thread connects to the master), it does a= SELECT UNIX_TIMESTAMP() on master, and b= SELECT UNIX_TIMESTAMP() on slave, and computes c=b-a.
* Every statement executed on the master is stored on the master's binary log together with the timestamp when it was executed.
* So when the slave SQL thread reads a statement from the binary log, it executes it and then it does:
seconds_behind_master= SELECT UNIX_TIMESTAMP() on slave
                       - timestamp of the master's binary log event
                       + c
The 18446744073709551615 you got means that the above computation resulted in -1 ( (ulonglong)(-1) = 18....1615).
Assuming c=0 (just to simplify, changes nothing to the reasoning), and assuming the master's and slave's clock stay perfectly synchronized, then seconds_behind_master can't be -1: the moment when the statement is executed on master is before the moment when it is replicated on the slave.
How are the clocks of your 2 machines synchronized ?
[1 Mar 2004 7:45] Guilhem Bichot
correction: I meant
seconds_behind_master= SELECT UNIX_TIMESTAMP() on slave
                       - timestamp of the master's binary log event
                       - c
[2 Mar 2004 7:26] Guilhem Bichot
Hi!
My list of questions:
- How are the clocks of your 2 machines synchronized ?
- Is the master itself a slave of another MySQL server ?
- Does somebody explicitely set the value of TIMESTAMP on the master (using:
SET TIMESTAMP=... before issuing statements containing NOW(), UNIX_TIMESTAMP() etc) ?
[2 Mar 2004 7:40] Donny Simonton
- How are the clocks of your 2 machines synchronized ? We have an internal timeserver that updates the time on all of our servers 4 times a day.  So all of the servers should always be VERY close in time.
- Is the master itself a slave of another MySQL server ?
No.  
- Does somebody explicitely set the value of TIMESTAMP on the master (using:
SET TIMESTAMP=... before issuing statements containing NOW(), UNIX_TIMESTAMP()
etc) ?
No, we don't do set timestamp.

Not sure if this really, helped.  I'd be happy if you just set -1 as the time behind master instead of the crazy number.
[2 Mar 2004 14:08] Guilhem Bichot
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

A possible reason why you get -1 is related to the granularity-to-second of time functions:
assume the master's and slave's time are perfectly synchronized. When the slave connects to the master, it computes the clocks difference (which is exactly 0, but read further): when the master's timestamp is read, assume both clocks are at the very end of second 1, and (a very short time later) when the slave's timestamp is read they are at the very beginning of second 2. Then the rounded value for master is 1 and the rounded value for slave is 2. So the clock difference is said to be 1, which can explain the later -1 in SHOW SLAVE STATUS.
To work around all this, I have changed (almost as you suggested) Seconds_Behind_Master of MySQL 4.1.2 to display:
MAX( 0 , Seconds_Behind_Master_as_it_is_in_4.1.1 )
(I told myself that 0 was even less confusing than -1).
So you should not get weird values anymore.
[15 Dec 2004 23:38] Guilhem Bichot
Hello Donny,
did you get a chance to upgrade your slave to a more recent 4.1 version? If yes do you still have the "Seconds_Behind_Master mega number" weirdness?
Thanks!
[10 Jan 2005 9:46] Guilhem Bichot
Donny confirms that it works with a more recent 4.1.