Bug #22047 Time in SHOW PROCESSLIST for SQL thread in replication seems to become negative
Submitted: 6 Sep 2006 9:09 Modified: 14 Apr 2009 16:21
Reporter: Oli Sennhauser Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:4.1.20, 5.0 BK OS:Linux (linux)
Assigned to: Georgi Kodinov
Triage: Triaged: D4 (Minor)

[6 Sep 2006 9:09] Oli Sennhauser
Description:
Time in SHOW PROCESSLIST for SQL thread in replication seems to become negative

How to repeat:
mysql> show processlist;

+--------+-------------+---------------------------+--------+---------+------------+-----------------------------------------------------------------------+------------------+
| Id     | User        | Host                      | db     | Command | Time       | State                                                                 | Info             |
+--------+-------------+---------------------------+--------+---------+------------+-----------------------------------------------------------------------+------------------+
|      2 | system user |                           | NULL   | Connect |     723995 | Waiting for master to send event                                      | NULL             |
|      3 | system user |                           | NULL   | Connect | 4294967184 | Has read all relay log; waiting for the slave I/O thread to update it | NULL             |
...

This looks like HEX FFFFFF90 or DEC -112

mysql> show status like 'uptime%';
+---------------+--------+
| Variable_name | Value  |
+---------------+--------+
| Uptime        | 723790 |
+---------------+--------+

Suggested fix:
no idea

Seems similar to Bug #5676: http://bugs.mysql.com/bug.php?id=5676

We have simple master slave replication with high load. The only system which is NOT lagging ist the one with the symptoms described in this bug!!! All other masters were lagging some seconds (approx. 70)
[7 Sep 2006 6:32] Sveta Smirnova
Hi Oli,

our manual at http://dev.mysql.com/doc/refman/5.0/en/show-processlist.html says
"#Time

The time in seconds between the start of the statement or command and now."

What is the correlation between uptime and time field in output of SHOW PROCESSLIST?
[7 Sep 2006 8:05] Oli Sennhauser
Hi Sveta

I just wanted to show that the time cannot be more than uptime 723790 s because system is not up more than this amount of seconds... So I guess that no query can run longer than uptime. We have a problem in this area (replication und high load lags and does not recover by itself) and I hope it is somehow related to this...

Regards Oli
[7 Sep 2006 8:53] Sveta Smirnova
Oli, do you issue SHOW PROCESSLIST on the slave?

I so, did you read part of the http://dev.mysql.com/doc/refman/5.0/en/replication-faq.html Jorge pointed:

----<start quote>----
 A: You can read the Seconds_Behind_Master column in SHOW SLAVE STATUS. See Section 6.3, “Replication Implementation Details”.

When the slave SQL thread executes an event read from the master, it modifies its own time to the event timestamp. (This is why TIMESTAMP is well replicated.) In the Time column in the output of SHOW PROCESSLIST, the number of seconds displayed for the slave SQL thread is the number of seconds between the timestamp of the last replicated event and the real time of the slave machine. You can use this to determine the date of the last replicated event. Note that if your slave has been disconnected from the master for one hour, and then reconnects, you may immediately see Time values like 3600 for the slave SQL thread in SHOW PROCESSLIST. This is because the slave is executing statements that are one hour old. 
----<end quote>----

Your case is other? Which value of Seconds_Behind_Master?
[11 Sep 2006 7:13] Oli Sennhauser
Hi Sveta

As you can see from the posted output, the SQL thread is 4 Mia seconds behind the master. This cannot be because the system is running only 724k seconds:

|      2 | system user |                           | NULL   | Connect |    
723995 | Waiting for master to send event                                      |
NULL             |
|      3 | system user |                           | NULL   | Connect |
4294967184 | Has read all relay log; waiting for the slave I/O thread to update
it | NULL             |

Because 4.2 Mia is very close to the max of an int (4 bytes). I assume that we have/had some troubles with a int overflow or negative values

4294967184 looks like HEX FFFFFF90 or DEC -112

In the article you mentioned it is written, that the SQL thread can have a lag of some seconds (e.g. 3600) but no NEGATIVE lag (this would mean that the Slave is IN FRONT of the master which technically is not possible.

Regards Oli
[13 Sep 2006 6:54] Sveta Smirnova
Slave and master are running on the same operation system?
[13 Sep 2006 8:46] Oli Sennhauser
Yes, 64-bit RHEL
[13 Sep 2006 9:16] Sveta Smirnova
Verified as described: "Waiting for master to send event" > uptime on Windows (master) and Solaris (slave). System clocks on these 2 computers are unsynchronized.

To repeat start replication before midnight and issue show processlist;, show status like 'uptime%'; statements after midnight. To see the problem you should wait about 3-4 hours.
[13 Sep 2006 9:20] Oli Sennhauser
I will check at Cust. if they have ntp or similar...
[13 Sep 2006 11:24] Oli Sennhauser
Customer replied: "All servers are synchronized using ntp every 5 minutes."
[4 Jul 2007 9:17] Sveta Smirnova
Bug #29538 was marked as duplicate of this one.
[6 Mar 2009 15:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/68517

2829 Georgi Kodinov	2009-03-06
      Bug #22047: Time in SHOW PROCESSLIST for SQL thread in replication seems to 
      become negative
            
      THD::start_time has a dual meaning : it's either the time since the process
      entered a given state or is the transaction time returned by e.g. NOW().
      This causes problems, as sometimes THD::start_time may be set to a value
      that is correct and needed when used as a base for NOW(), but these times
      may be arbitrary (SET @@timestamp) or non-local (coming from the master 
      through the replication feed).
      If one such non-local time is set there's no way to return a correct value
      for e.g. SHOW PROCESSLIST or SELECT ... FROM INFORMATION_SCHEMA.PROCESSLIST.
      Fixed by introducing a local equivalent of start_time (local_start_time) 
      and using the new member exclusively for timing and reporting execution 
      intervals.
            
      Note that no reliable test suite can be constructed, since it would require
      knowing the local time and can't be achieved by the means of the current test
      suite.
     @ sql/sql_class.cc
        Bug #22047: introduced the local time for measuring time intervals
     @ sql/sql_class.h
        Bug #22047: introduced the local time for measuring time intervals
     @ sql/sql_insert.cc
        Bug #22047: set local time for the delayed insert thread
     @ sql/sql_show.cc
        Bug #22047: use local time in I_S.PROCESSLIST
[12 Mar 2009 14:19] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/69041

3116 Georgi Kodinov	2009-03-12
      Bug #22047: Time in SHOW PROCESSLIST for SQL thread in replication seems to 
            become negative
      
      THD::start_time has a dual meaning : it's either the time since the process
      entered a given state or is the transaction time returned by e.g. NOW().
      This causes problems, as sometimes THD::start_time may be set to a value
      that is correct and needed when used as a base for NOW(), but these times
      may be arbitrary (SET @@timestamp) or non-local (coming from the master 
      through the replication feed).
      If one such non-local time is set there's no way to return a correct value
      for e.g. SHOW PROCESSLIST or SELECT ... FROM INFORMATION_SCHEMA.PROCESSLIST.
      Fixed by making the Time column in SHOW PROCESSLIST SIGNED LONG instead of 
      UNSIGNED LONG and doing the correct conversions.
      
      Note that no reliable test suite can be constructed, since it would require
      knowing the local time and can't be achieved by the means of the current test
      suite.
     @ sql/sql_show.cc
        Bug #22047: make the Time column SIGNED LONG from UNSIGNED LONG
[13 Mar 2009 13:52] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/69145

2778 Georgi Kodinov	2009-03-13
      Bug #22047 : Time in SHOW PROCESSLIST for SQL thread in replication 
      seems to become negative
      
      THD::start_time has a dual meaning : it's either the time since the process
      entered a given state or is the transaction time returned by e.g. NOW().
      This causes problems, as sometimes THD::start_time may be set to a value
      that is correct and needed when used as a base for NOW(), but these times
      may be arbitrary (SET @@timestamp) or non-local (coming from the master 
      through the replication feed).
      If one such non-local time is set there's no way to return a correct value
      for e.g. SHOW PROCESSLIST or SELECT ... FROM INFORMATION_SCHEMA.PROCESSLIST.
      Fixed by making the Time column in SHOW PROCESSLIST SIGNED LONG instead of 
      UNSIGNED LONG and doing the correct conversions.
            
      Note that no reliable test suite can be constructed, since it would require
      knowing the local time and can't be achieved by the means of the current test
      suite.
     @ sql/sql_show.cc
        Bug #22047: make the Time in SHOW PROCESSLIST LONG from 
        LONG UNSIGNED
[13 Mar 2009 13:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/69146

2774 Georgi Kodinov	2009-03-13
      Bug #22047 : Time in SHOW PROCESSLIST for SQL thread in replication 
      seems to become negative
      
      THD::start_time has a dual meaning : it's either the time since the process
      entered a given state or is the transaction time returned by e.g. NOW().
      This causes problems, as sometimes THD::start_time may be set to a value
      that is correct and needed when used as a base for NOW(), but these times
      may be arbitrary (SET @@timestamp) or non-local (coming from the master 
      through the replication feed).
      If one such non-local time is set there's no way to return a correct value
      for e.g. SHOW PROCESSLIST or SELECT ... FROM INFORMATION_SCHEMA.PROCESSLIST.
      Fixed by making the Time column in SHOW PROCESSLIST SIGNED LONG instead of 
      UNSIGNED LONG and doing the correct conversions.
            
      Note that no reliable test suite can be constructed, since it would require
      knowing the local time and can't be achieved by the means of the current test
      suite.
     @ sql/sql_show.cc
        Bug #22047: make the Time in SHOW PROCESSLIST LONG from 
        LONG UNSIGNED
[13 Mar 2009 14:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/69159

2839 Georgi Kodinov	2009-03-13 [merge]
      Bug #22047: Time in SHOW PROCESSLIST for SQL thread in replication seems to 
      become negative
      
      - merged the fix to 5.1
      - extended to cover I_S.PROCESSLIST.TIME
      - Changed the column type of I_S.PROCESSLIST.TIME from LOGNLONG UNSIGNED
        to LONG (to match the SHOW PROCESSLIST type)
      - Added a test case
[27 Mar 2009 14:32] Bugs System
Pushed into 5.0.80 (revid:joro@sun.com-20090327142516-55gumdxj39z6eijj) (version source revid:leonard@mysql.com-20090324072904-7w2lxdxzw8hx1rnm) (merge vers: 5.0.80) (pib:6)
[27 Mar 2009 14:57] Bugs System
Pushed into 5.1.34 (revid:joro@sun.com-20090327143448-wuuuycetc562ty6o) (version source revid:leonard@mysql.com-20090316090622-sr8lylqvsl1jrcnv) (merge vers: 5.1.34) (pib:6)
[13 Apr 2009 9:21] Bugs System
Pushed into 6.0.11-alpha (revid:alik@sun.com-20090413084402-snnrocwzktcl88ny) (version source revid:joro@sun.com-20090317155017-yldq76zghpxoln3p) (merge vers: 6.0.11-alpha) (pib:6)
[14 Apr 2009 16:21] Paul Dubois
Noted in 5.0.80, 5.1.34, 6.0.11 changelogs.

The Time column for SHOW PROCESSLIST output and the value of the TIME
column of the INFORMATION_SCHEMA.PROCESSLIST table now can have
negative values. Previously, the column was unsigned and negative
values were displayed incorrectly as large positive values. Negative
values can occur if a thread alters the time into the future with SET
TIMESTAMP = value or the thread is executing on a slave and
processing events from a master that has its clock set ahead of the
slave.
[9 May 2009 16:44] Bugs System
Pushed into 5.1.34-ndb-6.2.18 (revid:jonas@mysql.com-20090508185236-p9b3as7qyauybefl) (version source revid:jonas@mysql.com-20090508185236-p9b3as7qyauybefl) (merge vers: 5.1.34-ndb-6.2.18) (pib:6)
[9 May 2009 17:41] Bugs System
Pushed into 5.1.34-ndb-6.3.25 (revid:jonas@mysql.com-20090509063138-1u3q3v09wnn2txyt) (version source revid:jonas@mysql.com-20090509063138-1u3q3v09wnn2txyt) (merge vers: 5.1.34-ndb-6.3.25) (pib:6)
[9 May 2009 18:39] Bugs System
Pushed into 5.1.34-ndb-7.0.6 (revid:jonas@mysql.com-20090509154927-im9a7g846c6u1hzc) (version source revid:jonas@mysql.com-20090509154927-im9a7g846c6u1hzc) (merge vers: 5.1.34-ndb-7.0.6) (pib:6)
[25 Jan 2013 14:35] David Hall
I have a production slave server that has a negative number in Time.  
 version  5.5.16-log on RedHat

Processlist:
  2 system us -19 Connec Slave has read all relay log; waiting for the slave I/O thread to update it

The OS system times match on master and slave.  I checked now()/set timestamp on both mysql instances and they are both the same number.

Everything seems ok but I just want to make sure there's no trouble down the road.