Bug #22047 | Time in SHOW PROCESSLIST for SQL thread in replication seems to become negative | ||
---|---|---|---|
Submitted: | 6 Sep 2006 9:09 | Modified: | 14 Apr 2009 16:21 |
Reporter: | Oli Sennhauser | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | 4.1.20, 5.0 BK | OS: | Linux (linux) |
Assigned to: | Georgi Kodinov | CPU Architecture: | Any |
[6 Sep 2006 9:09]
Oli Sennhauser
[7 Sep 2006 6:32]
Sveta Smirnova
Hi Oli, our manual at http://dev.mysql.com/doc/refman/5.0/en/show-processlist.html says "#Time The time in seconds between the start of the statement or command and now." What is the correlation between uptime and time field in output of SHOW PROCESSLIST?
[7 Sep 2006 8:05]
Oli Sennhauser
Hi Sveta I just wanted to show that the time cannot be more than uptime 723790 s because system is not up more than this amount of seconds... So I guess that no query can run longer than uptime. We have a problem in this area (replication und high load lags and does not recover by itself) and I hope it is somehow related to this... Regards Oli
[7 Sep 2006 8:53]
Sveta Smirnova
Oli, do you issue SHOW PROCESSLIST on the slave? I so, did you read part of the http://dev.mysql.com/doc/refman/5.0/en/replication-faq.html Jorge pointed: ----<start quote>---- A: You can read the Seconds_Behind_Master column in SHOW SLAVE STATUS. See Section 6.3, “Replication Implementation Details”. When the slave SQL thread executes an event read from the master, it modifies its own time to the event timestamp. (This is why TIMESTAMP is well replicated.) In the Time column in the output of SHOW PROCESSLIST, the number of seconds displayed for the slave SQL thread is the number of seconds between the timestamp of the last replicated event and the real time of the slave machine. You can use this to determine the date of the last replicated event. Note that if your slave has been disconnected from the master for one hour, and then reconnects, you may immediately see Time values like 3600 for the slave SQL thread in SHOW PROCESSLIST. This is because the slave is executing statements that are one hour old. ----<end quote>---- Your case is other? Which value of Seconds_Behind_Master?
[11 Sep 2006 7:13]
Oli Sennhauser
Hi Sveta As you can see from the posted output, the SQL thread is 4 Mia seconds behind the master. This cannot be because the system is running only 724k seconds: | 2 | system user | | NULL | Connect | 723995 | Waiting for master to send event | NULL | | 3 | system user | | NULL | Connect | 4294967184 | Has read all relay log; waiting for the slave I/O thread to update it | NULL | Because 4.2 Mia is very close to the max of an int (4 bytes). I assume that we have/had some troubles with a int overflow or negative values 4294967184 looks like HEX FFFFFF90 or DEC -112 In the article you mentioned it is written, that the SQL thread can have a lag of some seconds (e.g. 3600) but no NEGATIVE lag (this would mean that the Slave is IN FRONT of the master which technically is not possible. Regards Oli
[13 Sep 2006 6:54]
Sveta Smirnova
Slave and master are running on the same operation system?
[13 Sep 2006 8:46]
Oli Sennhauser
Yes, 64-bit RHEL
[13 Sep 2006 9:16]
Sveta Smirnova
Verified as described: "Waiting for master to send event" > uptime on Windows (master) and Solaris (slave). System clocks on these 2 computers are unsynchronized. To repeat start replication before midnight and issue show processlist;, show status like 'uptime%'; statements after midnight. To see the problem you should wait about 3-4 hours.
[13 Sep 2006 9:20]
Oli Sennhauser
I will check at Cust. if they have ntp or similar...
[13 Sep 2006 11:24]
Oli Sennhauser
Customer replied: "All servers are synchronized using ntp every 5 minutes."
[4 Jul 2007 9:17]
Sveta Smirnova
Bug #29538 was marked as duplicate of this one.
[6 Mar 2009 15:31]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/68517 2829 Georgi Kodinov 2009-03-06 Bug #22047: Time in SHOW PROCESSLIST for SQL thread in replication seems to become negative THD::start_time has a dual meaning : it's either the time since the process entered a given state or is the transaction time returned by e.g. NOW(). This causes problems, as sometimes THD::start_time may be set to a value that is correct and needed when used as a base for NOW(), but these times may be arbitrary (SET @@timestamp) or non-local (coming from the master through the replication feed). If one such non-local time is set there's no way to return a correct value for e.g. SHOW PROCESSLIST or SELECT ... FROM INFORMATION_SCHEMA.PROCESSLIST. Fixed by introducing a local equivalent of start_time (local_start_time) and using the new member exclusively for timing and reporting execution intervals. Note that no reliable test suite can be constructed, since it would require knowing the local time and can't be achieved by the means of the current test suite. @ sql/sql_class.cc Bug #22047: introduced the local time for measuring time intervals @ sql/sql_class.h Bug #22047: introduced the local time for measuring time intervals @ sql/sql_insert.cc Bug #22047: set local time for the delayed insert thread @ sql/sql_show.cc Bug #22047: use local time in I_S.PROCESSLIST
[12 Mar 2009 14:19]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/69041 3116 Georgi Kodinov 2009-03-12 Bug #22047: Time in SHOW PROCESSLIST for SQL thread in replication seems to become negative THD::start_time has a dual meaning : it's either the time since the process entered a given state or is the transaction time returned by e.g. NOW(). This causes problems, as sometimes THD::start_time may be set to a value that is correct and needed when used as a base for NOW(), but these times may be arbitrary (SET @@timestamp) or non-local (coming from the master through the replication feed). If one such non-local time is set there's no way to return a correct value for e.g. SHOW PROCESSLIST or SELECT ... FROM INFORMATION_SCHEMA.PROCESSLIST. Fixed by making the Time column in SHOW PROCESSLIST SIGNED LONG instead of UNSIGNED LONG and doing the correct conversions. Note that no reliable test suite can be constructed, since it would require knowing the local time and can't be achieved by the means of the current test suite. @ sql/sql_show.cc Bug #22047: make the Time column SIGNED LONG from UNSIGNED LONG
[13 Mar 2009 13:52]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/69145 2778 Georgi Kodinov 2009-03-13 Bug #22047 : Time in SHOW PROCESSLIST for SQL thread in replication seems to become negative THD::start_time has a dual meaning : it's either the time since the process entered a given state or is the transaction time returned by e.g. NOW(). This causes problems, as sometimes THD::start_time may be set to a value that is correct and needed when used as a base for NOW(), but these times may be arbitrary (SET @@timestamp) or non-local (coming from the master through the replication feed). If one such non-local time is set there's no way to return a correct value for e.g. SHOW PROCESSLIST or SELECT ... FROM INFORMATION_SCHEMA.PROCESSLIST. Fixed by making the Time column in SHOW PROCESSLIST SIGNED LONG instead of UNSIGNED LONG and doing the correct conversions. Note that no reliable test suite can be constructed, since it would require knowing the local time and can't be achieved by the means of the current test suite. @ sql/sql_show.cc Bug #22047: make the Time in SHOW PROCESSLIST LONG from LONG UNSIGNED
[13 Mar 2009 13:53]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/69146 2774 Georgi Kodinov 2009-03-13 Bug #22047 : Time in SHOW PROCESSLIST for SQL thread in replication seems to become negative THD::start_time has a dual meaning : it's either the time since the process entered a given state or is the transaction time returned by e.g. NOW(). This causes problems, as sometimes THD::start_time may be set to a value that is correct and needed when used as a base for NOW(), but these times may be arbitrary (SET @@timestamp) or non-local (coming from the master through the replication feed). If one such non-local time is set there's no way to return a correct value for e.g. SHOW PROCESSLIST or SELECT ... FROM INFORMATION_SCHEMA.PROCESSLIST. Fixed by making the Time column in SHOW PROCESSLIST SIGNED LONG instead of UNSIGNED LONG and doing the correct conversions. Note that no reliable test suite can be constructed, since it would require knowing the local time and can't be achieved by the means of the current test suite. @ sql/sql_show.cc Bug #22047: make the Time in SHOW PROCESSLIST LONG from LONG UNSIGNED
[13 Mar 2009 14:27]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/69159 2839 Georgi Kodinov 2009-03-13 [merge] Bug #22047: Time in SHOW PROCESSLIST for SQL thread in replication seems to become negative - merged the fix to 5.1 - extended to cover I_S.PROCESSLIST.TIME - Changed the column type of I_S.PROCESSLIST.TIME from LOGNLONG UNSIGNED to LONG (to match the SHOW PROCESSLIST type) - Added a test case
[27 Mar 2009 14:32]
Bugs System
Pushed into 5.0.80 (revid:joro@sun.com-20090327142516-55gumdxj39z6eijj) (version source revid:leonard@mysql.com-20090324072904-7w2lxdxzw8hx1rnm) (merge vers: 5.0.80) (pib:6)
[27 Mar 2009 14:57]
Bugs System
Pushed into 5.1.34 (revid:joro@sun.com-20090327143448-wuuuycetc562ty6o) (version source revid:leonard@mysql.com-20090316090622-sr8lylqvsl1jrcnv) (merge vers: 5.1.34) (pib:6)
[13 Apr 2009 9:21]
Bugs System
Pushed into 6.0.11-alpha (revid:alik@sun.com-20090413084402-snnrocwzktcl88ny) (version source revid:joro@sun.com-20090317155017-yldq76zghpxoln3p) (merge vers: 6.0.11-alpha) (pib:6)
[14 Apr 2009 16:21]
Paul DuBois
Noted in 5.0.80, 5.1.34, 6.0.11 changelogs. The Time column for SHOW PROCESSLIST output and the value of the TIME column of the INFORMATION_SCHEMA.PROCESSLIST table now can have negative values. Previously, the column was unsigned and negative values were displayed incorrectly as large positive values. Negative values can occur if a thread alters the time into the future with SET TIMESTAMP = value or the thread is executing on a slave and processing events from a master that has its clock set ahead of the slave.
[9 May 2009 16:44]
Bugs System
Pushed into 5.1.34-ndb-6.2.18 (revid:jonas@mysql.com-20090508185236-p9b3as7qyauybefl) (version source revid:jonas@mysql.com-20090508185236-p9b3as7qyauybefl) (merge vers: 5.1.34-ndb-6.2.18) (pib:6)
[9 May 2009 17:41]
Bugs System
Pushed into 5.1.34-ndb-6.3.25 (revid:jonas@mysql.com-20090509063138-1u3q3v09wnn2txyt) (version source revid:jonas@mysql.com-20090509063138-1u3q3v09wnn2txyt) (merge vers: 5.1.34-ndb-6.3.25) (pib:6)
[9 May 2009 18:39]
Bugs System
Pushed into 5.1.34-ndb-7.0.6 (revid:jonas@mysql.com-20090509154927-im9a7g846c6u1hzc) (version source revid:jonas@mysql.com-20090509154927-im9a7g846c6u1hzc) (merge vers: 5.1.34-ndb-7.0.6) (pib:6)
[25 Jan 2013 14:35]
David Hall
I have a production slave server that has a negative number in Time. version 5.5.16-log on RedHat Processlist: 2 system us -19 Connec Slave has read all relay log; waiting for the slave I/O thread to update it The OS system times match on master and slave. I checked now()/set timestamp on both mysql instances and they are both the same number. Everything seems ok but I just want to make sure there's no trouble down the road.