| Bug #62839 | race in computing Seconds_Behind_Master overstates replication lag | ||
|---|---|---|---|
| Submitted: | 20 Oct 2011 5:51 | Modified: | 12 Nov 2011 4:14 |
| Reporter: | Mark Callaghan | Email Updates: | |
| Status: | Duplicate | Impact on me: | |
| Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
| Version: | 5.1 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | replication, show, slave, STATUS | ||
[20 Oct 2011 14:09]
Mark Callaghan
Filing a bug past my bedtime might explain this, but it isn't obvious now how the race would explain the result jumping to 45. It would explain a jump to a value near the current unix timestamp.
time(0) - mi->clock_diff_with_master
[20 Oct 2011 15:07]
Mark Callaghan
Looking at this again data_lock might prevent the race. Although the lack of docs for any of the RPL mutexes makes this code difficult to read:
/*
standard lock acquistion order to avoid deadlocks:
run_lock, data_lock, relay_log.LOCK_log, relay_log.LOCK_index
*/
pthread_mutex_t data_lock,run_lock;
and
pthread_mutex_t data_lock,run_lock;
Regardless, I still get unexplained jumps in replication lag and I think part of the problem is that SHOW SLAVE STATUS returns incorrect values sometimes.
[20 Oct 2011 15:41]
Mark Callaghan
My current theory is that this is expected behavior on a slave that has lag because it does keep up with the master and last_master_timestamp is reset to 0 while the SQL thread waits. This has confused me before and will confuse me again. I thought I added a SHOW STATUS variable to the FB patch for reporting the lag using a counter that isn't reset in that case, but I cannot find it.
[22 Oct 2011 8:47]
MySQL Verification Team
This seems to be a duplicate of bug #52166.
[22 Oct 2011 9:04]
Valeriy Kravchuk
Indeed, looks similar to that bug Mikiya mentioned. No plans to fix it in 5.1 :(
[22 Oct 2011 14:28]
Mark Callaghan
My servers aren't idle. This should be closed as it isn't a bug. It is a result of the design -- SBM is reset to zero whenever the SQL thread catches the IO thread. If that server really has lag and the IO thread can't keep up with the master then SBM flips between zero (when SQL thread catches IO thread) and the real lag.
[22 Oct 2011 14:30]
Mark Callaghan
I published an option to change mysqld to not reset SBM when the SQL thread catches the IO thread. For people with non-idle servers I think this is a good thing -- http://bazaar.launchpad.net/~mysqlatfacebook/mysqlatfacebook/5.1/revision/3753
[12 Nov 2011 4:05]
Trent Lloyd
Mark - I guess the situation would be better summed up like this: Seconds_behind_master should be named more like Seconds_behind_last_master_event_received. And I guess your modified version is like a Seconds_since_last_master_event_executed. To more accurately report Seconds_behind_master you would need to somehow communicate more often with the master to determine it's actual master log position using some second in band channel or header - or a separate connection. The design is faulty as you say, your solution prefers very active servers, the default solution prefers more idle ones. The best solution would require more major changes.
[12 Nov 2011 4:14]
Trent Lloyd
This is a duplicate of Bug #59598

Description: Note that replication lag is something that has made many of my users unhappy with MySQL, so reporting it incorrectly is really bad PR for MySQL. I am sure other deployments feel the same way. I am trying to figure out why a server has intermittent spikes in replication lag. In one case it goes 0 -> 45 -> 0 over a period of a few seconds. I compared consecutive SET TIMESTAMP values from the binlog and the difference is never greater than 2 seconds. Statements replayed during replication don't run for more than one or two seconds. I think the problem is a race condition in this code. If last_master_timestamp is 0 when time_diff is computed and non-zero when protocol->store(..) is reached, then this returns the following as the lag: time(0) - mi->clock_diff_with_master And that can explain the bogus jumps. Now I must explain why this can happen. There are two places where it will be zero: 1) last_master_timestamp is zero before the first binlog event has been completed and possibly after a reconnect 2) last_master_timestamp is temporarily set to zero when the SQL thread catches up So if the read of last_master_timestamp to compute time_diff is done during #1 or #2 above and then last_master_timestamp is set to a non-zero value before protocol->store(...) is called, then I think I have explained this. /* Seconds_Behind_Master: if SQL thread is running and I/O thread is connected, we can compute it otherwise show NULL (i.e. unknown). */ if ((mi->slave_running == MYSQL_SLAVE_RUN_CONNECT) && mi->rli.slave_running) { long time_diff= ((long)(time(0) - mi->rli.last_master_timestamp) - mi->clock_diff_with_master); /* Apparently on some systems time_diff can be <0. Here are possible reasons related to MySQL: - the master is itself a slave of another master whose time is ahead. - somebody used an explicit SET TIMESTAMP on the master. Possible reason related to granularity-to-second of time functions (nothing to do with MySQL), which can explain a value of -1: assume the master's and slave's time are perfectly synchronized, and that at slave's connection time, when the master's timestamp is read, it is at the very end of second 1, and (a very short time later) when the slave's timestamp is read it is at the very beginning of second 2. Then the recorded value for master is 1 and the recorded value for slave is 2. At SHOW SLAVE STATUS time, assume that the difference between timestamp of slave and rli->last_master_timestamp is 0 (i.e. they are in the same second), then we get 0-(2-1)=-1 as a result. This confuses users, so we don't go below 0: hence the max(). last_master_timestamp == 0 (an "impossible" timestamp 1970) is a special marker to say "consider we have caught up". */ protocol->store((longlong)(mi->rli.last_master_timestamp ? max(0, time_diff) : 0)); How to repeat: Read above Suggested fix: assign last_master_timestamp to a local variable