Bug #50806 Proxy does not forward queries if clock set back after mysqld is down
Submitted: 1 Feb 2010 21:53
Modified: 25 Mar 2010 15:41
Reporter: Diego Medina
Status: Closed
Category: MySQL Enterprise Monitor: Agent
Severity: S1 (Critical)
Version: 2.1.1.1138
OS: Any
Assigned to: Michael Schuster
CPU Architecture: Any

[1 Feb 2010 21:53] Diego Medina
Description:
Your agent and mysqld are running normally. Then mysqld crashes, and all queries that go through the proxy port are refused. Your system then sets its clock back one hour. The mysqld server comes back online, but the proxy still does not forward queries to the backend.

This will go on until you fix your system's clock or until one hour has passed and the clock catches up to the time it showed before the change.

How to repeat:
1- Start the agent, mysqld and dashboard
2- Wait until they are all working
3- Stop the mysqld
4- Send some queries through the proxy port
5- Set the clock on the agent box (they can all be on the same box) back one hour
6- Start the mysqld server
7- Try to send queries through the proxy port; they will fail with

"ERROR 1105 (00000): #07000(proxy) all backends are down"

8- Set the clock back to the normal time and all goes back to normal.
[3 Feb 2010 6:57] Enterprise Tools JIRA Robot
Kay Roepke writes: 
The problem is in function {{network_backends_check()}} in network-backend.c called from proxy-plugin.c:1494.

We need a check for a negative time difference (also for the connection, although that's not the problem here) and then reset all previously recorded timestamps.
My suggestion is to move all timestamp-related functions into a helper function that performs this check.
It should have two parameters: the previously recorded value (or 0 as 'unknown time' the first time it is called) and the location in which to store the new timestamp (like g_get_current_time()).
If the previous value is not 0 and the difference {{new - old}} is negative it should log a critical message and reset the timestamp to 0 (which should always mean "never seen").
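
The helper suggested above could look roughly like the following. This is only a sketch under assumptions, not the code from the actual patch: the names ts_update_checked, ts_update_from_clock and TS_NEVER_SEEN are hypothetical, it uses plain time_t instead of the GLib time types the proxy uses, and it takes the current time as an extra parameter so the check can be exercised without touching the system clock.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* 0 is reserved to mean "never seen" (assumption taken from the suggestion). */
#define TS_NEVER_SEEN ((time_t)0)

/*
 * Hypothetical helper, not the actual agent code.
 * Compares the previously recorded timestamp `prev` with `now` and
 * stores the timestamp to keep in *out.
 * Returns 1 if the clock appears to have gone backwards, 0 otherwise.
 */
int ts_update_checked(time_t prev, time_t now, time_t *out)
{
    assert(out != NULL);

    if (prev != TS_NEVER_SEEN && difftime(now, prev) < 0) {
        /* Negative difference: log it and forget the old timestamp. */
        fprintf(stderr,
                "(critical) clock went back %.0f seconds, resetting timestamp\n",
                difftime(prev, now));
        *out = TS_NEVER_SEEN;
        return 1;
    }

    *out = now;
    return 0;
}

/* Convenience wrapper that reads the wall clock itself. */
int ts_update_from_clock(time_t prev, time_t *out)
{
    return ts_update_checked(prev, time(NULL), out);
}
```

A caller such as the backend check would then treat a stored value of 0 as "no previous observation" and retry the backend immediately instead of waiting for a timeout that was computed against the old clock.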
[3 Mar 2010 20:18] Enterprise Tools JIRA Robot
Keith Russell writes: 
Patch installed in versions >= 2.2.0.1638.
[12 Mar 2010 11:49] Enterprise Tools JIRA Robot
Diego Medina writes: 
Now it is even worse. After you set the clock back one hour and start the mysqld, the agent does not try to reconnect to the mysqld. You can send queries, but the agent shows the server as down on the dashboard.
[12 Mar 2010 12:23] Enterprise Tools JIRA Robot
Diego Medina writes: 
2010-03-12 05:45:03: (critical) agent_heartbeat.c:64: This system's clock has changed -3614.1669 seconds between two heartbeats, exceeding the threshold of 12 seconds.
Check your system clock for excessive jitter. This issue affects accurate reporting of data.

<here I changed the clock back to normal>

2010-03-12 07:21:57: (critical) agent_heartbeat.c:64: This system's clock has changed 3600.5998 seconds between two heartbeats, exceeding the threshold of 12 seconds.
Check your system clock for excessive jitter. This issue affects accurate reporting of data.
2010-03-12 07:21:57: (critical) agent_mysqld.c:707: successfully connected to database at 127.0.0.1:5132 as user msandbox (with password: YES)
[22 Mar 2010 14:53] Enterprise Tools JIRA Robot
Diego Medina writes: 
Verified fixed on 2.2.0.1638 (we now forward the queries)
[22 Mar 2010 15:00] Enterprise Tools JIRA Robot
Diego Medina writes: 
We need a 2.1.1 build: the bug was fixed in rev 1557, but the latest build, 2.1.2.1150, only goes up to rev 1555.
[22 Mar 2010 17:26] Enterprise Tools JIRA Robot
Keith Russell writes: 
Patch installed in versions >= 2.1.2.1160.
[23 Mar 2010 14:58] Enterprise Tools JIRA Robot
Diego Medina writes: 
Verified fixed on 2.1.2.1160
[25 Mar 2010 15:41] MC Brown
A note has been added to the 2.1.2 and 2.2.0 changelogs: 

        When using the &merlin_proxy;, if the backend MySQL server
        went down, and then the clock on the &merlin_proxy; host went
        back in time (for example, during daylight savings time
        adjustments), the &merlin_proxy; would stop sending queries to
        the configured backend.