Bug #29835 Agent dup uuid errors occur due to network issues
Submitted: 16 Jul 2007 21:23 Modified: 22 Jul 2007 21:42
Reporter: Bill Weber
Status: Closed
Category: MySQL Enterprise Monitor: Server Severity: S1 (Critical)
Version: 1.2.0.6610 OS: Any
Assigned to: Darren Oldag CPU Architecture: Any

[16 Jul 2007 21:23] Bill Weber
Description:
Due to network issues (i.e. outages), the agent session times out and the agent asks for a new session id. However, if the new session id doesn't make it to the client because of the same network issues, the agent pings in with the old uuid and you get duplicate uuid errors.

How to repeat:
-
[18 Jul 2007 1:32] Darren Oldag
Made session handling more resilient to network timeouts.
A session allocation requires an 'acknowledgement' of the 
session back from the agent before it is considered assigned.
Then, normal agent/dashboard communications can occur.

If the session times out, packets from the agent are dropped
until a new session is ACK'd.

Two ACK'd allocated sessions are not allowed for the same agent uuid.
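
The rules above can be illustrated with a small sketch. This is not the
Enterprise Monitor code: the class name SessionRegistry, its methods
(request_session, ack, handle_packet), the 60-second timeout, and the use of
an explicit "now" argument instead of a real clock are all assumptions made
for the example. It only models the three stated rules: a session counts as
assigned once the agent ACKs it, packets are dropped while no ACK'd session
exists, and one uuid cannot hold two ACK'd sessions.

```python
import itertools


class SessionRegistry:
    """Toy dashboard-side session table keyed by agent uuid (hypothetical)."""

    def __init__(self, timeout=60.0):
        self.timeout = timeout          # assumed value, not from the bug report
        self.pending = {}               # uuid -> session id offered, not yet ACK'd
        self.active = {}                # uuid -> (session id, last-seen time)
        self._ids = itertools.count(1)  # source of fresh session ids

    def _expire(self, uuid, now):
        """Forget the ACK'd session for this uuid if it has timed out."""
        entry = self.active.get(uuid)
        if entry is not None and now - entry[1] > self.timeout:
            del self.active[uuid]

    def request_session(self, uuid, now):
        """Offer a new session id; it is not assigned until ack() succeeds."""
        self._expire(uuid, now)
        session_id = next(self._ids)
        # A repeated request replaces an earlier offer whose reply may
        # have been lost on the network.
        self.pending[uuid] = session_id
        return session_id

    def ack(self, uuid, session_id, now):
        """Agent confirms receipt; only now is the session assigned."""
        self._expire(uuid, now)
        if uuid in self.active:
            return False  # a second ACK'd session for one uuid is refused
        if self.pending.get(uuid) != session_id:
            return False  # stale or unknown offer
        del self.pending[uuid]
        self.active[uuid] = (session_id, now)
        return True

    def handle_packet(self, uuid, session_id, now):
        """Return True if the packet is accepted, False if it is dropped."""
        self._expire(uuid, now)
        entry = self.active.get(uuid)
        if entry is None or entry[0] != session_id:
            return False  # no ACK'd session: drop instead of raising an error
        self.active[uuid] = (session_id, now)
        return True
```

In this model, an agent whose session reply was lost keeps sending packets
with its old session id. Those packets are dropped quietly until the agent
requests a session again and ACKs the new offer, after which normal traffic
resumes.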

---------------
Note: a good way to test this is to break the network while both services stay alive, e.g. yank a network cord for two minutes or enable a firewall. Let a network timeout occur, then re-enable the network and see if everyone catches back up.
[18 Jul 2007 15:50] Darren Oldag
not in the 1.2 branch yet, but committed to trunk for review.
[19 Jul 2007 21:02] Darren Oldag
pushed into the last build.
[22 Jul 2007 21:42] Andrew Cwik
fixed in 1.2.0.6744
To test this I've had 57 agents running for two days and none of them have stopped prematurely.  The agent log files and catalina.out file don't have any entries referencing a duplicate uuid.