Bug #30015 Agent status events don't show up in Events tab
Submitted: 24 Jul 2007 20:34
Modified: 9 Jan 2015 9:59
Reporter: Stefan Hinz
Status: Closed
Category: MySQL Enterprise Monitor: Web
Severity: S3 (Non-critical)
Version: 1.2.0.7481
OS: Any
Assigned to: Assigned Account
CPU Architecture: Any

[24 Jul 2007 20:34] Stefan Hinz
Description:
Agent status events don't show up in Events tab. I've tested this for the critical "agent is down" event, but it will likely apply to other agent events, too (if there are any).

How to repeat:
If there's a critical agent status event for a server or a group of servers, click on the red circle in the Heat Chart of the Monitor tab.
In the Events tab, "No events found" is displayed, or (if there is another critical event such as excessive table scans) a different event is displayed.

Suggested fix:
Show agent status events, rather than nothing or a different event.
[13 Sep 2007 20:53] Bill Weber
When you stop an agent, the agent.reachable variable is set to "shutdown". The "Info" threshold for "MySQL Agent Not Reachable" is "shutdown", so you get an Info alert. However, when you click on the red dot for "Agent Status" in the Heat Chart, it takes you to the Events tab with the Severity filter set to Critical, and therefore you don't see the Info event for that rule. Instead, you see other Critical events (or "No events found" if there are none), which is why it appears that no agent events show up in the Events tab. In fact, the event is there; it's just filtered out. To see the event, click "reset".
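To illustrate the filtering effect described above, here is a minimal sketch in Python. It is not the Enterprise Monitor code; the `Event` class, the `filter_events` function, and the sample event list are hypothetical names made up for this example. It only shows how a Severity filter preset to Critical hides an Info event that is actually present.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    rule: str
    severity: str  # "Info", "Warning", or "Critical"


def filter_events(events, severity=None):
    """Return events matching the severity filter.

    severity=None stands in for the unfiltered view you get after "reset".
    (Hypothetical helper, not part of the product.)
    """
    if severity is None:
        return list(events)
    return [e for e in events if e.severity == severity]


# Illustrative data: the agent was stopped normally, so its event is Info.
events = [
    Event("MySQL Agent Not Reachable", "Info"),
    Event("Excessive Table Scans", "Critical"),
]

# Clicking the red dot opens the Events tab filtered to Critical.
critical_view = filter_events(events, severity="Critical")
# Clicking "reset" removes the filter.
reset_view = filter_events(events)

print([e.rule for e in critical_view])
print([e.rule for e in reset_view])
```

With the Critical filter the agent event is absent from the result; without the filter it is listed alongside the other event.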
[1 Oct 2007 20:58] Joshua Ganderson
So, we have an issue here with conflicting thresholds between the Heat Chart (which treats this as critical and filters on Critical) and the agent status rule (which Bill identified as Info). My suggestion would be to modify the rule threshold so that this is a Critical alert. Passing the buck to Andy.
[13 Nov 2007 1:43] Andy Bang
There are two ways an agent can be "down":

1) The agent process is terminated normally by a user (perhaps because they're bringing the associated mysqld down for maintenance), in which case we get a "shutdown" signal from the agent.  This is considered a "normal" shutdown and generates an "Info" event.

2) It crashes for some reason.  In this case the agent times out and the Merlin server generates a "timed out" event.  This is considered an "abnormal" shutdown and generates a "Critical" event.

However, in both cases we show a red dot, because the Heat Chart only knows about up or down, and not about "planned down" vs. "unplanned down".
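The mismatch between the two cases can be sketched as follows. This is only an illustration under stated assumptions: the function names, the "up" state, the exact "timed out" string as a state value, and the "green" dot color are hypothetical; only the "shutdown" to Info and timed out to Critical mappings and the red dot for a down agent come from the discussion above.

```python
# Severity produced by the agent status rule for each "down" state,
# as described in this thread.
RULE_SEVERITY = {
    "shutdown": "Info",       # normal, user-initiated stop
    "timed out": "Critical",  # agent crashed and stopped reporting
}

DOWN_STATES = set(RULE_SEVERITY)


def event_severity(agent_state):
    """Severity of the generated event, or None if the agent is not down."""
    return RULE_SEVERITY.get(agent_state)


def heat_chart_dot(agent_state):
    """The Heat Chart only distinguishes up from down.

    "green" for an agent that is up is an assumption made for this sketch.
    """
    return "red" if agent_state in DOWN_STATES else "green"


def event_visible_after_click(agent_state):
    """Clicking a red dot filters the Events tab to Critical."""
    return event_severity(agent_state) == "Critical"


for state in ("shutdown", "timed out"):
    print(state, heat_chart_dot(state), event_severity(state),
          event_visible_after_click(state))
```

Both states produce a red dot, but only the timed-out case produces an event that survives the Critical filter, which is the confusion reported in this bug.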

I personally think the basic behavior of the rule is correct but understand that it's confusing.  I can fix this one rule by generating a critical event in both shutdown cases, but I think we still have the same underlying problem for other rules.

This should be triaged by bug council.