Bug #29387 Unguarded rules deliver nonsensical warnings
Submitted: 27 Jun 2007 13:28 Modified: 10 Jul 2007 12:43
Reporter: Kristian Koehntopp
Status: Closed
Category: MySQL Enterprise Monitor: Advisors/Rules Severity: S3 (Non-critical)
Version: 1.1.0 OS: Any
Assigned to: Andy Bang CPU Architecture: Any

[27 Jun 2007 13:28] Kristian Koehntopp
Description:
WARNING Alert - High Number Of Attempted User Connections Have Failed  (v 1.5 *)

Expression
(100*(%Aborted_connects% / %Connections%)) > THRESHOLD

Evaluated Expression
(100*(1 / 2)) > 30

Thresholds
Critical Alert = 50
Warning Alert = 30
Info Alert = 10

Two cases here:

1. I want to know about ANY aborted connect. For this, the formula shown above is the wrong test.

2. I want to know about excessive aborted_connects. For this, the formula needs an absolute-value guard.

To resolve the ratio in steps of ten percent, %Connections% needs to be greater than 10; to resolve it in single percentage points, it needs to be greater than 100. Otherwise the rule only measures coarse steps.
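
The coarse steps can be made concrete with a small Python sketch. It is only an illustration: `percentage_step` is a hypothetical helper, not part of the product.

```python
def percentage_step(connections: int) -> float:
    """Return how far one aborted connect moves the failure percentage.

    Hypothetical helper for illustration only. `connections` stands for
    the value of %Connections% at evaluation time.
    """
    if connections <= 0:
        raise ValueError("connections must be positive")
    # One aborted connect out of `connections` total is 100/connections percent.
    return 100.0 / connections
```

With 2 connections a single failure moves the percentage by 50 points, with 10 connections by 10 points, and with 100 connections by 1 point.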

How to repeat:
Provoke a single failed connect out of 2 connects in total. Get a warning.

Suggested fix:
( %Connections% > 100 ) && (100*(%Aborted_connects% / %Connections%)) > THRESHOLD

or similar. The same guard is needed for all other rules that measure ratios.
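
A minimal Python sketch of the guarded rule, assuming the three thresholds listed above are checked from most to least severe. The function name, the `min_connections` parameter, and the evaluation order are assumptions for illustration, not the advisor's actual implementation.

```python
from typing import Optional

# Thresholds as listed in the report, most severe first.
THRESHOLDS = [("Critical", 50), ("Warning", 30), ("Info", 10)]


def evaluate_ratio_rule(aborted_connects: int,
                        connections: int,
                        min_connections: int = 0) -> Optional[str]:
    """Return the alert level, or None if no alert should fire.

    `min_connections` is the proposed guard: the ratio is only evaluated
    once more than that many connections have been seen. The default of 0
    reproduces the unguarded rule (and still avoids dividing by zero).
    """
    if connections <= min_connections:
        return None
    percent_failed = 100 * aborted_connects / connections
    for level, threshold in THRESHOLDS:
        if percent_failed > threshold:
            return level
    return None
```

Under these assumptions `evaluate_ratio_rule(1, 2)` returns "Warning", which matches the alert reported here, while `evaluate_ratio_rule(1, 2, min_connections=100)` returns None.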
[27 Jun 2007 13:30] Mark Leith
Agree that we should put a guard somewhere, and an additional rule on 'any'.
[6 Jul 2007 18:12] Andy Bang
[10:24] <Leith|Lunch> just 'aborted... > THRESHOLD'
[10:26] <andy> Leith|Lunch: right.  and what do you suggest for THRESHOLD?  it's a counter i think
[10:27] <Leith|Lunch> 1 / 5 / 10?
[10:27] <andy> running every 5 minutes?  10 minutes?
[10:28] <Leith|Lunch> every 5 would be fine
[10:29] <Leith|Lunch> actually I would probably say 1 / 10 / 50
[10:29] <Leith|Lunch> 'just tell me as info' for the one, and then getting to 'ok you really should be looking at this' for 50 (i.e. ~10 per minute)
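
The absolute rule discussed in the chat could look roughly like the Python sketch below. Several points are assumptions and not confirmed by the thread: that the rule compares the increase in Aborted_connects between two samples taken 5 minutes apart, that a counter that goes down means the server was restarted, and that the comparison is `>=` so a single aborted connect already raises Info (the chat writes '> THRESHOLD', so the real rule may compare differently). All names are hypothetical.

```python
from typing import Optional

# Info / Warning / Critical = 1 / 10 / 50 as proposed in the chat,
# listed most severe first.
LEVELS = [("Critical", 50), ("Warning", 10), ("Info", 1)]


def aborted_delta(previous: int, current: int) -> int:
    """Increase of the cumulative counter between two samples.

    Assumption: if the counter went down, the server restarted and the
    counter began again at zero, so the current value is the increase.
    """
    return current - previous if current >= previous else current


def evaluate_absolute_rule(previous: int, current: int) -> Optional[str]:
    """Return the alert level for one sampling interval, or None."""
    delta = aborted_delta(previous, current)
    for level, threshold in LEVELS:
        if delta >= threshold:  # assumption: inclusive comparison
            return level
    return None
```

With a 5-minute interval, 50 aborted connects correspond to roughly 10 per minute, which is the Critical level in this sketch.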
[6 Jul 2007 23:04] Andy Bang
Added guard to existing rule per Kris' suggestion.
Added new rule suggested by Kris.  Note that the new rule is in the Testing JAR until QA says it's OK.

Committed revision 6468.
[10 Jul 2007 12:43] Mark Leith
Verified fixed in 1.2.0.6501