Bug #74415 Advisor doesn't use the correct thresholds
Submitted: 16 Oct 2014 7:34
Modified: 30 Mar 2015 13:57
Reporter: Daniël van Eeden (OCA)
Status: Can't repeat
Category: MySQL Enterprise Monitor: Advisors/Rules
Severity: S3 (Non-critical)
Version: 3.0.14
OS: Any
Assigned to:
CPU Architecture: Any

[16 Oct 2014 7:34] Daniël van Eeden
Description:
Situation:
Advisor: 'Slave Too Far Behind Master'
Thresholds for the advisor: Default (120, 300, 600)

Then we add a slave that often lags by 30 minutes. We override the thresholds for this server (MyServerGroup->MyServer: 1500, 1800, 2200).

Note that the server is also a member of another group, which was created automatically for this replication setup. We have not overridden the parameters for that group.

The expected critical threshold for this server/advisor combo: 2200
The actual critical threshold (as seen in the notification mails): 600

If I remember correctly, it uses the correct threshold for the server if the override is set in every group the server belongs to (either on the group or on the server within that group).

How to repeat:
See description.

Suggested fix:
Use overridden parameters.
[23 Dec 2014 7:43] Daniël van Eeden
Screenshot

Attachment: mem_thresholds.png (image/png, text), 86.68 KiB.

[10 Feb 2015 17:40] MySQL Verification Team
Verified as described. This worked as it should when I only changed the MEM-created replication group to different values, but as soon as I added my own group and changed the values there, the alert was triggered at the lowest, non-overridden levels.
[11 Mar 2015 16:06] MySQL Verification Team
My original advisor configuration had some problems and was not set up the way Daniel's was. This is NOT verified, and I cannot reproduce it with 3.0.19, which I am on now, either.
[11 Mar 2015 18:19] Daniël van Eeden
To be clear there are:
- Advisor thresholds
- Group thresholds (user defined)
- Group thresholds (replication)
- Server thresholds

I had the default advisor thresholds set and tried to override them at the server level. The server thresholds were higher than the advisor thresholds.
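
To make the expectation concrete, here is a minimal Python sketch of the precedence the reporter describes, where the most specific setting that exists wins (server, then group, then advisor default). The function name `expected_thresholds` and the tuple layout are hypothetical and only illustrate the expected behaviour; they are not MEM's implementation or API. The numbers are the ones from the original report.

```python
from typing import Optional, Tuple

# Three thresholds in seconds, lowest to highest (layout assumed for illustration).
Thresholds = Tuple[int, int, int]


def expected_thresholds(
    advisor: Thresholds,
    group: Optional[Thresholds] = None,
    server: Optional[Thresholds] = None,
) -> Thresholds:
    """Return the most specific thresholds that are set (hypothetical helper)."""
    # Check the server override first, then the group override.
    for candidate in (server, group):
        if candidate is not None:
            return candidate
    # Nothing overridden: fall back to the advisor defaults.
    return advisor


# Advisor defaults (120, 300, 600) with a server override (1500, 1800, 2200).
effective = expected_thresholds(advisor=(120, 300, 600), server=(1500, 1800, 2200))
print(effective[2])  # 2200, the critical threshold the reporter expected
```

Under this model the notification mail would have reported 2200 rather than 600 as the critical threshold.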
[11 Mar 2015 18:23] Daniël van Eeden
@Roger david Nay:
It looks like you created a user-defined group and a replication group, and then set a lower threshold on the user-defined group. This is different from what I did.

- I had advisor and server settings.
- I had a higher threshold on the server (which was a slave which was often lagging)
[16 Mar 2015 17:50] MySQL Verification Team
I have my own replication server settings at [1200|1500|1800] and the MEM-created group disabled. I set the MySQL SQL_delay option to 2000 seconds to simulate the slave falling behind. I see the advisor go to "Notice" at 1200 and then progress to "Critical" as I have configured, with no alerts at the lower group settings of [120|300|600].

I also tried this with the MEM-created replication group 'enabled' (you have it 'disabled' in your screenshot). With the second group enabled, I do get TWO alerts, one for each group's values: one alert at [120|300|600] and one at [1200|1500|1800]. So, for example, at 1300 seconds behind, I see a Critical alert (> 600) and a Notice alert (> 1200) for the SAME server. I think that is the correct behaviour as well: setting a server's values in one group does not override the settings of another group. See the groups and "Override Advisor Configuration" section at http://dev.mysql.com/doc/mysql-monitor/3.0/en/mem-advisors-intro-advisor-page-ref.html
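
The per-group behaviour described above can be sketched in Python as follows. This is only an illustration of the observed behaviour, not MEM's code: the helpers `severity` and `alerts_for` are hypothetical, the middle level name "Warning" is an assumption (the thread names only "Notice" and "Critical"), and pairing the group names "Replication1" and "MyServerGroup" with these values is illustrative.

```python
from typing import Dict, List, Optional, Tuple

# (notice, warning, critical) in seconds; the level names are assumed.
Thresholds = Tuple[int, int, int]


def severity(lag: int, thresholds: Thresholds) -> Optional[str]:
    """Map a replication lag to a level, or None if below every threshold."""
    notice, warning, critical = thresholds
    if lag > critical:
        return "Critical"
    if lag > warning:
        return "Warning"
    if lag > notice:
        return "Notice"
    return None


def alerts_for(lag: int, enabled_groups: Dict[str, Thresholds]) -> List[Tuple[str, str]]:
    """Evaluate the lag once per enabled group; each group alerts on its own."""
    alerts = []
    for group, thresholds in enabled_groups.items():
        level = severity(lag, thresholds)
        if level is not None:
            alerts.append((group, level))
    return alerts


groups = {
    "Replication1": (120, 300, 600),       # automatic group, default values
    "MyServerGroup": (1200, 1500, 1800),   # user group with overridden values
}
print(alerts_for(1300, groups))
# [('Replication1', 'Critical'), ('MyServerGroup', 'Notice')]
```

Passing only the user group, which models the automatic group being disabled, yields a single Notice alert at 1300 seconds.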

Is it possible that your 'slave1' server is part of another group that has the default settings? The values in your screenshot are huge, maybe you just didn't get there in order to see the second Alert.
[30 Mar 2015 13:57] Daniël van Eeden
> Is it possible that your 'slave1' server is part of another group that has the 
> default settings?
No, it's only a member of the groups indicated in the screenshot.

> The values in your screenshot are huge, maybe you just didn't get there in 
> order to see the second Alert.
Those are the joys of replicating a table w/o a primary key on which large deletes are done. (That's fixed now.)
[14 Apr 2015 12:04] MySQL Verification Team
I can't repeat this with 3.0.19. I get two messages for the same server only if the "Replication1" group is enabled. With the "Replication1" automatic group disabled, as in the uploaded image, only one alert is received, at the higher values for the slave server.