Bug #23641 False Positive "Server Not Reachable"
Submitted: 25 Oct 2006 18:00
Modified: 5 Feb 2007 20:38
Reporter: Sheeri Cabral (Candidate Quality Contributor)
Status: Closed
Category: MySQL Enterprise Monitor
Severity: S1 (Critical)
Version: mysqlnetwork-0.6.33
OS: Linux (Fedora Core 3 (i386))
Assigned to: Jan Kneschke
CPU Architecture: Any
Tags: heat chart, mer100 readme, up/down status

[25 Oct 2006 18:00] Sheeri Cabral
Description:
I'm getting a lot of false positives for "MySQL Server Not Reachable". I have set the check to a default frequency of 5 minutes (we have Nagios currently monitoring it, and it monitors every 5 minutes, so I'm not too worried about that), and I'll see if I still get them. I'd like to help debug this if possible. It would be nice if the actual error message were displayed -- the problem is intermittent, so it's not ACLs, but it would be nice to see whether it was "too many connections" or something similar (I know there's a connections check, but that didn't pop up, and as far as I know these servers are fine). Is that in a log somewhere?

How to repeat:
Connect to a server and watch it go. I will note that our servers are pretty high performance, so the server may need to have some load on it to show the error.
[16 Nov 2006 15:28] Sloan Childers
If the mysqld instance the agent is monitoring is under heavy load, the agent can get a wait_timeout when it tries to connect to check the health of that instance. A workaround is to increase the wait_timeout. The agent needs to retry the connection multiple times before reporting that the mysqld instance is down.
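The retry-before-reporting idea above can be sketched roughly as follows. This is a minimal illustration only, not the agent's actual code or the committed fix: the names `check_with_retries` and `tcp_probe`, the attempt count, the delay, and the use of a plain TCP connect on port 3306 as the health probe are all assumptions made for the example. It also keeps the last error text, along the lines of the reporter's request to see the actual failure reason.

```python
import socket
import time


def tcp_probe(host, port=3306, timeout=5.0):
    """Hypothetical health probe: open and close a TCP connection.

    A real agent would presumably log in and run a query; a bare TCP
    connect is used here only to keep the sketch self-contained.
    """
    with socket.create_connection((host, port), timeout=timeout):
        pass


def check_with_retries(probe, attempts=3, delay=1.0, sleep=time.sleep):
    """Call probe() up to `attempts` times before declaring the server down.

    Returns (reachable, last_error). last_error is None on success,
    otherwise a string describing the final failed attempt.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            probe()
            return True, None
        except Exception as exc:
            # Remember why this attempt failed so it can be reported.
            last_error = f"attempt {attempt}/{attempts}: {exc}"
            if attempt < attempts:
                sleep(delay)
    return False, last_error


if __name__ == "__main__":
    # "db.example.com" is a placeholder host, not taken from this report.
    reachable, error = check_with_retries(lambda: tcp_probe("db.example.com"))
    print("reachable" if reachable else f"not reachable ({error})")
```

With this shape, a single timed-out attempt on a busy server does not immediately raise "Server Not Reachable"; only a run of consecutive failures does, and the final error text is available for display or logging.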
[16 Nov 2006 15:55] Sloan Childers
Increasing severity/priority.
[29 Nov 2006 17:50] Sloan Childers
r4356 - committed a fix into the main trunk for testing. Once blessed by QA, let's get this patch to our internal and external customers who can reliably reproduce this problem.