MySQL Bugs: #23641: False Positive "Server Not Reachable"

Bug #23641	False Positive "Server Not Reachable"
Submitted:	25 Oct 2006 18:00	Modified:	5 Feb 2007 20:38
Reporter:	Sheeri Cabral (Candidate Quality Contributor)
Status:	Closed
Category:	MySQL Enterprise Monitor	Severity:	S1 (Critical)
Version:	mysqlnetwork-0.6.33	OS:	Linux (Fedora Core 3 (i386))
Assigned to:	Jan Kneschke	CPU Architecture:	Any
Tags:	heat chart, mer100 readme, up/down status

Description:
I'm getting a lot of false positives for "MySQL Server Not Reachable".  I have set the check to a default frequency of 5 minutes (we have Nagios currently monitoring it, and it monitors every 5 minutes so I'm not too worried about that), and I'll see if I still get that.  I'd like to help debug that if possible.  It would be nice if the actual error message were displayed -- it's intermittent, so it's not ACLs, but it would be nice to see if it were "too many connections" or something (I know, there's a connections check, but that didn't pop up, and as far as I know these servers are fine...).  Is that in a log somewhere?

How to repeat:
Connect to a server and watch it go... I will note that our servers are pretty high performance, so the server may have to have some load on it to show the error.

If the mysqld instance the agent is monitoring is under heavy load, it is possible for the agent to hit a wait_timeout when trying to connect to check the health of the mysqld instance.  A workaround is to increase the wait_timeout.  The agent needs to retry the connection multiple times before reporting that the mysqld instance is down.
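
The retry behavior described above could look roughly like the sketch below. This is not the agent's actual code: check_reachable, its parameters, and the connect callable are all hypothetical names chosen for illustration, and connect is assumed to raise an exception when the connection attempt fails. The sketch also keeps the last error text, so the real cause (for example "too many connections") could be shown alongside the "Server Not Reachable" status.

```python
import time
from typing import Callable, Optional, Tuple


def check_reachable(
    connect: Callable[[], None],
    attempts: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, Optional[str]]:
    """Try `connect` up to `attempts` times before declaring the server down.

    `connect` is a hypothetical callable that raises on failure.
    Returns (True, None) on the first success, or (False, last_error_text)
    once every attempt has failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    last_error: Optional[str] = None
    for attempt in range(1, attempts + 1):
        try:
            connect()
            return True, None
        except Exception as exc:
            # Keep the real error so it can be reported with the status.
            last_error = f"{type(exc).__name__}: {exc}"
            if attempt < attempts:
                # Pause before retrying; a loaded server may recover shortly.
                sleep(delay_seconds)
    return False, last_error
```

With this shape, a single timed-out attempt on a busy server does not raise an alert by itself; only a run of consecutive failures does. The number of attempts and the delay are assumptions that would need tuning against the check frequency.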

Increasing severity/priority.

r4356 - committed a fix into the main trunk for testing... Once blessed by QA, let's get this patch to our internal and external customers who can reliably reproduce this problem.