Description:
In our development environment we run merlin with a single merlin agent monitoring more than one mysqld instance.
This is configured as follows:
[root@dd01mvb-03 agent-dev]# pwd
/opt/mysql/enterprise/agent-dev
[root@dd01mvb-03 agent-dev]# cd etc/instances/
[root@dd01mvb-03 instances]# ls -la
total 16
drwxr-xr-x 4 root root 4096 Sep 16 10:43 .
drwxrwxr-x 5 root root 4096 Sep 16 10:43 ..
drwxr-xr-x 2 root root 4096 May 12 14:19 avrdbprod-vip3
drwxr-xr-x 2 root root 4096 May 8 16:13 d-merlin-vip
[root@dd01mvb-03 instances]# find . -type f
./avrdbprod-vip3/agent-instance.ini
./d-merlin-vip/agent-instance.ini
[root@dd01mvb-03 instances]#
Each file is configured to talk to a different instance,
e.g. etc/instances/d-merlin-vip/agent-instance.ini:
[mysqld]
displayname = d-merlin-vip
user = merlin_agent_dev
password = XXXXXXX
hostname = d-merlin-vip
port = 3306
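As a rough illustration of the per-instance layout described above, the following Python sketch reads the [mysqld] section of one agent-instance.ini. It is not part of the agent; the helper name read_instance and the fallback port 3306 (the usual MySQL default) are assumptions made for this example.

```python
import configparser

def read_instance(text):
    """Parse the [mysqld] section of an agent-instance.ini file (hypothetical helper)."""
    # Only '=' is treated as a delimiter, and interpolation is disabled so
    # that passwords containing '%' or ':' are read literally.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(text)
    section = parser["mysqld"]
    return {
        "displayname": section.get("displayname"),
        "user": section.get("user"),
        "hostname": section.get("hostname"),
        # Assumption: fall back to the standard MySQL port if none is given.
        "port": section.getint("port", fallback=3306),
    }

# Sample content copied from the instance file shown in this report.
SAMPLE = """\
[mysqld]
displayname = d-merlin-vip
user = merlin_agent_dev
password = XXXXXXX
hostname = d-merlin-vip
port = 3306
"""

instance = read_instance(SAMPLE)
print(instance["hostname"], instance["port"])
```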
On one box we have been running two 5.1 test/dev mysqld servers and also the dev merlin server (tomcat+mysqld).
Tomcat was configured in config.properties to talk to mysql via the agent's proxy:
#SymmetricKey was auto generated.
#Wed Apr 15 16:52:02 CEST 2009
mysql.user=service_manager
#mysql.port=3306
mysql.port=4041
key=XXXXXXXXXXXXXXXX
mysql.pass=XXXXXXX
mysql.server=d-merlin-vip.dqs.lhr1.bbb.com
mysql.db=mem
quanal.collect=00\:01\:00
default.maxActive=70
Note: the connection goes to the proxy on port 4041 (4040 is used by the production merlin agent).
The proxy configuration is as follows:
[root@dd01mvb-03 etc]# cat mysql-monitor-agent.ini
[mysql-proxy]
keepalive = true
plugins=proxy,agent
agent-mgmt-hostname = https://merlin_agent:XXXXXXXX@d-merlin-vip.dqs.lhr1.bbb.com:443/heartbeat
mysqld-instance-dir= etc/instances
agent-item-files = share/mysql-proxy/items/quan.lua,share/mysql-proxy/items/items-mysql-monitor.xml,share/mysql-proxy/items/agent-allocation-stats.lua
proxy-address=:4041
proxy-backend-addresses = d-merlin-vip.dqs.lhr1.bbb.com:3306
proxy-lua-script = share/mysql-proxy/quan.lua
agent-uuid = d25901f9-ea48-4c50-8d57-95c81ad90cd1
log-file = mysql-monitor-agent.log
pid-file=/opt/mysql/enterprise/agent-dev/mysql-monitor-agent.pid
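Since tomcat only reaches mysqld through the proxy, mysql.port in config.properties has to match the port in proxy-address. A minimal Python sketch of that cross-check follows, assuming both files have been read into strings. The helpers parse_properties and proxy_port are hypothetical, and the properties reader is deliberately simplistic (it does not handle Java escape sequences).

```python
import configparser

def parse_properties(text):
    """Minimal reader for key=value lines; lines starting with '#' are comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props

def proxy_port(agent_ini_text):
    """Return the port from proxy-address in the [mysql-proxy] section."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(agent_ini_text)
    address = parser["mysql-proxy"]["proxy-address"]  # e.g. ":4041"
    return int(address.rsplit(":", 1)[1])

# Excerpts copied from the two configuration files shown in this report.
PROPS = "mysql.user=service_manager\n#mysql.port=3306\nmysql.port=4041\n"
AGENT = (
    "[mysql-proxy]\n"
    "proxy-address=:4041\n"
    "proxy-backend-addresses = d-merlin-vip.dqs.lhr1.bbb.com:3306\n"
)

tomcat_port = int(parse_properties(PROPS)["mysql.port"])
print(tomcat_port == proxy_port(AGENT))
```

In the setup described here both values are 4041, so the configuration itself is consistent and the failure has to come from elsewhere.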
I noticed that we had some problems talking to the dev merlin server, so I restarted tomcat.
It gave the following error messages:
[root@dd01mvb-03 logs]# /opt/mysql/enterprise/monitor/mysqlmonitorctl.sh start tomcat
Using CATALINA_BASE: /opt/mysql/enterprise/monitor/apache-tomcat
Using CATALINA_HOME: /opt/mysql/enterprise/monitor/apache-tomcat
Using CATALINA_TMPDIR: /opt/mysql/enterprise/monitor/apache-tomcat/temp
Using JRE_HOME: /opt/mysql/enterprise/monitor/java
[root@dd01mvb-03 logs]#
==> catalina.out <==
Sep 16, 2009 10:19:16 AM org.apache.catalina.core.AprLifecycleListener init
INFO: The Apache Tomcat Native library which allows optimal performance in production environments was not found on the java.library.path: /opt/mysql/enterprise/monitor/java/lib/amd64/server:/opt/mysql/enterprise/monitor/java/lib/amd64:/opt/mysql/enterprise/monitor/java/../lib/amd64:/usr/java/packages/lib/amd64:/lib:/usr/lib
Sep 16, 2009 10:19:16 AM org.apache.coyote.http11.Http11Protocol init
INFO: Initializing Coyote HTTP/1.1 on http-80
Sep 16, 2009 10:19:16 AM org.apache.coyote.http11.Http11Protocol init
INFO: Initializing Coyote HTTP/1.1 on http-443
Sep 16, 2009 10:19:16 AM org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 999 ms
Sep 16, 2009 10:19:16 AM org.apache.catalina.core.StandardService start
INFO: Starting service Catalina
Sep 16, 2009 10:19:16 AM org.apache.catalina.core.StandardEngine start
INFO: Starting Servlet Engine: Apache Tomcat/6.0.14
Sep 16, 2009 10:19:17 AM org.apache.catalina.core.ApplicationContext log
INFO: Initializing Spring root WebApplicationContext
==> mysql-monitor.log <==
2009-09-16 10:19:18,478 INFO [main:com.mysql.graph] Using stream+ssps for graph data query
2009-09-16 10:19:18,497 INFO [main:com.mysql.graph] Using com.mysql.etools.monitor.bo.ServerVdmGraphReport for graph data
2009-09-16 10:19:19,395 WARN [main:com.mysql.sql] java.lang.Exception: MySQL server not running or accepting connections, retrying 49 times or 179 seconds, whichever expires first. Exception was: Communications link failure
The last packet successfully received from the server was 1,253,089,159,395 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.
2009-09-16 10:19:20,403 WARN [main:com.mysql.sql] java.lang.Exception: MySQL server not running or accepting connections, retrying 48 times or 178 seconds, whichever expires first. Exception was: Communications link failure
The last packet successfully received from the server was 1,253,089,160,403 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.
2009-09-16 10:19:21,410 WARN [main:com.mysql.sql] java.lang.Exception: MySQL server not running or accepting connections, retrying 47 times or 177 seconds, whichever expires first. Exception was: Communications link failure
The last packet successfully received from the server was 1,253,089,161,409 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.
2009-09-16 10:19:22,416 WARN [main:com.mysql.sql] java.lang.Exception: MySQL server not running or accepting connections, retrying 46 times or 176 seconds, whichever expires first. Exception was: Communications link failure
The last packet successfully received from the server was 1,253,089,162,416 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.
2009-09-16 10:19:23,423 WARN [main:com.mysql.sql] java.lang.Exception: MySQL server not running or accepting connections, retrying 45 times or 175 seconds, whichever expires first. Exception was: Communications link failure
The last packet successfully received from the server was 1,253,089,163,423 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.
2009-09-16 10:19:24,430 WARN [main:com.mysql.sql] java.lang.Exception: MySQL server not running or accepting connections, retrying 44 times or 174 seconds, whichever expires first. Exception was: Communications link failure
The last packet successfully received from the server was 1,253,089,164,430 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.
2009-09-16 10:19:25,437 WARN [main:com.mysql.sql] java.lang.Exception: MySQL server not running or accepting connections, retrying 43 times or 173 seconds, whichever expires first. Exception was: Communications link failure
The last packet successfully received from the server was 1,253,089,165,437 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.
[root@dd01mvb-03 logs]# 2009-09-16 10:19:26,444 WARN [main:com.mysql.sql] java.lang.Exception: MySQL server not running or accepting connections, retrying 42 times or 172 seconds, whichever expires first. Exception was: Communications link failure
The last packet successfully received from the server was 1,253,089,166,444 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.
2009-09-16 10:19:27,451 WARN [main:com.mysql.sql] java.lang.Exception: MySQL server not running or accepting connections, retrying 41 times or 171 seconds, whichever expires first. Exception was: Communications link failure
The last packet successfully received from the server was 1,253,089,167,451 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.
fg
tail -f catalina.out mysql-monitor.log
2009-09-16 10:19:28,461 WARN [main:com.mysql.sql] java.lang.Exception: MySQL server not running or accepting connections, retrying 40 times or 170 seconds, whichever expires first. Exception was: Communications link failure
The last packet successfully received from the server was 1,253,089,168,461 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.
So it is clear that tomcat could not talk to the merlin "backend" mysqld.
However, mysqld was running.
How to repeat:
N/A
Suggested fix:
Please fix the logging to indicate which server the failure relates to. Something like:
java.lang.Exception: Unable to reach MySQL server d-merlin-vip.dqs.lhr1.bbb.com:4041 as user service_manager, retrying 49 times or 179 seconds, whichever expires first.
Exception was: Communications link failure
This would make it clearer which server tomcat was having difficulties with, and as which user.
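The requested message only needs values the application already has from config.properties (mysql.server, mysql.port, mysql.user). The sketch below shows the idea in Python purely for illustration; the real service manager is a Java application, and the function name describe_failure and its parameters are invented for this example.

```python
def describe_failure(host, port, user, retries_left, seconds_left, cause):
    """Build a retry warning that names the target server and user (hypothetical helper)."""
    return (
        f"Unable to reach MySQL server {host}:{port} as user {user}, "
        f"retrying {retries_left} times or {seconds_left} seconds, "
        f"whichever expires first. Exception was: {cause}"
    )

# Values taken from config.properties and the first warning in the log above.
message = describe_failure(
    "d-merlin-vip.dqs.lhr1.bbb.com", 4041, "service_manager",
    49, 179, "Communications link failure",
)
print(message)
```

With port 4041 in the message, it would have been immediately obvious that the failing connection was the one going through the agent's proxy rather than directly to mysqld on 3306.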
I then assumed that the agent was not running and restarted it. However, the agent was running. The problem was that one of the configured instances was not reachable, and it seems that because of that the proxy was not working. This behaviour does not seem logical, as in our case the proxy configuration is completely independent of the agent's monitored instances.
I removed the "offending" instance (which was not running) from etc/instances/ and restarted the agent. The proxy started working, and tomcat was then able to speak to the backend merlin mysqld instance.
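To find such an unreachable instance without trial and error, a plain TCP probe of each configured hostname and port is usually enough. The following Python sketch is one way to do that. The helpers is_reachable and unreachable are hypothetical, and a successful connect only shows that something is listening on the port, not that the MySQL login would succeed.

```python
import socket

def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

def unreachable(instances, timeout=2.0):
    """Return the (host, port) pairs from instances that do not accept connections."""
    return [(h, p) for h, p in instances if not is_reachable(h, p, timeout)]

# Example call (not executed here), using the two instance names from
# etc/instances and assuming both listen on port 3306:
#   unreachable([("d-merlin-vip", 3306), ("avrdbprod-vip3", 3306)])
```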