Bug #42360 Support for MySQL + DRBD failover
Submitted: 27 Jan 3:12 Modified: 27 Jan 8:27
Reporter: Meiji KIMURA
Status: Verified
Category:Monitoring: Agent Severity:S4 (Feature request)
Version:2.0 OS:Any
Assigned to: Mark Matthews Target Version:
Tags: ha
Triage: Needs Triage: D5 (Feature request)

[27 Jan 3:12] Meiji KIMURA
Description:
I know that latest MEM(MySQL Enterprise Monitor) doen't support any HA configuration, so
agents don't support also that configuration.

I know that the feature request to MEM already filed as below,
Bug#39199

But please let me explain to concrete request to support specific configuration.

Recently many customer try to make a configuration of MySQL with DRBD as a reasonable HA
configuration. Under this configuration, I prepare two machines(Active/Standby), so DRBD
mirrors from Active to Standby. The heartbeat check Active machine and it will failover
to Standby machine when Active machine fails.

I want to continue to Standby machine after failover, but now it is very difficult to do
it. Please implement (or confirm) the feature(or procedure) described in 'Suggested
fix:'.

How to repeat:
Assume two machines for MySQL + DRBD. One is for Active, another is for Standby.

(1) Install DRBD for mirroring from Active to Standby.
(2) Install agents into each machines as local agent.
(3) I can monitor Active machine with MEM.
(4) After failover, I activate the agent for Standby machine.

But mysql.inventory alread mirrored from Active machine to Standby machine, so 
this error displayed in Agent log of Standby machine like this,

2009-01-26 13:53:03: (critical)
C:\cygwin\home\mysqldev\bs\merlin\agent-2.0\src\mysql-proxy-0.7.0r1185\plugins\agent\agent_mysqld.c:533:
[mysql] the hostid from mysql.inventory doesn't match our agent's host-id
(mac:{00166f6d8cb40001} != mac:{00166f6d8cb40000})
We assume that this is a cloned host and shutdown now.
Please TRUNCATE TABLE mysql.inventory on this mysql-instance and restart the agent.
If this is a master for replication, please also run SET SQL_LOG_BIN = 0; first.
2009-01-26 13:53:04: (critical) shutting down normally

Suggested fix:
[Workaround]

(1) Configuring an Agent to Monitor a Remote MySQL Server
http://dev.mysql.com/doc/refman/5.1/en/mem-install-agent-remote.html

  If I install an agent to Monitor a Remote(not local) MySQL Server and specify Virtual
IP for that agent, it will work well after failover. But there is a limitation for
getting an information OS and CPU of a machine of monitored MySQL. It is a severe
limitation for DBAs.

(2) TRUNCATE mysql.inventory table on a standby machine before activating an agent for
it.

  mysql.inventory was created when an agent was installed on an Active machine, so it was
mirrored from Active machine to Standby machine. It cause the error message of 'How to
repeat:' by inconsistency of host-id's.

I can continue to run by trancating mysql.inventory, but that operation re-create
mysql.inventory on a standby machine and old mysql.inventory on MEM dashboard won't be
deleted, so each operation makes old trash item. I don't know how to delete these items.

[Suggested fix]
(a) Please confirm to an operation of (2) (and add method for deleting obsoleted item on
MEM, or other method if needed) as an authorized procedure.
(b) Implement features for (a) as a new feature of MEM(or Agents).

Regards,
[28 Jan 3:19] Meiji KIMURA
[Suggested fix]
(3) provide a method to take over contens of mysql.inventory from Active to Client.

An example of contents of mysql.inventory is,
+--------+--------------------------------------+
| name   | value                                |
+--------+--------------------------------------+
| uuid   | 2273a4a8-4469-4779-9750-66bb33251bda |
| hostid | mac:{001c239da3e80000}               |
+--------+--------------------------------------+

uuid is changed by re-creating, but hostid is same.

I guess hostid is generated by these (1) or (2) methods.

(1) 'hostid' is generated by an identifier based on physical address of ethernet
connection without SSL. (Here is an output of ipconfig /all on windows.)

Ethernet adapter local area connection:
   Connection-specific DNS Suffix:
   Description . . . . . . . . . : Broadcom 440x 10/100 Integrated Controller
   Physical Address. . . . . . . : 00-1C-23-9D-A3-E8

(2) 'hostid' is generated by using these command. It is displayed by a warning message of
an agent.

(warning) agent.c:428: extracted the host-id from 'sh -c "ssh-keygen -l -f
/etc/ssh/ssh_host_key.pub | awk '{print \"ssh:{\"$2\"}\"}'"'

I know that it is difficult to change physical address of (1), but it is easy to change
an agument of (2) like,

'sh -c "ssh-keygen -l -f ssh_host_key_of_active_machine | awk '{print
\"ssh:{\"$2\"}\"}'"'

Please consider this option when you implement new features for this feature request.

Regards,
[27 May 14:30] Oli Sennhauser
Possibly duplicate of 40695.