Bug #80686 Cannot Add a Server with a UUID that has Previously Been Used
Submitted: 10 Mar 2016 10:33
Modified: 4 May 2016 18:48
Reporter: James Bewley
Status: No Feedback
Category: MySQL Enterprise Monitor
Severity: S1 (Critical)
Version: 3.1.1
OS: Any
Assigned to: Assigned Account
CPU Architecture: Any
Tags: fabric, re-add, UUID

[10 Mar 2016 10:33] James Bewley
Description:
When a server in the replicated setup has been offline for a while, MySQL Monitor loses its ability to monitor it and reports it as 'unreachable'. When the server is brought back online, it can be contacted by other services on the same machine, but MySQL Monitor continues to report it as 'unreachable'.

If you try to delete the server and re-add it, MySQL Monitor doesn't report any failures via the web UI, but the server never shows up.

Checking the logs shows that it is still holding on to some configuration for the UUID and is deciding to use a different IP address than the one specified.

2016-03-10 10:27:58,724 INFO [configMysql-task-1::com.mysql.etools.agent.collection.MysqlConnection] Successfully connected to mysql via 10.44.13.144:3306
2016-03-10 10:27:58,744 INFO [configMysql-task-1::com.mysql.etools.agent.order.MysqlConnectionService] Edited connection 7c663507-43b3-36cc-a976-af67a8abe3f8
2016-03-10 10:27:58,960 INFO [configMysql-task-1::com.mysql.etools.agent.collection.mysql.ServerIdentityProvider$ServerUuidSource] Creating new MySQL server identity of d2a45e56-db0e-11e5-b450-00155d025943 based on global server_uuid
2016-03-10 10:27:58,963 INFO [configMysql-task-1::com.mysql.etools.agent.collection.MysqlConnection] Successfully determined mysql 10.44.13.144:3306 has uuid d2a45e56-db0e-11e5-b450-00155d025943
2016-03-10 10:27:58,964 INFO [configMysql-task-1::com.mysql.etools.agent.collection.MysqlConnection] Successfully determined mysql 10.44.13.144:3306 has host id {com.mysql.etools.inventory.model.os.Host : sid:{S-1-5-21-944376373-3705882341-2035687038}}
2016-03-10 10:28:19,693 INFO [mysql-availability-ping-10.44.13.114:3306-thread-0::com.mysql.etools.agent.collection.MysqlConnection] (repeated 26 times) timed out waiting for jdbc 10.44.13.114:3306
2016-03-10 10:28:19,693 INFO [mysql-availability-ping-10.44.13.114:3306-thread-0::com.mysql.etools.agent.collection.MysqlConnection] timed out waiting for jdbc 10.44.13.114:3306
2016-03-10 10:28:21,694 INFO [mysql-availability-ping-10.44.13.114:3306-thread-0::com.mysql.etools.agent.collection.MysqlConnection] timed out waiting for jdbc 10.44.13.114:3306
2016-03-10 10:28:26,801 ERROR [cme.schedule.Scheduler-15::com.mysql.etools.agent.collection.mysql.sys.SysSchemaProvider] Unable to check for the existence of the SYS schema
com.mysql.etools.agent.collection.MysqlConnection$BadConnectionBypassException: Creating driver instance; SQL []; Communications link failure
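
The address mismatch in the excerpt above (a successful connection to 10.44.13.144:3306 while the availability ping keeps timing out against 10.44.13.114:3306) can be spotted mechanically. The following is a minimal sketch, not part of MySQL Enterprise Monitor: the function names are hypothetical, and it assumes the agent log keeps the "Successfully connected" and "timed out waiting for jdbc" wording shown in the excerpt.

```python
import re
from collections import defaultdict

# Matches an IPv4 address followed by a port, e.g. 10.44.13.144:3306
ENDPOINT = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d+)\b")


def endpoints_by_outcome(log_text):
    """Group the endpoints found in agent log lines by connection outcome.

    Assumption: the message wording matches the log excerpt in this report.
    """
    seen = defaultdict(set)
    for line in log_text.splitlines():
        match = ENDPOINT.search(line)
        if not match:
            continue
        endpoint = match.group(0)
        if "Successfully connected" in line:
            seen["connected"].add(endpoint)
        elif "timed out waiting for jdbc" in line:
            seen["timed_out"].add(endpoint)
    return dict(seen)


def stale_endpoints(log_text):
    """Return endpoints that only ever timed out and never connected."""
    outcome = endpoints_by_outcome(log_text)
    stale = outcome.get("timed_out", set()) - outcome.get("connected", set())
    return sorted(stale)


# Two lines copied from the log excerpt in this report.
SAMPLE = """\
2016-03-10 10:27:58,724 INFO [configMysql-task-1::com.mysql.etools.agent.collection.MysqlConnection] Successfully connected to mysql via 10.44.13.144:3306
2016-03-10 10:28:19,693 INFO [mysql-availability-ping-10.44.13.114:3306-thread-0::com.mysql.etools.agent.collection.MysqlConnection] timed out waiting for jdbc 10.44.13.114:3306
"""

if __name__ == "__main__":
    print(stale_endpoints(SAMPLE))
```

An endpoint that only ever appears in timeout messages is a candidate for leftover configuration that the agent is still polling.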

How to repeat:
Add a server to MySQL monitor
Delete it
Add it again with a different ID
Delete it
Add it again with the original IP

Suggested fix:
When a server is deleted ALL records for the UUID should be removed.
[18 Mar 2016 10:18] MySQL Verification Team
There is a way to make a permanent deletion. You could check to see if you still have some of the different UUID records or IP addresses on the inventory page.

https://dev.mysql.com/doc/mysql-monitor/3.1/en/mem-advanced-inventory-using.html

After deleting the server from the GUI, delete its entries from the inventory page as well, if any still exist.

Then the server should be rediscovered and appear on the dashboard as any previous server information would be removed from the inventory.
[30 Mar 2016 17:27] Mark Matthews
Is this using an agent external to the service manager, or the agent built into the service manager? It looks like an external agent was stopped before it got the message to delete the configuration for the monitored target, and thus is still attempting to monitor it after being restarted. There are manual ways to clean that up, but we'd need to know if you're using the external agent to give clear instructions.
[4 Apr 2016 9:05] James Bewley
By 'Inventory Page' do you mean the page titled 'MySQL Instances'? If so, then yes, this is what I am doing, and it does not permanently delete the instance.

I am using the internal agent shipped with MySQL monitor.

Best,
James
[4 Apr 2016 9:06] James Bewley
The agent shows up in the 'inventory page' as an instance with version 3.1.1.7806
[4 Apr 2016 9:49] MySQL Verification Team
I think you still mean the 'Instances' tab? My link wasn't very good; this one has the access details and URL needed for the inventory page (https://ServiceManagerHost:PortNumber/v3/inventory).

https://dev.mysql.com/doc/mysql-monitor/3.1/en/mem-advanced-inventory-gui.html

This is the higher level 'Inventory' pages link.
https://dev.mysql.com/doc/mysql-monitor/3.1/en/mem-advanced-inventory.html
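
As a small illustration of the URL pattern given above, here is a sketch that fills in the ServiceManagerHost and PortNumber placeholders. The helper name is hypothetical, and the host name and port in the usage line are made-up example values, not taken from this report.

```python
from urllib.parse import urlunsplit


def inventory_url(host, port):
    """Build the inventory page URL from the pattern quoted in this thread.

    Both arguments are placeholders for your own Service Manager host
    name and HTTPS port.
    """
    return urlunsplit(("https", f"{host}:{port}", "/v3/inventory", "", ""))


if __name__ == "__main__":
    # Example values only; substitute the real host and port.
    print(inventory_url("mem.example.com", 18443))
```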
[4 Apr 2016 9:55] James Bewley
Hi Roger,

I have deleted all instances and have only the agent left.

The inventory shows two entries in 'All Hosts'. The first is the hostname of the server with MySQL Monitor installed, and so is the second, judging by the volumes attached to it.

The server I am trying to add doesn't show anywhere in here.

GLA2ARM01 {os.Os : sid:{S-1-5-21-944376373-3705882341-2035687038}} 

{os.Os : sid:{S-1-5-21-3492764395-430688799-2866433683}}
[4 Apr 2016 10:07] James Bewley
I found that MySQL monitor had kept hold of some configuration in the following:

agent.MysqlConnectionConfiguration 

Deleting this allowed me to add the servers.
[4 Apr 2016 10:26] MySQL Verification Team
OK, sounds like you are back on track, which is good. Developers will have to take a look and try and reproduce that problem from your description.
[4 Apr 2016 18:48] Mark Matthews
From the description, I'm not sure of the exact steps that happened here (mostly sequence, timing), but it *sounds* like, 

(1) the host being monitored was partitioned away from being able to reach the service manager, agent and the MySQL server being monitored.
(2) the mysql instance was deleted, and re-added, but the original agent did not remove the configuration to monitor the mysql instance, and thus it's being doubly-reported, unreachable in one case, up in the other, but not appearing in the user interface.

Part of the problem is the ambiguity of the word "server": do you mean "host" or "MySQL server" in this case? Given that the IP address changed, did the mysqld instance get moved to another host but keep the same UUID? If so, was the original agent decommissioned?

If possible, and you remember, could you state specifically what things happened, on which hosts (use IPs if it helps), in which order to help us reproduce this? We test the scenario as described quite often, so it seems like there's some small bit of information missing that would help us reproduce this issue.
[5 May 2016 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".