Description:
I am writing two new rules that need to know the version of the MySQL server being monitored. This should be easy, because there is supposed to be a DC item called 'version':
mysql> show global variables like 'version';
+---------------+----------------------+
| Variable_name | Value                |
+---------------+----------------------+
| version       | 5.0.28-enterprise-nt |
+---------------+----------------------+
And indeed the Merlin server knows about it. We have a 'version' DC item for the mysql server and for the OS:
+---------+-----------+--------+---------+
| item_id | namespace | source | attrib  |
+---------+-----------+--------+---------+
|     494 | mysql     | server | version |
|      47 | os        | os     | version |
+---------+-----------+--------+---------+
Unfortunately, when I tried to schedule the rule I got a message saying 'version' wasn't known. So I looked in dc_known_items and it only knows about the OS version, not the MySQL version:
mysql> select i.item_id, namespace, source, attrib from dc_items i, dc_known_items k where i.item_id = k.item_id and k.agent_id = 1 and i.attrib='version' order by attrib;
+---------+-----------+--------+---------+
| item_id | namespace | source | attrib  |
+---------+-----------+--------+---------+
|      47 | os        | os     | version |
+---------+-----------+--------+---------+
When I set log-level=debug in the agent's INI file, I see the agent only reporting one 'version' item:
<itemList><attribType>INTEGER</attribType><nameSpace>mysql</nameSpace><className>server</className><attribName>Uptime</attribName></itemList>
<itemList><attribType>FLOAT</attribType><nameSpace>os</nameSpace><className>disk</className><attribName>used</attribName><isCounter>false</isCounter></itemList>
<itemList><attribType>VARCHAR</attribType><nameSpace>os</nameSpace><className>os</className><attribName>version</attribName><isCounter>false</isCounter></itemList>
<itemList><attribType>VARCHAR</attribType><nameSpace>mysql</nameSpace><className>server</className><attribName>version_comment</attribName></itemList>
So I think there's an agent bug that requires each attribute to have a unique name, even when the attributes belong to different namespace/source combinations.
How to repeat:
Create and schedule a rule that uses the MySQL version data item.
Suggested fix:
I found the following in merlin-agent.c:
const os_dcs known_os_dcs[] = {
{ "os", "os", "name", 0, DC_TYPE_VARCHAR },
{ "os", "os", "version", 0, DC_TYPE_VARCHAR },
{ "os", "cpu", "name", 0, DC_TYPE_VARCHAR },
{ "os", "cpu", "cpu_idle", 0, DC_TYPE_FLOAT },
{ "os", "cpu", "cpu_sys", 0, DC_TYPE_FLOAT },
{ "os", "cpu", "cpu_wait", 0, DC_TYPE_FLOAT },
{ "os", "cpu", "cpu_user", 0, DC_TYPE_FLOAT },
{ "os", "mem", "ram_total", 0, DC_TYPE_FLOAT },
{ "os", "mem", "ram_unused", 0, DC_TYPE_FLOAT },
{ "os", "mem", "swap_total", 0, DC_TYPE_FLOAT },
{ "os", "mem", "swap_unused", 0, DC_TYPE_FLOAT },
{ "os", "disk", "capacity", 0, DC_TYPE_FLOAT },
{ "os", "disk", "used", 0, DC_TYPE_FLOAT },
{ "os", "disk", "bytes_in", 1, DC_TYPE_FLOAT },
{ "os", "disk", "bytes_out", 1, DC_TYPE_FLOAT },
{ "os", "disk", "name", 0, DC_TYPE_VARCHAR },
You can see there's an OS DC item named 'version'. You can also see 3 items named 'name' (for os, cpu, & disk), but only one (disk) appears in dc_known_items.
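To make the duplicates easy to spot, here is a minimal sketch that scans the entries quoted above and counts those whose attrib name was already used by an earlier entry. The struct layout, its field names, the dc_type enum, and the helper count_attrib_collisions are all assumptions made for illustration; the real os_dcs definition is not shown in this report, and the meaning of the fourth field is a guess based on the isCounter element in the agent log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical definitions: the real ones in merlin-agent.c are not shown. */
typedef enum { DC_TYPE_VARCHAR, DC_TYPE_FLOAT } dc_type;

typedef struct {
    const char *name_space;
    const char *source;
    const char *attrib;
    int is_counter;          /* guessed meaning of the fourth field */
    dc_type type;
} os_dcs;

/* Only the entries quoted in this report; the real table may be longer. */
static const os_dcs known_os_dcs[] = {
    { "os", "os",   "name",        0, DC_TYPE_VARCHAR },
    { "os", "os",   "version",     0, DC_TYPE_VARCHAR },
    { "os", "cpu",  "name",        0, DC_TYPE_VARCHAR },
    { "os", "cpu",  "cpu_idle",    0, DC_TYPE_FLOAT },
    { "os", "cpu",  "cpu_sys",     0, DC_TYPE_FLOAT },
    { "os", "cpu",  "cpu_wait",    0, DC_TYPE_FLOAT },
    { "os", "cpu",  "cpu_user",    0, DC_TYPE_FLOAT },
    { "os", "mem",  "ram_total",   0, DC_TYPE_FLOAT },
    { "os", "mem",  "ram_unused",  0, DC_TYPE_FLOAT },
    { "os", "mem",  "swap_total",  0, DC_TYPE_FLOAT },
    { "os", "mem",  "swap_unused", 0, DC_TYPE_FLOAT },
    { "os", "disk", "capacity",    0, DC_TYPE_FLOAT },
    { "os", "disk", "used",        0, DC_TYPE_FLOAT },
    { "os", "disk", "bytes_in",    1, DC_TYPE_FLOAT },
    { "os", "disk", "bytes_out",   1, DC_TYPE_FLOAT },
    { "os", "disk", "name",        0, DC_TYPE_VARCHAR },
};

#define KNOWN_OS_DCS_COUNT (sizeof known_os_dcs / sizeof known_os_dcs[0])

/* Count entries whose attrib name already appeared earlier in the table. */
static size_t count_attrib_collisions(const os_dcs *items, size_t n)
{
    size_t collisions = 0;

    assert(items != NULL);
    for (size_t i = 1; i < n; i++) {
        for (size_t j = 0; j < i; j++) {
            if (strcmp(items[i].attrib, items[j].attrib) == 0) {
                collisions++;
                break;   /* count each colliding entry once */
            }
        }
    }
    return collisions;
}
```

Calling count_attrib_collisions(known_os_dcs, KNOWN_OS_DCS_COUNT) on the quoted entries reports the two extra 'name' items (cpu and disk). The MySQL 'version' item is defined elsewhere, so it would not show up in a scan of this table alone.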
So I made the following change:
- { "os", "os", "name", 0, DC_TYPE_VARCHAR },
- { "os", "os", "version", 0, DC_TYPE_VARCHAR },
+ { "os", "os", "os_name", 0, DC_TYPE_VARCHAR },
+ { "os", "os", "os_version", 0, DC_TYPE_VARCHAR },
- { "os", "cpu", "name", 0, DC_TYPE_VARCHAR },
+ { "os", "cpu", "cpu_name", 0, DC_TYPE_VARCHAR },
- { "os", "disk", "name", 0, DC_TYPE_VARCHAR },
+ { "os", "disk", "disk_name", 0, DC_TYPE_VARCHAR },
I also updated the server's items-os.xml file accordingly. After recompiling the agent and server, everything now works fine.
An alternative fix is to allow duplicate attrib names as long as they are in a different namespace/source. I suspect the agent keeps a hash of DC items keyed on attrib name only. However, I still think giving these items distinct attrib names is a good idea from the perspective of an end user writing rules, and it allows us to fix the problem now.
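As a rough sketch of that alternative, the lookup below treats two items as the same only when namespace, source, and attrib all match. The dc_key struct and the dc_key_equal and dc_find helpers are hypothetical names invented for this example; I have not confirmed how the agent actually stores or looks up its items, so this only illustrates the idea of a composite key.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical item key; not the agent's actual data structure. */
typedef struct {
    const char *name_space;
    const char *source;
    const char *attrib;
} dc_key;

/* Two keys are equal only if all three parts match. */
static int dc_key_equal(const dc_key *a, const dc_key *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a->name_space, b->name_space) == 0
        && strcmp(a->source, b->source) == 0
        && strcmp(a->attrib, b->attrib) == 0;
}

/* Return the index of key in items, or -1 if it is not registered. */
static int dc_find(const dc_key *items, size_t n, const dc_key *key)
{
    for (size_t i = 0; i < n; i++) {
        if (dc_key_equal(&items[i], key))
            return (int)i;
    }
    return -1;
}

/* The two 'version' items from the report, registered side by side. */
static const dc_key example_items[] = {
    { "mysql", "server", "version" },
    { "os",    "os",     "version" },
};
```

With this comparison, looking up { "mysql", "server", "version" } and { "os", "os", "version" } in example_items finds two separate entries, whereas a comparison on attrib alone would treat them as one. A real hash table would also need a hash function that combines all three strings.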