Bug #67236 MySQL 5.6 host cache does not seem to record info uniquely by ip, host
Submitted: 15 Oct 2012 15:54 Modified: 15 Jan 2013 6:52
Reporter: Simon Mudd (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Performance Schema Severity:S3 (Non-critical)
Version:5.6.5,5.6.6,5.6.7-rc OS:Any
Assigned to: Marc Alff CPU Architecture:Any
Tags: windmill
Triage: Needs Triage: D2 (Serious)

[15 Oct 2012 15:54] Simon Mudd
Description:
The new MySQL 5.6 host_cache is supposed to give information on the ip to host mapping and also additional connection errors. Yet in 5.6.6. it seems to not contain information which is unique by ip, host which would mean that the cache would be wiped out in ms on a busy server (getting thousands of connects per second), thus completely defeating the purpose of the cache.

How to repeat:
root@my-5.6.6-master [performance_schema]> select ip, host, count(*) from host_cache group by ip, host having  count(*) > 1;
+----------------+-------------------------------------+----------+
| ip             | host                                | count(*) |
+----------------+-------------------------------------+----------+
| 10.AAA.CCC.5   | md01app-01.xxxxxxxxxxxxxxxxx.com    |       41 |
| 10.AAA.CCC.7   | md01app-02.xxxxxxxxxxxxxxxxx.com    |       42 |
| 10.BBB.DDD.174 | dc01kvmuser-24.dqs.xxxxxxxxxxxx.com |       42 |
+----------------+-------------------------------------+----------+
3 rows in set (0.01 sec)

Suggested fix:
Ensure that this table is unique by ip/host.

Also confirm in the documentation the expiry policy, which is, I guess removing entries which match MIN(FIRST_SEEN,IP,HOSTNAME)  ??
[15 Oct 2012 16:11] Simon Mudd
Given the default setting of the host_cache may not be large enough, please provide some idea of the overhead if the default value needs to be increased, both in memory size but also in resources (scanning a larger table takes longer, even if done in memory).
[15 Oct 2012 16:33] Simon Mudd
Maybe actually the uniqueness should be by ip, not (ip, host). Either way the issue exists.
[16 Oct 2012 2:25] Miguel Solorzano
Thank you for the bug report.
[16 Oct 2012 7:05] Simon Mudd
Note: also seen in 5.6.5.
[16 Oct 2012 15:59] Simon Mudd
Also seen in 5.6.7.
[22 Oct 2012 15:11] Marc Alff
Some findings and clarifications, after analysis.

1)

From the behavior seen and the analysis done, it seems that the host cache
internal data is actually ok, but some rows are displayed multiple times
in table performance_schema.host_cache

The data is not wiped under load, and the host cache is still functional.

The fact that the same IP can be reported multiple times is a verified bug.

2)

The host cache maintains internally a unique index, by IP address alone.
This index is implemented with a HASH.

3)

In terms of CPU overhead, given that a HASH is used, the CPU overhead is not expected to grow significantly as the number of client host grows, and it should definitively not grow in O(N).

4)

In terms of memory overhead, the host cache consist of at most N records, where each record is of sizeof(Host_entry), or 336 bytes currently.

Even with thousands of client machines, and when keeping stats for a lot of clients, a host cache of 10,000 entries cost around 3.2 Mb of memory.

5)

About the expiry policy,

- each time an entry is looked at, the LAST_SEEN column is updated
- when the table is full, the column with min(LAST_SEEN) is dropped.

That is, the least recently used (LRU) entry is dropped.

Regards,
-- Marc
[23 Oct 2012 5:34] Simon Mudd
Hi Marc,

Thanks for the update to various questions I asked. That info is most useful.  If you can please ensure it reaches the documentation pages.

My intention is to probably increase this cache size to something like 4096 on all 5.6 hosts as this should cover the current "expected" number of different possible clients by a sufficient margin, and thus reduce any further DNS lookups.

The fact I see more than a single row per ip is still a bit disconcerting as I was looking for data on "connect errors" and it looked like this table should provide a good overview of the data (except for the fact that the number of hosts connecting to the servers was too large to fit into the default size host-cache).

I gather you are working on that issue still.
[14 Jan 2013 16:37] Paul Dubois
Noted in 5.6.9, 5.7.0 changelogs.

The Performance Schema host_cache table displayed some lines multiple
times. This was not an issue with the host cache itself, only with
the table that provides information about the cache contents.
[14 Jan 2013 18:24] Paul Dubois
Added a note to
http://dev.mysql.com/doc/refman/5.6/en/host-cache.html
that if the host cache is full when an entry is to be stored,
the least recently used entry is discarded.
[14 Jan 2013 21:49] Simon Mudd
Is there a counter for this "discarded entry as host_cache full" status?
If not can we add one as otherwise it's impossible to know if this is an issue or not, and a full, but not overflowed host_cache table can't be distinguished from an overflowed host_cache table (that probably needs extending...).

If there's no such counter I'll add an appropriate feature request. I think that's important given the complete opaqueness of the host cache in 5.5. The (now fixed) 5.6 implementation resolves a lot of these issues and the counter would allow us to measure when the size is not big enough.
[14 Jan 2013 22:54] Marc Alff
Hi Simon.

Let's say you have N hosts.

With a host cache size of N, you are correct that there is no way to distinguish a full but not overflowed host cache from a full and overflowed host cache ...

A suggestion then: use a host cache size of N+1 when expecting N entries: a (almost) full and not overflowed host cache will have at most N rows, while a full and overflowed (?) host cache will have N+1 rows.

Monitoring count(*) to make sure there is at least 1 free entry guarantees that no overflow occurred, if this is the metric of interest here.

Another suggestion: by definition the next entry that will be discarded if needed to make room is the last recently used host, for which the LAST_SEEN timestamp is the minimum.

Monitoring the value of MIN(LAST_SEEN) indicates for how long the host cache has been unchanged.

For example, a MIN(LAST_SEEN) 5 days in the past means no entry has been evicted from the host cache during the last 5 days ...

On a side note, very few people get notifications for comments on closed bugs, which are definitively not monitored, I just happen to have seen this one since I am assigned to it.

Regards,
-- Marc Alff
[14 Jan 2013 23:18] Marc Alff
Correction to the previous comments ...

Monitoring MIN(LAST_SEEN) works only if some host rarely connects to the server, for example very infrequent connections from an admin console.

The distribution (and frequency) of connections from clients can have an impact here.

Count(*) < host_cache_size seems better.
[15 Jan 2013 6:52] Simon Mudd
Filed separate feature request: http://bugs.mysql.com/bug.php?id=68085.