Bug #67422 mysqld fails to switch DNS resolver if thread_cache_size > 0
Submitted: 30 Oct 2012 13:50
Reporter: Leandro Morgado
Status: Verified
Category: MySQL Server: Options
Severity: S3 (Non-critical)
Version: 5.5.28
OS: Linux
Assigned to: Marc Alff
CPU Architecture: Any

[30 Oct 2012 13:50] Leandro Morgado
Description:
MySQL uses DNS to do reverse resolution (IP->Domain) of incoming connections. It caches the results in the host cache which can be cleared with FLUSH HOSTS. When a new connection arrives, it first checks the host cache and only then queries the DNS servers specified in /etc/resolv.conf.
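
The lookup order described above can be illustrated with a small toy model. This is only a sketch, not the server's actual host cache: the class name HostCache and the resolve_ptr hook are hypothetical, and the hook stands in for whatever performs the real reverse DNS query.

```python
from typing import Callable, Dict, Optional


class HostCache:
    """Toy model of a reverse-lookup cache keyed by client IP address."""

    def __init__(self, resolve_ptr: Callable[[str], Optional[str]]) -> None:
        # resolve_ptr is a hypothetical hook standing in for the real
        # reverse DNS (PTR) query.
        self._resolve_ptr = resolve_ptr
        self._entries: Dict[str, Optional[str]] = {}

    def lookup(self, ip: str) -> Optional[str]:
        # Check the cache first; only query DNS on a miss.
        if ip not in self._entries:
            self._entries[ip] = self._resolve_ptr(ip)
        return self._entries[ip]

    def flush(self) -> None:
        # Counterpart of FLUSH HOSTS in this model: forget every answer,
        # so the next connection triggers a fresh DNS query.
        self._entries.clear()
```

In this model, a second lookup() for the same address never reaches the resolver until flush() is called, which is why the test case below flushes the host cache after editing resolv.conf.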

In our scenario, the DNS servers specified in /etc/resolv.conf are changed (say from DNS1 to DNS2) while mysqld is running, and the host cache is then flushed. MySQL is expected to query DNS2 for reverse resolution from that point on. However, if the thread cache is enabled (thread_cache_size greater than 0), mysqld fails to query the new DNS2 server and keeps using the old DNS1 server.

How to repeat:
Short version is described below:

- Set up a nameserver (192.168.56.102) in resolv.conf on the host (192.168.56.101) where mysqld is running
- Start mysqld (do not disable name resolution)
- Connect with a client from another host (192.168.56.102)
- Run SHOW PROCESSLIST and note the hostname.domain of the client connection
- Replace the nameserver in resolv.conf with another one (192.168.56.103) and disable the original one (kill it, filter it in iptables, or otherwise make it unreachable)
- Make sure the new nameserver (192.168.56.103) returns a different value for the 192.168.56.102 PTR record
- Flush the host cache in mysqld with FLUSH HOSTS
- Connect the client again from 192.168.56.102
- Run SHOW PROCESSLIST and check whether the hostname.domain of the 192.168.56.102 client has changed
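
When following these steps it can help to double-check which nameserver resolv.conf currently lists and which PTR name is being queried for the client. The helper below is a rough sketch for that purpose only; the function names are made up for this example and the comment handling is simplified compared to the real resolv.conf syntax.

```python
import ipaddress
from typing import List


def nameservers(resolv_conf_text: str) -> List[str]:
    """Return the addresses found on 'nameserver' lines."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Simplification: drop anything after '#' or ';' as a comment.
        line = line.split("#", 1)[0].split(";", 1)[0]
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def ptr_name(ip: str) -> str:
    """Return the DNS name that a reverse (PTR) lookup of ip asks for."""
    return ipaddress.ip_address(ip).reverse_pointer
```

For the client in this test case, ptr_name("192.168.56.102") gives 102.56.168.192.in-addr.arpa, which is the record both nameservers must answer differently.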

My tests showed that, with the thread cache disabled, mysqld queries the new DNS server and obtains its results after resolv.conf is changed and FLUSH HOSTS is run.

HOWEVER, if thread_cache_size > 0, then mysqld never switches to the new DNS server and keeps using the original one. FLUSH HOSTS doesn't make a difference.

Suggested fix:
mysqld should query the new DNS server, regardless of thread_cache_size.
[30 Oct 2012 14:01] Leandro Morgado
The attached file contains a very detailed technical description of the tests made to find this bug.

Attachment: long_test_case_67422.txt (text/plain), 16.02 KiB.

[31 Oct 2012 3:34] Davi Arnaut
In glibc the resolver state is thread-local and is initialized at the time of the first name resolution. This means that any thread will likely read the configuration files (resolv.conf) only once.
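
A small model of that behavior, for illustration only. This is not glibc code: the names are invented, and threading.local merely plays the role of the per-thread resolver state. It shows why a reused (cached) thread keeps the old server while a newly created thread picks up the new one.

```python
import threading
from typing import Callable, List

_state = threading.local()  # per-thread resolver state in this model


def nameserver_for_this_thread(read_config: Callable[[], str]) -> str:
    # Read the configuration only on the first lookup made by this thread.
    if not hasattr(_state, "nameserver"):
        _state.nameserver = read_config()
    return _state.nameserver


def demo() -> List[str]:
    config = {"nameserver": "DNS1"}  # stands in for resolv.conf
    seen: List[str] = []

    def read_config() -> str:
        return config["nameserver"]

    def cached_thread(changed: threading.Event, done: threading.Event) -> None:
        # One long-lived thread serving two connections in a row.
        seen.append(nameserver_for_this_thread(read_config))
        done.set()
        changed.wait()
        seen.append(nameserver_for_this_thread(read_config))

    changed, done = threading.Event(), threading.Event()
    worker = threading.Thread(target=cached_thread, args=(changed, done))
    worker.start()
    done.wait()
    config["nameserver"] = "DNS2"  # configuration edited while the thread lives
    changed.set()
    worker.join()

    # A brand-new thread has no state yet, so it reads the new configuration.
    fresh = threading.Thread(
        target=lambda: seen.append(nameserver_for_this_thread(read_config))
    )
    fresh.start()
    fresh.join()
    return seen
```

demo() returns ["DNS1", "DNS1", "DNS2"]: the reused thread sticks to DNS1 after the change, and only the fresh thread sees DNS2. With thread_cache_size = 0 every connection gets a fresh thread, which would match the reported difference.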
[31 Oct 2012 8:02] Marc Alff
Hi Davi.

Yes, that would explain the behavior seen.

Thanks for commenting on this.

Regards,
-- Marc
[16 Nov 2012 16:55] Hartmut Holzgraefe
known glibc issue, see e.g.

  http://www.sourceware.org/ml/libc-alpha/2010-03/msg00021.html

there are different patches around but none seem to have made it
into glibc, or at least the 2.12 version used in CentOS 6.3, yet.

Debian has merged the original patch though, so Debian, Ubuntu 
and other Debian-based distributions are not affected ...

strace shows that on these distributions a stat(resolv.conf)
is performed on every client connect, and if resolv.conf has 
changed it is re-read.
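
The stat-then-reread pattern seen in the strace output can be sketched roughly as follows. This only illustrates the general technique and is not the Debian patch itself; the class name and the choice of comparing mtime plus size are assumptions made for the example.

```python
import os
from typing import Optional, Tuple


class ReloadingConfig:
    """Re-read a file only when stat() reports a different mtime or size."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._stamp: Optional[Tuple[int, int]] = None
        self._text = ""
        self.reads = 0  # how many times the file was actually opened

    def text(self) -> str:
        # Cheap check on every call: one stat(), no open() unless needed.
        st = os.stat(self._path)
        stamp = (st.st_mtime_ns, st.st_size)
        if stamp != self._stamp:
            with open(self._path, "r", encoding="utf-8") as handle:
                self._text = handle.read()
            self._stamp = stamp
            self.reads += 1
        return self._text
```

Calling text() repeatedly on an unchanged file costs one stat() per call and a single open() overall, which matches the pattern of one stat per connect and a re-read only after a change.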

With CentOS and unpatched libc I can only see a single open()
on resolv.conf when the first client connects, then never again.
[16 Nov 2012 22:28] Hartmut Holzgraefe
Adding a call to res_init() (on platforms where it is part of libc) at the very beginning of hostname_cache_refresh() seems to fix this.
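
To illustrate the idea of forcing a re-initialization when the host cache is flushed, here is a toy model. It is an assumption about how such a request could reach every thread, not a description of glibc internals or of the actual server change; reinit() and the generation counter are invented for this sketch.

```python
import threading
from typing import Callable

_generation = 0  # bumped by every reinit() call
_lock = threading.Lock()
_state = threading.local()  # per-thread resolver state in this model


def reinit() -> None:
    """Model of the re-initialization requested when the cache is flushed."""
    global _generation
    with _lock:
        _generation += 1


def nameserver(read_config: Callable[[], str]) -> str:
    # Reload if this thread never read the config, or if reinit() has been
    # called since this thread last read it.
    if getattr(_state, "generation", None) != _generation:
        _state.value = read_config()
        _state.generation = _generation
    return _state.value
```

In this model a thread keeps its stale nameserver until reinit() is called, after which its next lookup re-reads the configuration, even if the thread itself was never restarted.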
[7 Jan 2015 19:03] Sinisa Milivojevic
Verified, but the problem is in glibc.