Bug #84475 Frequent slave reconnects and zombies thread warnings when master is idle
Submitted: 11 Jan 2017 22:48 Modified: 23 Jan 2017 12:19
Reporter: Oliver Welter
Status: Not a Bug
Category: MySQL Server: Replication Severity: S3 (Non-critical)
Version: 5.7.11 OS: Ubuntu
Assigned to: CPU Architecture: Any
Tags: connect_retry, zombie dump thread

[11 Jan 2017 22:48] Oliver Welter
Description:
Prerequisite: Master-Slave replication where the master sees only a few updates.

If the master does not receive any "replicatable" statement within the slave's timeout window, the slave connection times out. As of 5.7.7 the default value of slave_net_timeout is 60 seconds, so if no replicatable statement occurs on the master for 60 seconds, the slave reconnects.

In addition, the master does not recognize the timeout and logs "found a zombie dump thread with the same UUID.". It seems that the frequent reconnects have a negative impact on overall performance; at the very least, the logs fill up with useless warnings.

The problem gets worse in a Master-Master setup with a "hot standby" slave: because the standby slave sees no updates, this happens every 60 seconds.

How to repeat:
Deploy Master-Slave replication on a version >5.7.7 with default settings, and do not perform any writes on the master.
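
When reproducing, it can help to confirm which timeout and heartbeat values the slave is actually using. The statements below are a minimal sketch, assuming a 5.7 slave with performance_schema enabled; they only read settings and change nothing.

```sql
-- On the slave: read timeout for the replication connection
-- (the report states the default is 60 seconds as of 5.7.7)
SHOW GLOBAL VARIABLES LIKE 'slave_net_timeout';

-- On the slave: heartbeat interval configured for each replication channel
SELECT CHANNEL_NAME, HEARTBEAT_INTERVAL
  FROM performance_schema.replication_connection_configuration;
```

If the heartbeat interval reported here is larger than slave_net_timeout, an idle master would be expected to trigger the reconnects described above.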

Suggested fix:
The slave should be able to "ping" the server before it decides to treat a connection as dead just because no data is being sent. An alternative might be a keep-alive packet sent by the master to prevent a connection timeout.
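
For reference, replication already offers a keep-alive of this kind: the master sends heartbeat events over an idle connection at the interval given by MASTER_HEARTBEAT_PERIOD. The following is a sketch only, not a fix confirmed in this report; the value 30 is an illustrative assumption chosen to be below the 60-second slave_net_timeout.

```sql
-- On the slave: stop replication threads before changing connection options
STOP SLAVE;

-- Ask the master for a heartbeat every 30 seconds of inactivity
-- (30 is an assumed example value, shorter than slave_net_timeout)
CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD = 30;

START SLAVE;

-- Verify that heartbeats arrive while the master is idle
SELECT CHANNEL_NAME, COUNT_RECEIVED_HEARTBEATS, LAST_HEARTBEAT_TIMESTAMP
  FROM performance_schema.replication_connection_status;
```

If COUNT_RECEIVED_HEARTBEATS increases while no writes occur on the master, the connection should no longer hit the slave's read timeout.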
[23 Jan 2017 12:19] MySQL Verification Team
Hello Oliver Welter,

Thank you for the report.
You haven't posted the exact warning that you are seeing, but based on the behavior in previous issues I assume it is a "Note", i.e. "[Note] While initializing dump thread for slave with UUID <xxxxx-xxxx-xxx-xxxx-xxxxxx>, found a zombie dump thread with the same UUID. Master is killing the zombie dump thread(<number>)". Since the fix for Bug #72578, a Note is written to the master's error log file when log_warnings is greater than 1.

Please see the detailed explanation by Sven Sandberg in Bug #84358; he explains the cases in which this "Note" is observed and why this is expected behavior.

Thanks,
Umesh