MySQL Bugs: #11256: Slave takes one hour to reconnect to master after network outage

Bug #11256	Slave takes one hour to reconnect to master after network outage
Submitted:	10 Jun 2005 23:40	Modified:	12 Jul 2006 21:14
Reporter:	Aaron Eddy
Status:	Not a Bug
Category:	MySQL Server: Replication	Severity:	S3 (Non-critical)
Version:	4.0.23-standard-log	OS:	Linux (RHES release 3 (Taroon Update 4))
Assigned to:		CPU Architecture:	Any

Description:
We have a slave running under mysqld_multi.  After losing the network connection to the master (even for a short time), the slave takes 1 hour to reconnect.  There are no errors or warnings in the error log.

How to repeat:
Run the slave under mysqld_multi, using the default --master-connect-retry option (Connect_retry=60).  Temporarily take down the slave's network connection with:

/sbin/ifconfig eth0 down

Wait 5 minutes (although the length of time doesn't seem to matter), and bring the network interface back up.  Update the database on the master.  The changes won't be replicated to the slave until exactly 1 hour after the beginning of the network outage.

Setting the Connect_retry to a lower number with:

CHANGE MASTER TO MASTER_CONNECT_RETRY = 5;

has no effect; it still takes one hour.

Thank you for taking the time to write to us, but this is not a bug.

As described here: http://dev.mysql.com/doc/refman/4.1/en/replication-options.html the slave makes its first reconnection attempt only once slave-net-timeout seconds have passed without receiving data from the master. Only after that does it try to reconnect every master-connect-retry seconds. The default value for slave-net-timeout is 3600 seconds, which accounts for the one-hour delay.
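
For illustration only, a slave that should notice a dropped connection sooner could lower slave-net-timeout in its option file. This is a sketch under assumptions: the group name [mysqld2] is a hypothetical mysqld_multi instance, and 60 is an arbitrary example value, not a recommendation.

# my.cnf on the slave (hypothetical mysqld_multi group)
[mysqld2]
# seconds to wait for data from the master before treating the connection as broken
slave-net-timeout = 60
# seconds between reconnection attempts after that
master-connect-retry = 5

After restarting that instance, the value in effect can be checked with:

SHOW VARIABLES LIKE 'slave_net_timeout';

With these example values, the slave should start trying to reconnect roughly a minute after the outage begins rather than an hour later. A very low timeout may cause needless reconnects when the master is simply idle.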