Bug #20435 Relay logs are rotated at slave_net_timeout when there's no activity
Submitted: 13 Jun 2006 16:47
Modified: 28 Nov 2007 20:22
Reporter: Matthew Lord
Status: Closed
Category: MySQL Server: Replication
Severity: S3 (Non-critical)
Version: 5.0.22
OS: Linux (linux (x86), Solaris 9 (sparcv9))
Assigned to: Andrei Elkin
CPU Architecture: Any
Tags: bfsm_2007_01_18

[13 Jun 2006 16:47] Matthew Lord
Description:
Relay logs are rotated every slave_net_timeout seconds if there are no statements being
replicated.

How to repeat:
I set up a master and a slave on the same machine and set slave_net_timeout to 30 seconds.
Don't replicate any statements, and watch the relay log rotate every 30 seconds.

Suggested fix:
We should simply reconnect to the master without rotating the relay log whenever we have a timeout.
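
A toy model of the behaviour reported above, not MySQL code. It assumes the I/O thread reconnects once per full slave_net_timeout interval of idleness and that every reconnect rotates the relay log; the function name and the rotate_on_reconnect flag are made up for illustration.

```python
def count_relay_log_rotations(idle_seconds, slave_net_timeout, rotate_on_reconnect=True):
    """Count relay log rotations during an idle period (toy model).

    Assumption: the slave reconnects once for every full slave_net_timeout
    interval without events, and (before the fix) every reconnect rotates.
    """
    if slave_net_timeout <= 0:
        raise ValueError("slave_net_timeout must be positive")
    reconnects = int(idle_seconds // slave_net_timeout)
    return reconnects if rotate_on_reconnect else 0


# Reported behaviour: 30-second timeout, two idle minutes.
print(count_relay_log_rotations(120, 30))  # prints 4
# Suggested fix: reconnect without rotating.
print(count_relay_log_rotations(120, 30, rotate_on_reconnect=False))  # prints 0
```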
[12 Dec 2006 18:58] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/16851

ChangeSet@1.2347, 2006-12-12 20:58:02+02:00, aelkin@dsl-hkibras-fe30f900-107.dhcp.inet.fi +1 -0
  Bug #20435 Relay logs are rotated at slave_net_timeout when there's no activity
  
  Rotate events were generated locally on the slave after it reconnected to the master
  once slave_net_timeout had expired with no events arriving from the master.
  That is how failure detection for the master was originally implemented.
  
  Leaving aside the failure detection algorithm (the first patch tries to solve the
  rotation problem from that perspective), we refine the behaviour on the slave's
  side so that it does not rotate relay log files when the master's rotate event
  carries a position that does not differ from the slave's view. Relay logs don't rotate while
  the master position in rotate events is stable.
[13 Dec 2006 14:22] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/16882

ChangeSet@1.2347, 2006-12-13 16:21:14+02:00, aelkin@dsl-hkibras-fe30f900-107.dhcp.inet.fi +1 -0
  Bug #20435 Relay logs are rotated at slave_net_timeout when there's no activity
  
  Rotate events were generated locally on the slave after it reconnected to the master
  once slave_net_timeout had expired with no events arriving from the master.
  That is how failure detection for the master was originally implemented.
  
  Leaving aside the failure detection algorithm (the first patch tries to solve the
  rotation problem from that perspective), we refine the behaviour on the slave's
  side so that it does not rotate relay log files when the master has not rotated itself, that is, when the
  rotate event brings the same binlog position the slave already knows.
  
  This remains valid even if the master was stopped and downgraded. After
  reconnecting, the slave would first receive the rotate, FD, and other events of the last
  binlog it was interrupted while receiving from, and only after that the rotate and FD of
  the new binlog in the downgraded format version.
  
  If the slave reconnects to a master that has been online all the time and gets, with the rotate event,
  the same position it already knows, then the rotate event is discarded: relay log files remain
  untouched, and the event is not written to the current log.
  
  The latter applies to reconnecting after slave_net_timeout, which fixes the bug.
[15 Jan 2007 17:19] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18134

ChangeSet@1.2347, 2007-01-15 19:18:48+02:00, aelkin@dsl-hkibras-fe36f900-97.dhcp.inet.fi +3 -0
  Bug #20435 Relay logs are rotated at slave_net_timeout when there's no activity
  
  When the slave receives no events from its master for longer than slave_net_timeout,
  its replication I/O thread disconnects and reconnects. Reconnecting causes rotation
  of the relay log. That is unnecessary work, and it is also inconvenient because
  the relay log file names change: old files are removed and new ones are created.
  
  The behaviour of the slave is refined so that it does not rotate relay log files when the master has not
  rotated itself. Locally generated slave rotate events, or master events
  that bring the same binlog position the slave already knows, are ignorable.
  This remains valid even if the master was stopped and downgraded. After
  reconnecting to a downgraded master, the slave would first receive the rotate, FD, and
  other events of the last binlog it was interrupted while receiving from, and only
  after that the rotate and FD of the new binlog in the downgraded format version.
  If the slave reconnects to a master that has been online all the time and gets, with the rotate event,
  the same position it already knows, then the rotate event is discarded: relay log files remain
  untouched, and the event is not written to the current log.
  
  The latter applies to reconnecting after slave_net_timeout, which fixes
  the bug.
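
The discard rule described in this commit message can be sketched roughly as follows. This is an illustration only, not the server code: BinlogPosition and should_rotate_relay_log are hypothetical names, and the file names and offsets are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinlogPosition:
    """Master binlog coordinates as the slave sees them (hypothetical type)."""
    file_name: str
    offset: int


def should_rotate_relay_log(known, rotate_event_target):
    """Return True only when a rotate event moves the master coordinates.

    A rotate event that repeats the coordinates the slave already knows
    (as happens on reconnect to a master that stayed online) is treated
    as ignorable: no relay log rotation, nothing written to the log.
    """
    return rotate_event_target != known


known = BinlogPosition("master-bin.000003", 4711)
# Reconnect after slave_net_timeout: the master reports the same coordinates.
print(should_rotate_relay_log(known, BinlogPosition("master-bin.000003", 4711)))  # False
# The master really rotated to a new binlog file.
print(should_rotate_relay_log(known, BinlogPosition("master-bin.000004", 4)))  # True
```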
[24 Jan 2007 17:05] Jonathan Miller
Hi,

I have some testing to do for Rafal today. As soon as completed, I will start testing on this for you. What tree is this currently committed (pushed) to so that I can pull?

Best wishes,
/Jeb
[24 Jan 2007 20:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18750

ChangeSet@1.2347, 2007-01-24 22:55:09+02:00, aelkin@dsl-hkibras-fe36f900-97.dhcp.inet.fi +4 -0
  Bug #20435 Relay logs are rotated at slave_net_timeout when there's no activity
  
  When the slave receives no events from its master for longer than slave_net_timeout,
  its replication I/O thread disconnects and reconnects. Reconnecting causes rotation
  of the relay log. That is unnecessary work, and it is also inconvenient because
  the relay log file names change: old files are removed and new ones are created.
  
  The behaviour of the slave is refined so that it does not rotate relay log files when the master has not
  rotated itself. Locally generated slave rotate events, or master events
  that bring the same binlog position the slave already knows, are ignorable.
  Also ignorable are FD events that carry the artificial mark, with the exception of
  the very first FD event, which the slave's I/O thread has to accept even if it is artificial.
  This remains valid even if the master was stopped and downgraded. After
  reconnecting to a downgraded master, the slave would first receive the rotate, FD, and
  other events of the last binlog it was interrupted while receiving from, and only
  after that the rotate and FD of the new binlog in the downgraded format version.
  If the slave reconnects to a master that has been online all the time and gets, with the rotate event,
  the same position it already knows, then the rotate event is discarded: relay log files remain
  untouched, and the event is not written to the current log.
  
  The latter applies to reconnecting after slave_net_timeout, which fixes
  the bug.
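
A rough sketch of the FD rule added in this revision, assuming "very first" means the first FD event seen by one run of the I/O thread. FormatDescriptionFilter is a hypothetical helper, not a class from the server source.

```python
class FormatDescriptionFilter:
    """Decide whether the I/O thread accepts a format description (FD) event.

    Hypothetical helper: artificial FD events are ignorable, except that the
    very first FD event seen must be accepted even when it is artificial.
    """

    def __init__(self):
        self._seen_first_fd = False

    def accept(self, is_artificial):
        if not self._seen_first_fd:
            self._seen_first_fd = True
            return True  # the first FD event is always accepted
        return not is_artificial  # later artificial FD events are skipped


fd_filter = FormatDescriptionFilter()
# First FD is artificial (accepted), second is artificial (ignored), third is real.
print([fd_filter.accept(flag) for flag in (True, True, False)])  # [True, False, True]
```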
[25 Jan 2007 12:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18768

ChangeSet@1.2347, 2007-01-25 14:03:08+02:00, aelkin@dsl-hkibras-fe36f900-97.dhcp.inet.fi +6 -0
  Bug #20435 Relay logs are rotated at slave_net_timeout when there's no activity
  
  When the slave receives no events from its master for longer than slave_net_timeout,
  its replication I/O thread disconnects and reconnects. Reconnecting causes rotation
  of the relay log. That is unnecessary work, and it is also inconvenient because
  the relay log file names change: old files are removed and new ones are created.
  
  The behaviour of the slave is refined so that it does not rotate relay log files when the master has not
  rotated itself. Locally generated slave rotate events, or master events
  that bring the same binlog position the slave already knows, are ignorable.
  Also ignorable are FD events that carry the artificial mark, with the exception of
  the very first FD event, which the slave's I/O thread has to accept even if it is artificial.
  This remains valid even if the master was stopped and downgraded. After
  reconnecting to a downgraded master, the slave would first receive the rotate, FD, and
  other events of the last binlog it was interrupted while receiving from, and only
  after that the rotate and FD of the new binlog in the downgraded format version.
  If the slave reconnects to a master that has been online all the time and gets, with the rotate event,
  the same position it already knows, then the rotate event is discarded: relay log files remain
  untouched, and the event is not written to the current log.
  
  The latter applies to reconnecting after slave_net_timeout, which fixes
  the bug.
  
  disconnect_slave_event_counter appears to be affected, since the corresponding running
  counter was adjusted at process_io_rotate.
  No such issue is noticed for abort_slave_event_counter.
[11 Sep 2007 16:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/34051

ChangeSet@1.2570, 2007-09-11 19:29:27+03:00, aelkin@koti.dsl.inet.fi +15 -0
  WL#342 heartbeat and associated bug#20435 Relay logs are rotated at slave_net_timeout when there's no activity.
  
  The second changeset, to be applied on top of the first.
  
  Augmenting the slave's handling of the event by rejecting any mismatch of binlog names, except when the slave has
  not yet assigned its version in master_info.
  Aggregating file name and position into a new coordinate struct.
  Using an instance of the struct as the transport from the read-event routine to the send routine.
  Reading of the file name is made even safer, as it is protected by LOCK_open.
  Status of received events on the slave side is added.
[1 Oct 2007 19:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/34734

ChangeSet@1.2578, 2007-10-01 22:19:52+03:00, aelkin@koti.dsl.inet.fi +82 -0
  Bug#20435 Relay logs are rotated at slave_net_timeout when there's no activity; WL#342 heartbeat
  
  When the slave receives no events from its master for longer than slave_net_timeout,
  its replication I/O thread disconnects and reconnects. Reconnecting causes rotation
  of the relay log. That is unnecessary work, and it is also inconvenient because
  the relay log file names change: old files are removed and new ones are created.
  
  Fixed by introducing new functionality, the Heartbeat event.
  A Heartbeat event is generated by the dump thread while the master is idle.
  The frequency of sending the event is optional and is determined by the slave.
  The optimal value is within the [0.001, slave_net_timeout] interval.
  
  The project introduces the master_heartbeat_period option for the CHANGE MASTER SQL statement.
  The value requested on the slave side for the period is passed to the dump thread on
  the master side.
  The dump thread sends a heartbeat replication event if there are no
  more unsent events in the current binlog file for a period longer than
  master_heartbeat_period.
  Whenever the master's binlog is updated with an event, the wait
  for the heartbeat sending condition is reset.
  
  Heartbeating is requested implicitly with a period of slave_net_timeout/2
  when no master_heartbeat_period option was provided. If the option's value is explicitly set to zero,
  there will be no heartbeats.
  
  Two status variables on the slave side allow monitoring of the heartbeat flow.
  
  The test checks the syntax for the new option; the valid range - errors and warnings on reasonable values;
  the fact that there is no relay log rotation (and thus no reconnection) while more than slave_net_timeout
  seconds have elapsed and the master has been idling; and the new status variables.
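
How the period is resolved, as far as the commit message above describes it, might look like the sketch below. resolve_heartbeat_period is a hypothetical name; returning a note for values outside the stated optimal interval is this sketch's choice, since the message only says the test covers errors and warnings without spelling out which values trigger them.

```python
def resolve_heartbeat_period(slave_net_timeout, requested=None):
    """Return (period_in_seconds_or_None, notes) for a slave's heartbeat request.

    Rules taken from the commit message:
      * no explicit value -> slave_net_timeout / 2
      * explicit zero     -> heartbeats are switched off (period is None)
      * the optimal value lies within [0.001, slave_net_timeout]
    """
    if requested is None:
        return slave_net_timeout / 2, []
    if requested < 0:
        raise ValueError("heartbeat period cannot be negative")
    if requested == 0:
        return None, []
    notes = []
    if not 0.001 <= requested <= slave_net_timeout:
        # Assumption: out-of-interval values are only flagged, not rejected.
        notes.append("period is outside [0.001, slave_net_timeout]")
    return requested, notes


print(resolve_heartbeat_period(30))     # implicit default
print(resolve_heartbeat_period(30, 0))  # heartbeats disabled
print(resolve_heartbeat_period(30, 60)) # flagged as outside the interval
```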
[2 Oct 2007 7:51] Andrei Elkin
One more client for this project is Bug #11820.
[9 Oct 2007 19:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35226

ChangeSet@1.2653, 2007-10-09 22:22:46+03:00, aelkin@koti.dsl.inet.fi +79 -0
  Bug#20435 Relay logs are rotated at slave_net_timeout when there's no activity; WL#342 heartbeat
  
  When the slave receives no events from its master for longer than
  slave_net_timeout, its replication I/O thread disconnects and
  reconnects. Reconnecting causes rotation of the relay log. That
  is unnecessary work, and it is also inconvenient because the
  relay log file names change: old files are removed and new ones
  are created.
  
  Fixed by introducing new functionality, the Heartbeat event.
  A Heartbeat event is generated by the dump thread while the master is idle.
  The frequency of sending the event is optional and is determined by the slave.
  The optimal value is within the [0.001, slave_net_timeout] interval.
  
  The project introduces the master_heartbeat_period option for the
  CHANGE MASTER SQL statement.  The value requested on the slave side
  for the period is passed to the dump thread on the master side.  The
  dump thread sends a heartbeat replication event if there are no more
  unsent events in the current binlog file for a period longer than
  master_heartbeat_period.  Whenever the master's binlog is updated
  with an event, the wait for the heartbeat sending condition is
  reset.
  
  Heartbeating is requested implicitly with a period of
  slave_net_timeout/2 when no master_heartbeat_period option was
  provided. If the option's value is explicitly set to zero, there will
  be no heartbeats.
  
  Two status variables on the slave side allow monitoring of the
  heartbeat flow.
  
  The test checks the syntax for the new option; the valid range - errors and warnings on reasonable values;
  the fact that there is no relay log rotation (and thus no reconnection) while more than slave_net_timeout
  seconds have elapsed and the master has been idling; and the new status variables.
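
The dump thread's waiting rule can be illustrated with a small timeline simulation. This is a toy model under stated assumptions, not the server code: times are seconds on an arbitrary clock starting at 0, heartbeat_times is a made-up name, and a heartbeat is taken to be sent exactly when a full period has passed with no binlog event.

```python
def heartbeat_times(binlog_event_times, period, end_time):
    """Return the times at which an idle dump thread would send heartbeats.

    Toy model: the wait restarts at every binlog event; whenever `period`
    seconds pass without a new event, one heartbeat is sent and the wait
    starts over.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    sent = []
    wait_start = 0
    for event_time in sorted(binlog_event_times):
        if event_time > end_time:
            break
        beat = wait_start + period
        while beat < event_time:  # idle long enough before this event
            sent.append(beat)
            beat += period
        wait_start = event_time  # a binlog update resets the wait
    beat = wait_start + period
    while beat <= end_time:  # idle tail after the last event
        sent.append(beat)
        beat += period
    return sent


# slave_net_timeout = 30, so the implicit period is 15 seconds.
print(heartbeat_times([20, 25], 15, 60))  # [15, 40, 55]
```

In this run the slave never goes 30 seconds without traffic, so it has no reason to reconnect.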
[9 Oct 2007 19:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35227

ChangeSet@1.2653, 2007-10-09 22:26:01+03:00, aelkin@koti.dsl.inet.fi +79 -0
  Bug#20435 Relay logs are rotated at slave_net_timeout when there's no activity; WL#342 heartbeat
  
[10 Oct 2007 8:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35264

ChangeSet@1.2653, 2007-10-10 11:13:36+03:00, aelkin@koti.dsl.inet.fi +79 -0
  Bug#20435 Relay logs are rotated at slave_net_timeout when there's no activity; WL#342 heartbeat
  
[10 Oct 2007 12:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35277

ChangeSet@1.2653, 2007-10-10 15:13:15+03:00, aelkin@koti.dsl.inet.fi +74 -0
  Bug#20435 Relay logs are rotated at slave_net_timeout when there's no activity; WL#342 heartbeat
  
  Lots of affected tests or results are due to FD size change.
[11 Oct 2007 14:32] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35365

ChangeSet@1.2653, 2007-10-11 17:31:41+03:00, aelkin@koti.dsl.inet.fi +77 -0
  Bug#20435 Relay logs are rotated at slave_net_timeout when there's no activity; WL#342 heartbeat
  
  When the slave receives no events from its master for longer than slave_net_timeout,
  its replication I/O thread disconnects and reconnects. Reconnecting
  causes rotation of the relay log. That is unnecessary work, and it is
  also inconvenient because the relay log file names change:
  old files are removed and new ones are created.
  
  Fixed by introducing new functionality, the Heartbeat event.
  A Heartbeat event is generated by the dump thread while the master is idle.
  The frequency of sending the event is optional and is determined by the slave.
  The optimal value is within the [0.001, slave_net_timeout] interval.
  
  The project introduces the master_heartbeat_period option for the
  CHANGE MASTER SQL statement.  The value requested on the slave side
  for the period is passed to the dump thread on the master side.  The
  dump thread sends a heartbeat replication event if there are no more
  unsent events in the current binlog file for a period longer than
  master_heartbeat_period.  Whenever the master's binlog is updated
  with an event, the wait for the heartbeat sending condition is
  reset.
  
  Heartbeating is requested implicitly with a period of
  slave_net_timeout/2 when no master_heartbeat_period option was
  provided. If the option's value is explicitly set to zero, there will
  be no heartbeats.
  Updating slave_net_timeout now generates a warning if the
  new value is less than the current heartbeat period.
  
  Two status variables on the slave side allow monitoring of the
  heartbeat flow.
  
  The test checks the syntax for the new option; the valid range -
  errors and warnings on reasonable values; the fact that there is no
  relay log rotation (and thus no reconnection) while more than
  slave_net_timeout seconds have elapsed and the master has been idling;
  and the new status variables.
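
The slave_net_timeout check added in this revision can be pictured like this. It is a sketch only: set_slave_net_timeout is a hypothetical helper, the warning text is invented, and heartbeat_period is None when heartbeats are switched off.

```python
def set_slave_net_timeout(new_timeout, heartbeat_period):
    """Return (new_timeout, warning_or_None) for an updated slave_net_timeout.

    The new value is still applied; a warning is produced when it is less
    than the current heartbeat period, because the slave could then time
    out between two heartbeats.
    """
    warning = None
    if heartbeat_period is not None and new_timeout < heartbeat_period:
        warning = (
            f"slave_net_timeout ({new_timeout}) is less than the "
            f"current heartbeat period ({heartbeat_period})"
        )
    return new_timeout, warning


print(set_slave_net_timeout(10, 15))  # value applied, warning attached
print(set_slave_net_timeout(60, 15))  # value applied, no warning
```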
[11 Oct 2007 18:51] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35401

ChangeSet@1.2653, 2007-10-11 21:51:26+03:00, aelkin@koti.dsl.inet.fi +78 -0
  Bug#20435 Relay logs are rotated at slave_net_timeout when there's no activity; WL#342 heartbeat
  
[12 Oct 2007 19:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35492

ChangeSet@1.2653, 2007-10-12 22:26:09+03:00, aelkin@koti.dsl.inet.fi +79 -0
  Bug#20435 Relay logs are rotated at slave_net_timeout when there's no activity; WL#342 heartbeat
  
[12 Oct 2007 23:03] James Day
DOCS note: we've had people suffering from their internet links going down due to timeouts when replication is idle (think of a dialup or VPN hanging up), so this option should also be described as a solution for network connection timeout issues.
[13 Oct 2007 10:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35508

ChangeSet@1.2653, 2007-10-13 13:46:59+03:00, aelkin@koti.dsl.inet.fi +79 -0
  Bug#20435 Relay logs are rotated at slave_net_timeout when there's no activity; WL#342 heartbeat
  
[13 Oct 2007 20:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35515

ChangeSet@1.2605, 2007-10-13 22:41:12+02:00, aelkin@dl145h.mysql.com +79 -0
  Bug#20435 Relay logs are rotated at slave_net_timeout when there's no activity; WL#342 heartbeat
  
[15 Oct 2007 9:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35549

ChangeSet@1.2606, 2007-10-15 12:38:21+03:00, aelkin@dsl-hkibras1-ff5cc300-91.dhcp.inet.fi +4 -0
  bug#20435 wl#342 heartbeat
  
  addressing windows' failure and warnings.
[15 Oct 2007 12:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35558

ChangeSet@1.2655, 2007-10-15 14:59:27+02:00, aelkin@dl145k.mysql.com +1 -0
  bug#20435 wl#342 heartbeat
  
  correction for a warning on win
[15 Oct 2007 13:04] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35560

ChangeSet@1.2607, 2007-10-15 16:03:47+03:00, aelkin@dsl-hkibras1-ff5cc300-91.dhcp.inet.fi +1 -0
  bug#20435 wl#342 heartbeat
  
  correcting casting that caused a warn on win.
[15 Oct 2007 17:37] Jon Stephens
Documented bugfix in mysql-5.1.22-ndb-6.3.4 changelog. Left in PP status.
[17 Oct 2007 10:17] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35741

ChangeSet@1.2608, 2007-10-17 13:16:57+03:00, aelkin@dsl-hkibras1-ff5fc300-23.dhcp.inet.fi +2 -0
  bug#20435 wl#342 heartbeat
  
  expanding the possible error message from an attempt to set up the period
  on the master;
  cosmetic change in setting the default value for heartbeat;
[21 Nov 2007 18:10] Lars Thalmann
Except for the telco releases, this will be pushed into 6.0.
[27 Nov 2007 10:48] Bugs System
Pushed into 5.0.54
[27 Nov 2007 10:49] Bugs System
Pushed into 5.1.23-rc
[27 Nov 2007 10:51] Bugs System
Pushed into 6.0.4-alpha
[28 Nov 2007 20:22] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented for 6.0.4, no other mainline releases per comment from Lars.
[15 Feb 2008 13:41] Bugs System
Pushed into 6.0.5-alpha
[14 Mar 2008 7:34] Andrei Elkin
Indeed, the hb code was *not* pushed to 5.0 or 5.1 but only to the 6.0 tree and, earlier, to a telco clone.
The misleading entries in the log are caused by an incorrect commit-mail parsing program (not the post-commit trigger but rather a pb component).
[14 Mar 2008 9:33] Jon Stephens
Discussed with Andrei, verified that fix does not appear in 5.0 or 5.1 main, and is not included in any changelogs other than 5.1.22-ndb-6.3.4 and 6.0.4.

No further action required, no change in bug status.
[10 Nov 2010 3:23] Shane Bester
see bug #58103 if you still see this problem.
[10 Nov 2010 10:53] Luis Soares
BUG#58103 was marked as duplicate of this.