Bug #24192 MySQL replication does not exit server when running out of memory
Submitted: 10 Nov 2006 14:46
Modified: 27 Jul 2007 4:21
Reporter: SINISA MILIVOJEVIC
Status: Closed
Category: MySQL Server: Replication
Severity: S3 (Non-critical)
Version: 4.1, 5.0, 5.1
OS: Any (all)
Assigned to: Alexey Kopytov
CPU Architecture: Any
Tags: bfsm_2006_12_07

[10 Nov 2006 14:46] SINISA MILIVOJEVIC
Description:
When reading events from the master, an out-of-memory condition may occur. In such a situation, where further operation is impossible, replication does not even stop. It should, however, stop and shut down the entire MySQL server, as most of the rest of the code does, because this is a very critical error.

Instead, operation continues. When the OOM condition is coupled with a network I/O error, this can lead to a miscalculation of the next binary log position on the master.

How to repeat:
All logs are in the issue.

Suggested fix:
Check for the condition and exit the server.
[15 Jun 2007 12:15] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/28864

ChangeSet@1.2470, 2007-06-15 16:15:15+04:00, kaa@polly.local +1 -0
  Fix for bug #24192 "MySQL replication does not exit server when running out of memory"
  
  In case of out-of-memory error received from the master, print the corresponding message to the error log and stop slave I/O thread to avoid reconnecting with a wrong binary log position.
[22 Jun 2007 10:06] Alexey Kopytov
An update for future reference: this bug is triggered neither by an out-of-memory condition on the slave nor by an I/O error. The "server_errno=5", which was incorrectly interpreted as an I/O error, was actually EE_OUTOFMEMORY returned by my_malloc() on the master.

The real cause is an OOM on the master, which makes the slave I/O thread reconnect with a wrong master binary log position.
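
To illustrate the intended behaviour, here is a minimal toy model in Python, not the actual server code. The value 5 for EE_OUTOFMEMORY comes from the comment above; the names MasterError, IoThread, handle and run are hypothetical, and the position bookkeeping is simplified to "add the event length". It only shows the decision described in the patch: on an out-of-memory error reported by the master, write to the error log and stop the I/O thread instead of reconnecting.

```python
from dataclasses import dataclass, field

# Value taken from the comment above: server_errno=5 was EE_OUTOFMEMORY.
EE_OUTOFMEMORY = 5


@dataclass
class MasterError:
    """Hypothetical stand-in for an error reported by the master."""
    server_errno: int
    message: str


@dataclass
class IoThread:
    """Toy model of the slave I/O thread; not the server's real structure."""
    log_pos: int = 0
    running: bool = True
    reconnects: int = 0
    error_log: list = field(default_factory=list)

    def handle(self, item):
        """Process one item read from the master connection."""
        if isinstance(item, MasterError):
            if item.server_errno == EE_OUTOFMEMORY:
                # The master ran out of memory: report it and stop instead
                # of reconnecting with a position that may be wrong.
                self.error_log.append(
                    f"master out of memory at pos {self.log_pos}: {item.message}"
                )
                self.running = False
            else:
                # Any other error: keep the last known position and retry.
                self.reconnects += 1
            return
        # A normal event (modelled as its length): advance the position.
        self.log_pos += item


def run(io_thread, stream):
    """Feed items to the I/O thread until the stream ends or it stops."""
    for item in stream:
        if not io_thread.running:
            break
        io_thread.handle(item)
    return io_thread


if __name__ == "__main__":
    t = run(IoThread(), [100, 50, MasterError(EE_OUTOFMEMORY, "my_malloc failed"), 70])
    print(t.running, t.log_pos, t.error_log)
```

In this model the event of length 70 that follows the error is never read, so the stored position stays at the last event that was fully received.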
[22 Jun 2007 11:08] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29377

ChangeSet@1.2470, 2007-06-22 14:08:22+04:00, kaa@polly.local +1 -0
  Fix for bug #24192 "MySQL replication does not exit server when running out of memory"
  
  In case of out-of-memory error received from the master, print the corresponding message to the error log and stop slave I/O thread to avoid reconnecting with a wrong binary log position.
[11 Jul 2007 14:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30701

ChangeSet@1.2470, 2007-07-11 18:38:45+04:00, kaa@polly.local +1 -0
  Fix for bug #24192 "MySQL replication does not exit server when running out of memory"
  
  In case of out-of-memory error received from the master, print the corresponding message to the error log and stop slave I/O thread to avoid reconnecting with a wrong binary log position.
[11 Jul 2007 14:39] Andrei Elkin
Emulating the OOM condition turned out not to be an easy task. I spoke to Alexey, who explained that it is very hard to achieve deterministic behaviour on the slave even when the master's side runs out of memory.
With that in mind, the current patch is approved.
[19 Jul 2007 15:48] Bugs System
Pushed into 5.1.21-beta
[19 Jul 2007 15:49] Bugs System
Pushed into 5.0.48
[27 Jul 2007 4:21] Paul DuBois
Noted in 5.0.48, 5.1.21 changelogs.

Slave servers could incorrectly interpret an out-of-memory error from
the master and reconnect using the wrong binary log position.