Bug #39325 Server crash inside MYSQL_LOG::purge_first_log halts replicaiton
Submitted: 8 Sep 2008 18:40 Modified: 28 Jan 2009 12:41
Reporter: Justin Tolmer Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.0,5.1 OS:Any
Assigned to: Luis Soares CPU Architecture:Any
Tags: purge, relay, replication

[8 Sep 2008 18:40] Justin Tolmer
Description:
This is similar to bug #38826, but in a different function.

In MYSQL_LOG::purge_first_log the order of operations is to first call MYSQL_LOG::purge_log(), which deletes the relay log and then updates the relay index, and then to call flush_relay_log_info(), If the slave dies before the call to flush_relay_log_info(), on restart of the server, replication stops with an error:

080908 11:29:30 [ERROR] Failed to open the relay log '/export/hda3/tmp/test/slave0/log/slave-relay-bin.000004' (relay_log_pos 537)
080908 11:29:30 [ERROR] Could not find target log during relay log initialization
080908 11:29:30 [ERROR] Failed to initialize the master info structure
080908 11:29:30 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.66a-DEBUGGING-debug-log'  socket: '/export/hda3/tmp/test/slave0/db.sock'  port: 11307  Google MySQL 5.0.37+STATS TESTING build
5086.debug
080908 11:29:34 [ERROR] Failed to open the relay log '/export/hda3/tmp/test/slave0/log/slave-relay-bin.000004' (relay_log_pos 537)
080908 11:29:34 [ERROR] Could not find target log during relay log initialization

How to repeat:
Kill the slave when it is in MYSQL_LOG::purge_first_log(). In order to consistently repeat the error, I used:

  DBUG_EXECUTE_IF("crash_before_flush_relay_log_info", abort(););

just before the call to flush_relay_log_info().

Suggested fix:
Update relay-log.info prior to calling MYSQL_LOG::purge_logs().
[10 Sep 2008 20:12] Sveta Smirnova
Thank you for the report.

What is the difference between problem you describe and bug #38826? I mean difference because which you opened new report?
[10 Sep 2008 22:00] Justin Tolmer
This bug is in purge_first_log() and deals with the fact that relay-log.info isn't correct after the crash recovery. Bug #38826 is in purge_logs() and deals with the fact that the index file isn't correct after the crash recovery.
[11 Sep 2008 9:21] Sveta Smirnova
Thank you for the report.

Verified as described:

modify sql/log.cc, make some replication test with correct slave options (--debug="+d,crash_before_flush_relay_log_info"), examine content of var/log directory,  slave-relay-bin.index and relay-log.info files.
[29 Sep 2008 20:49] Justin Tolmer
Patch with a fix for the issue

Attachment: 39325.diff (text/plain), 3.91 KiB.

[7 Oct 2008 10:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/55559

2670 Luís Soares	2008-10-07
      BUG#39325: In MYSQL_LOG::purge_first_log the order of operations is to first call
      MYSQL_LOG::purge_log(), which deletes the relay log and then updates the relay index, and
      then to call flush_relay_log_info(), If the slave dies before the call to
      flush_relay_log_info(), on restart of the server, replication stops with an error.
      
      This commits the test case to reproduce the deffective behavior.
[28 Oct 2008 15:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/57226

2684 Luís Soares	2008-10-28
      In purge_first_log: flushes first the relay-log.info and only then purges the logs. Fix BUG#39325.
[4 Nov 2008 0:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/57733

2709 Luis Soares	2008-11-04
      This is a combined patch for: BUG#38826 and BUG#39325.
      It merges the previous two proposed patches into a single one.
[4 Nov 2008 0:17] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/57734

2690 Luis Soares	2008-11-04
      This is a combined patch for: BUG#38826 and BUG#39325.
      It merges the previous two proposed patches into a single one.
      
      Merge with 5.1-rpl team tree.
[4 Nov 2008 12:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/57781

2690 Luis Soares	2008-11-04
      This is a combined patch for: BUG#38826 and BUG#39325.
      It merges the previous two proposed patches into a single one.
            
      Merge with 5.1-rpl team tree.
      
      The difference from previous patch is that it calls 
      "ha_binlog_index_purge_file" before purging the relay log file.
[4 Dec 2008 1:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/60558

2731 Luis Soares	2008-12-04
      BUG#38826 Race in MYSQL_LOG::purge_logs is impossible to debug in production
      BUG#39325 Server crash inside MYSQL_LOG::purge_first_log halts replicaiton
      
      The patch reverses the order of the purging and updating events for log and relay-log.info/index files respectively.
      This solves the problem of having holes caused by crashes happening between updating info/index files and purging logs.
      
      NOTE: This is a combined patch for BUG#38826 and BUG#39325. This patch is based on bugteam tree and takes into account reviewers suggestions.
[4 Dec 2008 14:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/60608

2720 Luis Soares	2008-12-04 [merge]
      BUG#38826 Race in MYSQL_LOG::purge_logs is impossible to debug in production
      BUG#39325 Server crash inside MYSQL_LOG::purge_first_log halts replicaiton
      
      The patch reverses the order of the purging and updating events for log and
      relay-log.info/index files respectively.
      
      This solves the problem of having holes caused by crashes happening between updating
      info/index files and purging logs.
      
      This patch also contains an aditional test case for testing the crashing before purge logs.
            
      NOTE1: This is a combined patch for BUG#38826 and BUG#39325. This patch is based on
      bugteam tree and takes into account reviewers suggestions.
      
      NOTE2: Merge from 5.0-bugteam
[10 Dec 2008 10:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/61188

2792 Luis Soares	2008-12-10 [merge]
      BUG#38826 Race in MYSQL_LOG::purge_logs is impossible to debug in production
      BUG#39325 Server crash inside MYSQL_LOG::purge_first_log halts replicaiton
            
      The patch reverses the order of the purging and updating events for log and relay-log.info/index files respectively.
            
      This solves the problem of having holes caused by crashes happening between updating info/index files and purging logs.
            
      This patch also contains an aditional test case for testing the crashing before purge logs.
                  
      NOTE1: This is a combined patch for BUG#38826 and BUG#39325. This patch is based on bugteam tree and takes into account reviewers suggestions.
            
      NOTE2: Merge from 5.1-bugteam
[6 Jan 2009 13:56] Bugs System
Pushed into 5.0.76 (revid:joro@sun.com-20090105160414-8q9j4bi1klkfwiup) (version source revid:azundris@mysql.com-20081230114734-nmsc37ak330zlygn) (merge vers: 5.0.76) (pib:6)
[7 Jan 2009 17:28] Jon Stephens
Documented fix for this bug along with fix for Bug #38826 in 5.0.76 changelog.

Set bug status to Patch Pending awaiting push to 5.1 tree.
[15 Jan 2009 6:33] Bugs System
Pushed into 5.1.31 (revid:joro@sun.com-20090115053147-tx1oapthnzgvs1ro) (version source revid:azundris@mysql.com-20081230114838-cn52tu180wcrvh0h) (merge vers: 5.1.31) (pib:6)
[19 Jan 2009 11:30] Bugs System
Pushed into 5.1.31-ndb-6.2.17 (revid:tomas.ulin@sun.com-20090119095303-uwwvxiibtr38djii) (version source revid:tomas.ulin@sun.com-20090115073240-1wanl85vlvw2she1) (merge vers: 5.1.31-ndb-6.2.17) (pib:6)
[19 Jan 2009 13:08] Bugs System
Pushed into 5.1.31-ndb-6.3.21 (revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (version source revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (merge vers: 5.1.31-ndb-6.3.21) (pib:6)
[19 Jan 2009 16:13] Bugs System
Pushed into 5.1.31-ndb-6.4.1 (revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (version source revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (merge vers: 5.1.31-ndb-6.4.1) (pib:6)
[20 Jan 2009 18:59] Bugs System
Pushed into 6.0.10-alpha (revid:joro@sun.com-20090119171328-2hemf2ndc1dxl0et) (version source revid:azundris@mysql.com-20081230114916-c290n83z25wkt6e4) (merge vers: 6.0.9-alpha) (pib:6)
[28 Jan 2009 12:41] Jon Stephens
Fix also documented in the 5.1.31 and 6.0.10 changelogs. Closed.