Bug #28597 Replication doesn't start after upgrading to 5.1.18
Submitted: 22 May 2007 15:10
Modified: 28 Nov 2007 17:24
Reporter: Stefan Hinz
Status: Closed
Category: MySQL Server: Replication
Severity: S2 (Serious)
Version: 5.1.18
OS: Any
Assigned to: Andrei Elkin
Tags: logs_path

[22 May 2007 15:10] Stefan Hinz
Description:
After upgrading a MySQL master and its slave servers from 5.1.16 to 5.1.18, replication doesn't start again. I got that problem on Linux but it's likely to show up on other operating systems, too.

The slave servers' error logs contain something like this:

070516 18:00:53 [ERROR] Error reading packet from server: Could not find first log file name in binary log index file ( server_errno=1236)
070516 18:00:53 [ERROR] Got fatal error 1236: 'Could not find first log file name in binary log index file' from master when reading data from binary log
070516 18:00:53 [Note] Slave I/O thread exiting, read up to log 'iconnect2-bin.000083', position 138111
070516 18:11:13 [Note] Error reading relay log event: slave SQL thread was killed

The binlog index file ends like this:

...
./themis-bin.000074
./themis-bin.000075
/var/lib/mysql/themis-bin.000076
/var/lib/mysql/themis-bin.000077
/var/lib/mysql/themis-bin.000078
/var/lib/mysql/themis-bin.000079
/var/lib/mysql/themis-bin.000080
/var/lib/mysql/themis-bin.000081

Apparently, the new format for recording binlog files, using an absolute path rather than a relative path, is causing that problem.

Workaround: After manually changing the last line recorded by the "old" (5.1.16) server from:

./themis-bin.000075

to:

/var/lib/mysql/themis-bin.000075

and restarting the slave server, replication resumes with no issues.
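
The workaround above can be sketched in code. This is only an illustration, assuming a data directory of /var/lib/mysql as in the index excerpt; the function name absolutize_index_entries is hypothetical and not part of MySQL. Unlike the manual edit, which changed only the last relative line, the sketch rewrites every relative entry, and it works on a list of lines rather than touching a real index file.

```python
import posixpath


def absolutize_index_entries(lines, datadir):
    """Rewrite relative binlog index entries as absolute paths under datadir.

    Hypothetical helper for illustration only; entries that are already
    absolute are kept unchanged, and blank lines are dropped.
    """
    fixed = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue
        if not posixpath.isabs(entry):
            # "./themis-bin.000075" -> "/var/lib/mysql/themis-bin.000075"
            entry = posixpath.normpath(posixpath.join(datadir, entry))
        fixed.append(entry)
    return fixed


index_lines = [
    "./themis-bin.000074",
    "./themis-bin.000075",
    "/var/lib/mysql/themis-bin.000076",
]
fixed_lines = absolutize_index_entries(index_lines, "/var/lib/mysql")
print("\n".join(fixed_lines))
```

Anyone applying this to a real index file would presumably want to stop the server and keep a backup copy of the file first.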

How to repeat:
Upgrade a 5.1.16 slave server to 5.1.18 and see how replication doesn't restart after starting the slave server.

Suggested fix:
1) Switch back to the old recording format in the binlog index file.
2) Document the issue and instruct users about the workaround. (Mail sent to documentation team.)
[25 May 2007 16:00] Stefan Hinz
I was using the RPM files from http://dev.mysql.com/get/Downloads/MySQL-5.1/MySQL-server-5.1.18-0.glibc23.i386.rpm/from/p..., updating my 5.1.16 installation with "rpm -Uhv MySQL-server-....rpm".
[3 Jul 2007 8:00] Sveta Smirnova
Thank you for the report.

Please provide your master configuration file.
[3 Jul 2007 9:11] Sveta Smirnova
Thank you for the report.

Verified as described.

To repeat, do not specify a binary log basename in the configuration file: just specify log-bin on its own.
The standard my-small.cnf is a good configuration file for repeating this bug:

$ cat ../../mysql-5.1.16-beta-linux-i686-glibc23/data/host-bin.index
./host-bin.000001
/users/ssmirnova/mysql-5.1.16-beta-linux-i686-glibc23/data/host-bin.000002
/users/ssmirnova/mysql-5.1.16-beta-linux-i686-glibc23/data/host-bin.000003
/users/ssmirnova/mysql-5.1.16-beta-linux-i686-glibc23/data/host-bin.000004
[19 Oct 2007 12:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35912

ChangeSet@1.2578, 2007-10-19 15:02:10+03:00, aelkin@dsl-hkibras1-ff5fc300-23.dhcp.inet.fi +1 -0
  Bug #28597 Replication doesn't start after upgrading to 5.1.18
  
  Since Bug #20166, which changed binlog file name generation to favor pidfile_name instead of
  the previous glob_hostname, generated names have been written in absolute path format.

  Fixed by stripping off the directory part of a generated name, which also resolves a pending TODO.
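
As a rough illustration of what stripping the directory part of a generated name means, the following sketch derives a default binlog basename from a pid-file path. It is not the server's code: the function name default_binlog_basename, the "-bin" suffix handling, and the example pid-file path are assumptions made for the example.

```python
import posixpath


def default_binlog_basename(pidfile_name, suffix="-bin"):
    """Derive a binlog basename from a pid-file path (hypothetical helper).

    The directory part and the extension are dropped, so the result
    carries no path information of its own.
    """
    stem = posixpath.splitext(posixpath.basename(pidfile_name))[0]
    return stem + suffix


# The pid-file location below is an assumed example path.
basename = default_binlog_basename("/var/run/mysqld/themis.pid")
print(basename)
```

With the directory removed, whether the stored name ends up relative or absolute depends only on how the log option itself was given.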
[21 Oct 2007 14:33] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35992

ChangeSet@1.2543, 2007-10-21 17:32:39+03:00, aelkin@dsl-hkibras1-ff5fc300-23.dhcp.inet.fi +1 -0
  Bug #28597 Replication doesn't start after upgrading to 5.1.18
  
  Since Bug #20166, which changed binlog file name generation to be based on pidfile_name instead of
  the previous glob_hostname, binlog file names have been generated and stored in the binlog index
  solely in absolute path format,
  even when the --log-bin option specified a relative path.

  The current algorithm on the master is that when the master reads the index, it constructs the final path
  for accessing the file based on the --log-bin option value.
  So if the option specifies a relative path, any value read from the binlog index is
  treated as if it were in relative path format.

  The combination of this existing algorithm, which does not take into account the format of the name it reads from
  the binlog index file, and the earlier bug change led to the current bug.

  Fixed by reverting to glob_hostname, which means that the names are again written
  in relative path format, as before, if --log-bin specifies a relative path.
[24 Oct 2007 18:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/36298

ChangeSet@1.2543, 2007-10-24 21:38:47+03:00, aelkin@koti.dsl.inet.fi +1 -0
  Bug #28597 Replication doesn't start after upgrading to 5.1.18
  
  Since Bug #20166, which changed binlog file name generation to be based on pidfile_name instead of
  the previous glob_hostname, binlog file names have been generated and stored in the binlog index
  solely in absolute path format,
  even when the --log-bin option specified a relative path.

  The existing algorithm on the master for finding a binlog file requested by a slave could only compare
  homogeneous paths, either both absolute or both relative, but never mixed.
  As a result, if the --log-bin value specified a relative path, comparing an always-absolute value
  (read from the index) with a relative one on the buggy 5.1.18 was bound to fail.

  Fixed by keeping `pidfile_name' but stripping off its directory part, which restores the original logic
  of storing names in a format compatible with the --log-bin option.
  The comparison algorithm is refined so that it can match two paths in any format by converting them
  to absolute paths.
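
The refined comparison described in this changeset comment can be sketched as follows. This is a minimal illustration of comparing mixed-format paths by normalizing both to absolute form, not the actual server code; the function names same_binlog and find_in_index and the /var/lib/mysql data directory are assumptions.

```python
import posixpath


def same_binlog(name_a, name_b, datadir):
    """Compare two binlog names that may be relative or absolute.

    Both names are resolved against datadir first; posixpath.join keeps
    an absolute name unchanged and prefixes datadir to a relative one.
    """
    def to_abs(name):
        return posixpath.normpath(posixpath.join(datadir, name))

    return to_abs(name_a) == to_abs(name_b)


def find_in_index(index_entries, requested, datadir):
    """Return the first index entry matching the requested name, or None."""
    for entry in index_entries:
        if same_binlog(entry, requested, datadir):
            return entry
    return None


index_entries = ["./themis-bin.000075", "/var/lib/mysql/themis-bin.000076"]
requested = "./themis-bin.000076"

# A plain string comparison misses the entry because the formats differ.
naive_hit = requested in index_entries
normalized_hit = find_in_index(index_entries, requested, "/var/lib/mysql")
print(naive_hit, normalized_hit)
```

The naive lookup corresponds to the homogeneous-only comparison that failed on a mixed index, while the normalized lookup finds the entry regardless of which format was stored.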
[25 Oct 2007 9:06] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/36322

ChangeSet@1.2543, 2007-10-25 12:05:57+03:00, aelkin@koti.dsl.inet.fi +2 -0
  Bug #28597 Replication doesn't start after upgrading to 5.1.18
  
  Since Bug #20166, which changed binlog file name generation to be based on pidfile_name instead of
  the previous glob_hostname, binlog file names have been generated and stored in the binlog index
  solely in absolute path format,
  even when the --log-bin option specified a relative path.

  The existing algorithm on the master for finding a binlog file requested by a slave could only compare
  homogeneous paths, either both absolute or both relative, but never mixed.
  As a result, if the --log-bin value specified a relative path, comparing an always-absolute value
  (read from the index) with a relative one on the buggy 5.1.18 was bound to fail.

  Fixed by keeping `pidfile_name' but stripping off its directory part, which restores the original logic
  of storing names in a format compatible with the --log-bin option.
  The comparison algorithm is refined so that it can match two paths in any format by converting them
  to absolute paths.

  Two side effects of this fix:

  corrects Bug #27070;
  ensures that buff can no longer overrun (Bug #31836,
  insufficient space reserved for the suffix of relay log file name);
[25 Oct 2007 9:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/36324

ChangeSet@1.2543, 2007-10-25 12:21:46+03:00, aelkin@koti.dsl.inet.fi +3 -0
  Bug #28597 Replication doesn't start after upgrading to 5.1.18
  
  Since Bug #20166, which changed binlog file name generation to be based on pidfile_name instead of
  the previous glob_hostname, binlog file names have been generated and stored in the binlog index
  solely in absolute path format,
  even when the --log-bin option specified a relative path.

  The existing algorithm on the master for finding a binlog file requested by a slave could only compare
  homogeneous paths, either both absolute or both relative, but never mixed.
  As a result, if the --log-bin value specified a relative path, comparing an always-absolute value
  (read from the index) with a relative one on the buggy 5.1.18 was bound to fail.

  Fixed by keeping `pidfile_name' but stripping off its directory part, which restores the original logic
  of storing names in a format compatible with the --log-bin option.
  The comparison algorithm is refined so that it can match two paths in any format by converting them
  to absolute paths.

  Side effects of this fix:

  corrects Bug #27070;
  ensures that buff can no longer overrun (Bug #31836,
  insufficient space reserved for the suffix of relay log file name);
  Bug #31837, --remove_file $MYSQLTEST_VARDIR/tmp/bug14157.sql missed in rpl_temporary.test;
[29 Oct 2007 11:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/36542

ChangeSet@1.2543, 2007-10-29 13:47:25+02:00, aelkin@koti.dsl.inet.fi +3 -0
  Bug #28597 Replication doesn't start after upgrading to 5.1.18
  
  Since Bug #20166, which changed binlog file name generation to be based
  on pidfile_name instead of the previous glob_hostname, binlog file
  names have been generated and stored in the binlog index
  solely in absolute path format,
  even when the --log-bin option specified a relative path.

  The existing algorithm on the master for finding a binlog file requested
  by a slave could only compare homogeneous paths, either both absolute or
  both relative, but never mixed.
  As a result, if the --log-bin value specified a relative path,
  comparing an always-absolute value (read from the index) with a
  relative one on the buggy 5.1.18 was bound to fail.

  Fixed by keeping `pidfile_name' but stripping off its directory part,
  which restores the original logic of storing names in a format
  compatible with the --log-bin option.
  The comparison algorithm is refined so that it can match two paths in
  any format by converting them to absolute paths.

  Side effects of this fix:

  corrects Bug #27070;
  ensures that buff can no longer overrun (Bug #31836,
  insufficient space reserved for the suffix of relay log file name);
  Bug #31837, --remove_file $MYSQLTEST_VARDIR/tmp/bug14157.sql missed
  in rpl_temporary.test;
  fixes Bug #28603, Invalid log-bin default location;
  a minor issue with the malformed line noted in log.cc;
[2 Nov 2007 9:15] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/36956

ChangeSet@1.2543, 2007-11-02 11:15:05+02:00, aelkin@koti.dsl.inet.fi +3 -0
  Bug #28597 Replication doesn't start after upgrading to 5.1.18
  
  Since Bug #20166, which changed binlog file name generation to be based
  on pidfile_name instead of the previous glob_hostname, binlog file
  names have been stored solely in absolute path format,
  even when the --log-bin option specified a relative path.
  More seriously, the binlog file path can unexpectedly point
  to the pid-file directory, so that after any proper fix for this bug
  there might be consequences similar to those in the bug report for one who
[5 Nov 2007 14:31] Sergei Golubchik
Ok to push without MY_UNPACK_FILENAME and with a shorter changeset
comment. http://lists.mysql.com/commits/37092
[5 Nov 2007 15:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/37098

ChangeSet@1.2543, 2007-11-05 17:20:10+02:00, aelkin@koti.dsl.inet.fi +3 -0
  Bug #28597 Replication doesn't start after upgrading to 5.1.18
  
  Since Bug #20166, which changed binlog file name generation to be based
  on pidfile_name instead of the previous glob_hostname, binlog file
  names have been stored solely in absolute path format,
  even when the --log-bin option specified a relative path.
  More seriously, the binlog file path can unexpectedly point
  to the pid-file directory, so that after any proper fix for this bug
  there might be consequences similar to those in the bug report for one who
[27 Nov 2007 10:48] Bugs System
Pushed into 5.0.54
[27 Nov 2007 10:51] Bugs System
Pushed into 5.1.23-rc
[27 Nov 2007 10:53] Bugs System
Pushed into 6.0.4-alpha
[28 Nov 2007 17:24] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented bugfix in the 5.0.54, 5.1.23, and 6.0.4 changelogs as follows:

        Due to a previous change in how the default name and location of
        the binlog file were determined, replication failed following
        some upgrades.
[11 Dec 2007 19:53] Norbert Tretkowski
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-5968

Is this bug a security issue? If so, please document it.
[11 Dec 2007 20:51] Sergei Golubchik
No, it's not a security issue. The URL in the CVE entry is wrong; it points to an unrelated bug. I've asked to have it corrected.
[12 Dec 2007 9:06] Norbert Tretkowski
Which one is the correct bug?
[12 Dec 2007 11:43] Sergei Golubchik
it's BUG#31611
which is (unsurprisingly) marked private until the fixed version is released.
[21 May 2008 14:17] Arjen Lentz
See http://arjen-lentz.livejournal.com/115899.html for more background info and a workaround through the my.cnf.

Note also that Ubuntu has it fixed in its 5.0.51a build through a backport from the 5.0.54 commit @ MySQL.