Bug #13377 After RESET MASTER, the delete of "master-bin.000001" fails on Windows
Submitted: 21 Sep 2005 13:06 Modified: 23 Nov 2005 23:53
Reporter: Kent Boortz
Status: Closed
Category: MySQL Server: Replication Severity: S2 (Serious)
Version: 5.0.13-pre OS: Windows (Windows)
Assigned to: CPU Architecture: Any
Tags: binlog

[21 Sep 2005 13:06] Kent Boortz
Description:
Running several replication test cases, such as 'rpl000001', produces a result diff:

*** r/rpl000001.result  Wed Sep 21 05:53:09 2005
--- r/rpl000001.reject  Wed Sep 21 13:08:56 2005
***************
*** 35,40 ****
--- 35,42 ----
  drop table t1,t3;
  create table t1 (n int) engine=myisam;
  reset master;
+ Warnings:
+ Error 6       Error on delete of 'C:\cygwin\home\mysqldev\build\mysql-5.0.13-rc-build-pro\mysql-pro-5.0.13-rc-win32\mysql-test\var\log\master-bin.000001' (Errcode: 13)
  stop slave;
  reset slave;
  lock tables t1 read;

How to repeat:
Run the test suite on Windows

  ./mysql-test-run.pl rpl000001

Suggested fix:
The likely cause is that the file is still open when the server
tries to delete it. The solution is to close it first.
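
A minimal sketch of the close-before-delete ordering suggested above. This is not server code: the helper name close_then_delete and the file name are hypothetical, and it uses plain C stdio instead of the mysys wrappers. Errcode 13 is EACCES, which is what remove() on Windows typically reports for a file that is still open.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of MySQL): create a small log file,
   close it, then delete it. Returns 0 on success, -1 on failure;
   errno is left as set by the failing call. */
int close_then_delete(const char *path)
{
  FILE *log;
  assert(path != NULL);

  log= fopen(path, "w");
  if (log == NULL)
    return -1;
  fputs("binlog placeholder\n", log);

  /* Close first: on Windows, remove() on a file that is still open
     typically fails with errno 13 (EACCES). */
  if (fclose(log) != 0)
    return -1;

  if (remove(path) != 0)
    return -1;
  return 0;
}
```

Calling close_then_delete("master-bin.000001") should return 0 and leave no file behind. Swapping the fclose() and remove() calls is the ordering that would be expected to fail on Windows.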
[21 Sep 2005 13:24] Guilhem Bichot
Kent,
You could talk to S. Vlasenko about it, as he knows this problem, has asked me questions about it, and was working on it at least a few days ago. I'm not a bug assigner, but I think it should be assigned to him. By the way, the solution is not obvious: the file is opened by *another* thread, and that's why the deletion fails.
[21 Sep 2005 18:58] MySQL Verification Team
Finding  Tests in the 'main' suite
Starting Tests in the 'main' suite

TEST                            RESULT
-------------------------------------------------------

rpl000001                       [ fail ]
Errors are (from D:/cygwin/home/miguel/mysql/mysql-test/var/log/mysqltest-time):
[3 Oct 2005 16:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/30648
[17 Oct 2005 15:36] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/31183
[17 Oct 2005 15:42] Sergey Vlasenko
Patch is available in 5.0.16
[20 Oct 2005 7:57] Jon Stephens
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Documented fix in 5.0.16 changelog. Closed.
[20 Oct 2005 8:15] Guilhem Bichot
Sergey,
I'm setting this to "in-progress" because in your pushed patch, as we discussed, there are a couple of places where rli->* variables are set without holding a mutex, and this is sure to lead to bugs (crashes etc.).
[21 Oct 2005 21:50] Guilhem Bichot
Per Elliot I'll do the review of the new fixes when they are ready.
[24 Oct 2005 14:52] Guilhem Bichot
New patch received from Sergey, new review comments sent to him.
[26 Oct 2005 21:43] Guilhem Bichot
I have done a first review of the patch and still had comments. Sergei Golubchik is currently doing a second review to decide what is best for the patch.
If absolutely needed because of the 5.0.16 deadline, we could split the patch into two parts (but check with Sergei Golubchik first to make sure this does not collide with his review):
1) patch to slave.cc, which fixes the crash and is already approved (by me), and so can be pushed right away:
--- 1.260/sql/slave.cc  2005-10-19 01:43:14 +04:00
+++ 1.261/sql/slave.cc  2005-10-21 17:17:00 +04:00
@@ -582,6 +582,14 @@

   rli->slave_skip_counter=0;
   pthread_mutex_lock(&rli->data_lock);
+
+  if (rli->cur_log_fd >= 0)
+  {
+    end_io_cache(&rli->cache_buf);
+    my_close(rli->cur_log_fd, MYF(MY_WME));
+    rli->cur_log_fd= -1;
+  }
+
   if (rli->relay_log.reset_logs(thd))
   {
     *errmsg = "Failed during log reset";
@@ -3692,14 +3700,6 @@
   mi->slave_running = 0;
   mi->io_thd = 0;

-  /* Close log file and free buffers */
-  if (mi->rli.cur_log_fd >= 0)
-  {
-    end_io_cache(&mi->rli.cache_buf);
-    my_close(mi->rli.cur_log_fd, MYF(MY_WME));
-    mi->rli.cur_log_fd= -1;
-  }
-
   /* Forget the relay log's format */
   delete mi->rli.relay_log.description_event_for_queue;
   mi->rli.relay_log.description_event_for_queue= 0;
@@ -3915,14 +3915,6 @@
   /* we die so won't remember charset - re-update them on next thread start */
   rli->cached_charset_invalidate();
   rli->save_temporary_tables = thd->temporary_tables;
-
-  /* Close log file and free buffers if it's already open */
-  if (rli->cur_log_fd >= 0)
-  {
-    end_io_cache(&rli->cache_buf);
-    my_close(rli->cur_log_fd, MYF(MY_WME));
-    rli->cur_log_fd = -1;
-  }

   /*
     TODO: see if we can do this conditionally in next_event() instead

2) patch for other files (sql_repl.cc, log.cc, sql_class.h etc), which is less critical (no crash, just coding issues). This is where I still have comments, and it is still being discussed between Sergei and Sergey.
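
To illustrate what part 1) does, here is a reduced sketch of the same idea: the descriptor is closed and reset to -1 while data_lock is held, just before the file is removed, so no other thread can see a half-updated state. The struct and the functions demo_open_log and demo_purge_log are hypothetical stand-ins, not the real rli code. The sketch uses POSIX open()/close() and pthreads directly, not my_close() and end_io_cache().

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for the relay-log state (not MySQL's struct). */
typedef struct demo_relay_info
{
  pthread_mutex_t data_lock;
  int cur_log_fd;             /* -1 when no log file is open */
  const char *log_name;
} DEMO_RELAY_INFO;

/* Initialize the state and open (create) the log file. */
int demo_open_log(DEMO_RELAY_INFO *rli, const char *name)
{
  assert(rli != NULL && name != NULL);
  pthread_mutex_init(&rli->data_lock, NULL);
  rli->log_name= name;
  rli->cur_log_fd= open(name, O_CREAT | O_WRONLY | O_TRUNC, 0600);
  return rli->cur_log_fd >= 0 ? 0 : -1;
}

/* Close the descriptor under the lock, then delete the file. */
int demo_purge_log(DEMO_RELAY_INFO *rli)
{
  int error;
  assert(rli != NULL);

  pthread_mutex_lock(&rli->data_lock);
  if (rli->cur_log_fd >= 0)
  {
    close(rli->cur_log_fd);
    rli->cur_log_fd= -1;      /* updated only while holding data_lock */
  }
  error= remove(rli->log_name);
  pthread_mutex_unlock(&rli->data_lock);
  return error;
}
```

After demo_open_log() succeeds, demo_purge_log() should return 0, leave cur_log_fd at -1, and remove the file. Keeping the close and the delete inside one critical section is the design point; whether the real patch needs further locking elsewhere is what the review is about.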
[3 Nov 2005 15:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/31889
[3 Nov 2005 16:30] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/31892
[14 Nov 2005 13:05] Sergey Vlasenko
Patch is available in 5.0.17
[14 Nov 2005 22:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/32237
[14 Nov 2005 22:17] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/32243
[23 Nov 2005 23:53] Paul DuBois
Moved the changelog entry from 5.0.16 to 5.0.17.