Bug #30901 Test environment on Windows not properly reconnecting after a crash
Submitted: 7 Sep 2007 15:23 Modified: 18 Apr 2008 12:56
Reporter: Chuck Bell
Status: No Feedback
Category: MySQL Server: Tests Severity: S3 (Non-critical)
Version: 5.0.48, 5.1.22 OS: Any
Assigned to: Magnus Blåudd CPU Architecture: Any
Triage: D3 (Medium)

[7 Sep 2007 15:23] Chuck Bell
Description:
The test environment does not correctly reconnect to the server once an expected crash has occurred.

This problem was found while creating the patch for BUG#26395. The test rpl_half_binlog_trans was modeled after the commit_crash_before test. 

It appears that the test environment does not properly restart the master after the crash. As a result, the row on the slave is not rolled back, presumably because the slave has not reestablished its connection with the master.

Perhaps the worst part is that this does not happen every time the rpl_half_binlog_trans test is run on Windows. Sometimes the test runs fine and passes. Other times it produces the wrong results, and on rare occasions it fails to connect to the slave, resulting in a "cannot connect" error.

The typical (failed) test results are:

rpl.rpl_half_binlog_trans      [ fail ]

Errors are (from c:/source/c++/mysql-5.1_BUG_26395/mysql-test/var/log/mysqltest-time) :
mysqltest: Result length mismatch
(the last lines may be the most important ones)
Below are the diffs between actual and expected results:
-------------------------------------------------------
*** c:/source/c++/mysql-5.1_BUG_26395/mysql-test/suite/rpl/r/rpl_half_binlog_trans.result       Fri Sep  7 16:52:28 2007
--- c:/source/c++/mysql-5.1_BUG_26395/mysql-test/suite/rpl/r/rpl_half_binlog_trans.reject       Fri Sep  7 17:25:25 2007
***************
*** 22,27 ****
--- 22,28 ----
  SELECT on slave
  SELECT * FROM t;
  a
+ 1                        <---- this row should roll back.
  DROP TABLE IF EXISTS t;
  Cleanup
  DROP TABLE IF EXISTS t;
-------------------------------------------------------

The test runs fine on Linux in both v5.0 and v5.1.

How to repeat:
Remove the --source include/not_windows.inc line from the rpl_half_binlog_trans test and run the test on Windows. Several iterations may be needed before the failure shows up.

Example:

./mysql-test-run.pl --force rpl_half_binlog_trans rpl_half_binlog_trans rpl_half_binlog_trans rpl_half_binlog_trans rpl_half_binlog_trans rpl_half_binlog_trans rpl_half_binlog_trans rpl_half_binlog_trans rpl_half_binlog_trans rpl_half_binlog_trans rpl_half_binlog_trans 
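
Since the failure is intermittent, it can be convenient to generate the repeated command line rather than type it by hand. The following is only a minimal sketch: the helper name build_repeat_command and the iteration count are hypothetical, and the only options used are the ones already shown in the command above (the runner path and --force).

```python
import shlex


def build_repeat_command(test_name, iterations, runner="./mysql-test-run.pl"):
    """Return an argument list that names the same test `iterations` times.

    Hypothetical helper: it only mirrors the hand-written command above
    (runner, --force, then the test name repeated).
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    return [runner, "--force"] + [test_name] * iterations


if __name__ == "__main__":
    # Print a shell-ready command line; the count of 10 is an arbitrary choice.
    cmd = build_repeat_command("rpl_half_binlog_trans", 10)
    print(shlex.join(cmd))
```

The resulting list can also be passed directly to subprocess.run() from the mysql-test directory if the runs are to be started from Python.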

Suggested fix:
Change the test suite so that it properly restarts the server and reconnects to it on Windows. Remove the Windows restriction from the rpl_half_binlog_trans test once this is fixed.
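
One way to picture the kind of check meant here is a loop that waits until the restarted server accepts connections again before the test continues. This is only an illustrative sketch in Python, not how the test suite (which is written in Perl) actually works; the function name wait_for_server, the timeout, and the polling interval are all assumptions. Note also that an accepted TCP connection does not by itself prove that crash recovery has finished.

```python
import socket
import time


def wait_for_server(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to (host, port) succeeds.

    Returns True once a connection is accepted, or False if `timeout`
    seconds pass first. Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening again.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: give up at the deadline, else retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A test driver could call this with the master's host and port after the expected crash and continue with the slave-side checks only if it returns True.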
[7 Sep 2007 16:19] Chuck Bell
I have just confirmed that if the test is run enough times on Linux, it eventually fails with the same problem that occurs on Windows. The test suite is therefore more reliable on Linux, but it eventually hits the same failure: the slave is not properly reconnected to the master, and the row on the slave is not rolled back.
[10 Sep 2007 22:47] Miguel Solorzano
Thank you for the bug report.
[5 Mar 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[18 Mar 2008 12:56] Susanne Ebrecht
Chuck,

we are still waiting for feedback.
[18 Apr 2008 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".