Bug #52002 binlog_index may cause incorrect failure report on failing test that follows it
Submitted: 12 Mar 2010 14:57 Modified: 11 Jan 2011 16:47
Reporter: Luis Soares
Status: Closed
Category: Tests Severity: S3 (Non-critical)
Version: 5.5.3 OS: Any
Assigned to: Tor Didriksen CPU Architecture: Any
Triage: Triaged: D3 (Medium)

[12 Mar 2010 14:57] Luis Soares
Description:
binlog_index may cause an incorrect failure report for a
subsequent test that fails after binlog_index has executed.

binlog_index causes an intentional server crash, which leaves a
core file behind. If binlog_index succeeds and the following test
uses the same mysqld datadir and fails, MTR picks up the stale core
and shows it as part of that test's failure report.

How to repeat:
I tested this in next-mr-bugfixing, but it may also affect
trunk-bugfixing (and other versions that have the binlog_index test case):

1. bzr clone -r revid:luis.soares@sun.com-20100312124230-oayrhflswin2pcaj $BZR_REPO/mysql-next-mr-bugfixing

2. cd mysql-next-mr-bugfixing

3. ./BUILD/compile-pentium64-debug-max

4. cd mysql-test

5. (FAKE A FAILURE)

   mv r/mysqlbinlog_row .

6. ./mtr --mysqld=--binlog-format=row binlog_index mysqlbinlog_row

  ===> Observe MTR stating that we should call "exit;" at the end
       of the script, *and* surprisingly, there is a backtrace in
       the failure report from the previous succeeding test case,
       which was binlog_index!!
 
       This is confusing and misleading and drives devs in the wrong
       direction when analyzing failures.

Suggested fix:
Perhaps MTR should remove benign cores?
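
As a rough illustration of the "remove benign cores" idea, here is a minimal Python sketch. It is not MTR code (MTR itself is a Perl script) and only shows the bookkeeping: snapshot the cores that exist before a test that is expected to crash, then delete only the ones that appeared afterwards. The helper names `find_cores` and `remove_new_cores` are hypothetical, and the assumption that core files are named `core` or `core.<pid>` depends on the system's core pattern.

```python
from pathlib import Path


def find_cores(datadir):
    """Return the set of regular files under datadir that look like core dumps.

    Assumption: cores are named "core" or "core.<something>".
    """
    root = Path(datadir)
    return {
        p
        for p in root.rglob("*")
        if p.is_file() and (p.name == "core" or p.name.startswith("core."))
    }


def remove_new_cores(datadir, cores_before):
    """Delete cores that were not present in the earlier snapshot.

    Returns the sorted list of removed paths so the caller can log them.
    """
    removed = []
    for core in sorted(find_cores(datadir) - set(cores_before)):
        core.unlink()
        removed.append(core)
    return removed
```

A test runner could call `find_cores` before a test that intentionally crashes the server and `remove_new_cores` once that test has passed, so that a later failing test does not inherit the stale core.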
[12 Mar 2010 15:01] Luis Soares
In the last step of the how to repeat (the ./mtr run), replacing mysqlbinlog_row with a
mock test case that just fails (binlog_fail.test):

-- source include/have_log_bin.inc
-- fail

$ ./mtr --mysqld=--binlog-format=row  binlog_index binlog_fail

We get the stack trace for a FLUSH LOGS command in the binlog_fail report!
[12 Mar 2010 16:59] Valeriy Kravchuk
Verified just as described with mysql-trunk on Mac OS X.
[16 Mar 2010 15:53] Luis Soares
I was told to add/set in binlog_index-master.opt:

  --skip-core-file

This fixes the particular case of binlog_index. However, I would
like to add two considerations:

  1. --skip-core-file makes binlog_index skip *any* core,
     whether it is benign or malicious. This means that if the
     test produces an unexpected core dump in the future, it will
     be skipped unconditionally.

     MTR will still report a failure, but it will not report a stack
     trace. This is fine if the hypothetical failure is
     deterministic, but the core may be invaluable otherwise.

  2. This is a fix for this particular test case and it does not
     prevent similar situations from arising with other tests that
     force server crashes. However, I think it is the user's
     responsibility to deploy the correct skip-core-file
     option. If there is a better way to tell MTR to
     suppress specific core dumps (see item #1), then the user should
     probably use it.

Anyway, IMHO, it must be hard for MTR to distinguish good from
bad cores and the effort required may not be worth it. Overall I
am fine with closing the bug with --skip-core-file in the
-master.opt file.
[18 Mar 2010 10:34] Mattias Jonsson
I reported bug#52172 and am fine with the --skip-core-file solution if it is also added in 5.1. I don't think it would be too much hassle to rerun the test if it fails, in order to get the wanted core file, but it is very annoying to get a lot of core files to clean up after a successful test :)
[21 Sep 2010 15:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118733

3298 Tor Didriksen	2010-09-21
      Bug#52002 binlog_index may cause incorrect failure report on failing test that follows it
      
      For crash testing: kill the server without generating core file.
     @ include/my_dbug.h
        Use kill(getpid(), SIGKILL) which cannot be caught by signal handlers.
     @ sql/binlog.cc
        Kill server without generating core.
     @ sql/handler.cc
        Kill server without generating core.
     @ unittest/gunit/CMakeLists.txt
        Add unit test.
     @ unittest/gunit/dbug-t.cc
        Add unit test.
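
To illustrate why a self-sent SIGKILL suits crash testing, here is a minimal sketch in Python rather than the C used by the patch; it is not taken from the commit. It assumes a POSIX system, and the helper names `run_self_kill` and `can_install_sigkill_handler` are hypothetical. The default action for SIGKILL is to terminate the process without writing a core, whereas the SIGABRT raised by abort() has a core-dumping default action, and SIGKILL cannot have a handler installed for it.

```python
import signal
import subprocess
import sys

# Child program: send SIGKILL to itself, mirroring kill(getpid(), SIGKILL).
# The print must never run, because the process dies inside os.kill().
CHILD = """
import os, signal
os.kill(os.getpid(), signal.SIGKILL)
print("unreachable")
"""


def run_self_kill():
    """Run the child and return (returncode, stdout).

    On POSIX, subprocess reports death by signal N as returncode -N.
    """
    proc = subprocess.run(
        [sys.executable, "-c", CHILD], capture_output=True, text=True
    )
    return proc.returncode, proc.stdout


def can_install_sigkill_handler():
    """Try to install a handler for SIGKILL; the OS is expected to refuse."""
    try:
        signal.signal(signal.SIGKILL, lambda signum, frame: None)
    except (OSError, ValueError):
        # OSError: the kernel rejects handlers for SIGKILL.
        # ValueError: signal.signal() was called outside the main thread.
        return False
    return True
```

The parent sees the child terminated by SIGKILL with no output, which is the behaviour a crash test wants: the server disappears abruptly, and no core is left in the datadir for a later test to trip over.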
[21 Sep 2010 16:01] Anitha Gopi
Tor,
Assigning to you as per your discussion with Luis
Anitha
[28 Sep 2010 11:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119256

3300 Tor Didriksen	2010-09-28
      Bug#52002 binlog_index may cause incorrect failure report on failing test that follows it
      
      For crash testing: kill the server without generating core file.
     @ include/my_dbug.h
        Use kill(getpid(), SIGKILL) which cannot be caught by signal handlers.
        All DBUG_XXX macros should be no-ops in optimized mode, do that for DBUG_ABORT as well.
     @ sql/binlog.cc
        Kill server without generating core.
     @ sql/handler.cc
        Kill server without generating core.
     @ unittest/gunit/CMakeLists.txt
        Add unit test.
     @ unittest/gunit/dbug-t.cc
        Add unit test.
[29 Sep 2010 7:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119333

3305 Tor Didriksen	2010-09-29
      Bug#52002 binlog_index may cause incorrect failure report on failing test that follows it
      
      For crash testing: kill the server without generating core file.
     @ include/my_dbug.h
        Use kill(getpid(), SIGKILL) which cannot be caught by signal handlers.
        All DBUG_XXX macros should be no-ops in optimized mode, do that for DBUG_ABORT as well.
     @ sql/binlog.cc
        Kill server without generating core.
     @ sql/handler.cc
        Kill server without generating core.
     @ unittest/gunit/CMakeLists.txt
        Add unit test.
     @ unittest/gunit/dbug-t.cc
        Add unit test.
[1 Oct 2010 9:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119604

3310 Tor Didriksen	2010-10-01
      Bug#52002 binlog_index may cause incorrect failure report on failing test that follows it
      
      For crash testing: kill the server without generating core file.
     @ include/my_dbug.h
        Use kill(getpid(), SIGKILL) which cannot be caught by signal handlers.
        All DBUG_XXX macros should be no-ops in optimized mode, do that for DBUG_ABORT as well.
     @ sql/binlog.cc
        Kill server without generating core.
     @ sql/handler.cc
        Kill server without generating core.
     @ unittest/gunit/CMakeLists.txt
        Add unit test.
     @ unittest/gunit/dbug-t.cc
        Add unit test.
[18 Oct 2010 11:35] Tor Didriksen
Pushed as
http://bugs.mysql.com/bug.php?id=52172
to 5.1-bugteam, 5.5-bugteam, and trunk-merge.
[11 Jan 2011 6:49] Tor Didriksen
This only affects internal testing.
[11 Jan 2011 16:47] Paul Dubois
No changelog entry needed.