Bug #44055 MTR2: check-testcase is wrong when a previous test was skipped
Submitted: 2 Apr 2009 21:05
Modified: 11 Oct 2010 14:09
Reporter: Guilhem Bichot
Status: Duplicate
Category: Tools: MTR / mysql-test-run
Severity: S3 (Non-critical)
Version: 6.0-bzr
OS: Linux
Assigned to: Bjørn Munch
CPU Architecture: Any
Tags: pushbuild, sporadic, test failure

[2 Apr 2009 21:05] Guilhem Bichot
Description:
./mtr --mem --force --max-test-fail=0 --ps-protocol --mysqld=--binlog-format=row --no-reorder rpl.rpl_blackhole rpl.rpl_bit_npk --retry-failure=0
rpl.rpl_blackhole                        [ skipped ]  Test requires: 'true'
rpl.rpl_bit_npk                          [ pass ]    168

MTR's internal check of the test case 'rpl.rpl_bit_npk' failed.
This means that the test case does not preserve the state that existed
before the test case was executed.  Most likely the test case did not
do a proper clean-up.
This is the diff of the states of the servers before and after the
test case was executed:
mysqltest: Logging to '/home/mysql_src/bzrrepos/mysql-6.0-maria/mysql-test/var/tmp/check-mysqld_1.log'.
mysqltest: Results saved in '/home/mysql_src/bzrrepos/mysql-6.0-maria/mysql-test/var/tmp/check-mysqld_1.result'.
mysqltest: Connecting to server localhost:10030 (socket /home/mysql_src/bzrrepos/mysql-6.0-maria/mysql-test/var/tmp/mysqld.1.sock) as 'root', connection 'default', attempt 0 ...
mysqltest: ... Connected.
mysqltest: Start processing test commands from './include/check-testcase.test' ...
mysqltest: ... Done processing test commands.
--- /home/mysql_src/bzrrepos/mysql-6.0-maria/mysql-test/var/tmp/check-mysqld_1.result   2009-04-02 23:56:49.906689295 +0300
+++ /home/mysql_src/bzrrepos/mysql-6.0-maria/mysql-test/var/tmp/check-mysqld_1.reject   2009-04-02 23:56:50.294713545 +0300
@@ -319,7 +319,6 @@
 def    mysql   latin1  latin1_swedish_ci       NULL
 def    test    latin1  latin1_swedish_ci       NULL
 tables_in_test
-t1
 tables_in_mysql
 mysql.backup_history
 mysql.backup_progress

So, when the second test starts, a table named t1 already exists. How can this be, given that the previous test (rpl_blackhole) was skipped?
My debugging suggests that the previous test was indeed skipped, but only after its first "CREATE TABLE t1" had been executed. I added --gdb to the mtr command line and put a breakpoint in mi_create(); the breakpoint is hit for a table t1 while mysqltest is running the rpl_blackhole test (I can tell which test mysqltest is running by using "ps"). Another way to see it: I remove --gdb and add --debug; in the debug trace I see
prep_query: CREATE TABLE t1 (a INT, b INT, c INT)
which is the first query of rpl_blackhole and normally runs only after the check for blackhole's availability (so how come this query was run?). After this query has executed, the next query is apparently the first one of rpl_bit_npk.
So it looks like rpl_blackhole is allowed to execute its first CREATE TABLE and is then terminated, which is why rpl_bit_npk finds "t1".

How to repeat:
see description

Suggested fix:
I wonder if it could have to do with BUG#40298
[6 Oct 2010 11:33] Bjørn Munch
To avoid this, MTR probably has to always restart servers after a test finds out it has to be skipped. Bug #52828 implements a special case for this.

Perhaps we could find a way for a test to indicate that it skipped "early", before having done anything.
[11 Oct 2010 13:45] Bjørn Munch
This was in fact caused by a bug in include/have_blackhole.inc:

-----
disable_query_log;
--require r/true.require
let $have_blackhole=`select (support = 'YES' or support = 'DEFAULT') as `TRUE` 
                   from information_schema.engines where engine = 'blackhole'`;
if (!$have_blackhole)
{
  skip Test needs the Blackhole storage engine;
}
enable_query_log;
-----

The check on the $have_blackhole variable works, but the --require (which should have been removed) is left dangling. It triggers on the first actual SQL statement, which is the CREATE. This causes the test to be skipped even if we do have the blackhole engine, because the CREATE does not output "TRUE 1".

Since this code is in 6.0 only (not in current trunk), I close this as Won't Fix. If it should be fixed, the bug is in the .inc file, not in MTR.
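
For illustration only, a minimal sketch of what the .inc file might look like with the dangling --require removed. This assumes that dropping that line is the only change needed; the "as `TRUE`" alias is also dropped here on the assumption that it only mattered for matching r/true.require. The change actually made under Bug #42981 may differ.

-----
# Sketch, not the committed fix: same check as above, without --require
disable_query_log;
let $have_blackhole=`select (support = 'YES' or support = 'DEFAULT')
                   from information_schema.engines where engine = 'blackhole'`;
if (!$have_blackhole)
{
  skip Test needs the Blackhole storage engine;
}
enable_query_log;
-----

With no pending --require, the first CREATE TABLE in rpl_blackhole would no longer be compared against r/true.require, so the test would be skipped only by the explicit skip command, before it has created anything.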
[11 Oct 2010 14:09] Bjørn Munch
Turns out this was actually fixed by Bug #42981.