Bug #19908 Test suite does not clean up after IM-test failure
Submitted: 18 May 2006 11:10
Modified: 18 Aug 2006 13:39
Reporter: Joerg Bruehe
Status: Closed
Category: MySQL Server: Tests
Severity: S3 (Non-critical)
Version: 5.0.20 and up, 5.1.10-beta and up
OS: Unix (various)
Assigned to: Alexander Nozdrin
CPU Architecture: Any

[18 May 2006 11:10] Joerg Bruehe
Description:
Test build of 5.1.10-beta.

On the machines "nocona", "pegasos3", and "sol9x86", 
test "im_options_unset" fails (bug#18020 + bug#18027).

When this happens, the subsequent "im_utils" may fail or pass,
but then _all_ subsequent "index_merge*" and "information_schema*" tests fail with this symptom:
=====
index_merge                    [ fail ]

Errors are (from /PATH/mysqltest-time) :
mysqltest: Could not open connection 'default': 2002 Can't connect to local MySQL server through socket '/PATH/master.sock' (2)
(the last lines may be the most important ones)

Killing Possible Leftover Processes
=====

Exactly the same symptom for all these tests:
index_merge
index_merge_bdb
index_merge_innodb2
index_merge_innodb
index_merge_ror
index_merge_ror_cpk
information_schema
information_schema_db
information_schema_inno
information_schema_part

The exact occurrences:
nocona-icc-glibc23-5.1-community.log   debug
pegasos3-glibc23-5.1-community.log   debug
pegasos3-glibc23-5.1-community.log   normal
pegasos3-glibc23-5.1-community.log   normal+rowrepl
sol9x86-5.1-community.log   normal+rowrepl
These are exactly in sync with the failing "im_options_unset",
regardless of the result of the in-between "im_utils".

Tests start to pass again with "init_connect", in all these runs.

(Similar failures are in 5.1.9 and from 5.0.20 onwards, 
but they have not yet been checked for the correlation described above.)
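
A minimal sketch of how that correlation could be checked against a saved log. It assumes result lines shaped like the "index_merge [ fail ]" line quoted above; the "[ pass ]" status token, the function names, and the sample log are illustrative assumptions, not taken from the actual build logs.

```python
import re

# Matches result lines such as "index_merge                    [ fail ]".
# Only the "[ fail ]" form appears in this report; other statuses are assumed
# to use the same bracketed layout.
RESULT_LINE = re.compile(r"^(\S+)\s+\[\s*(\w+)\s*\]\s*$")


def parse_results(log_text):
    """Return (test_name, status) pairs in the order they appear in the log."""
    results = []
    for line in log_text.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            results.append((match.group(1), match.group(2)))
    return results


def failures_after(results, trigger):
    """List the tests that failed after the first failure of `trigger`."""
    for index, (name, status) in enumerate(results):
        if name == trigger and status == "fail":
            return [n for n, s in results[index + 1:] if s == "fail"]
    return []


# Hypothetical, hand-written excerpt used only to exercise the functions.
SAMPLE_LOG = """\
im_options_unset               [ fail ]
im_utils                       [ pass ]
index_merge                    [ fail ]
information_schema             [ fail ]
init_connect                   [ pass ]
"""

if __name__ == "__main__":
    print(failures_after(parse_results(SAMPLE_LOG), "im_options_unset"))
```

Running this over each 5.0.x and 5.1.9 log would show whether the "index_merge*" and "information_schema*" failures always follow a failing IM test.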

How to repeat:
Detected by the test suite.

Suggested fix:
Check whether "init_connect" contains some special reset actions and,
if so, transfer them to the test suite script directly.

Check whether this is also applicable to 5.0,
as we have similar symptoms there starting from 5.0.20.
[21 Jun 2006 8:15] Alexander Nozdrin
The problem has nothing to do with index_merge or any subsequent tests.
The problem is that after an IM test failure, one mysqld instance stays alive.
Actually, this is a duplicate of BUG#18023.
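
For illustration only, a sketch of how a leftover server process could be stopped from its pid file. This is not the fix for BUG#18023; the pid-file path, the function names, and the grace period are assumptions, and the code relies on Unix signals.

```python
import os
import signal
import time


def process_alive(pid):
    """Return True if a process with this pid exists (signal 0 only probes)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def kill_leftover(pid_file, grace_seconds=5.0):
    """Stop the process named in pid_file; return True if one was signalled."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (FileNotFoundError, ValueError):
        return False  # No usable pid file, nothing to clean up.
    if not process_alive(pid):
        return False
    os.kill(pid, signal.SIGTERM)  # Ask politely first.
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if not process_alive(pid):
            return True
        time.sleep(0.1)
    try:
        os.kill(pid, signal.SIGKILL)  # Still there: force it.
    except ProcessLookupError:
        pass  # It exited between the last probe and this call.
    return True
```

A call such as kill_leftover("/PATH/master.pid") after a failed IM test would be one place to use it; that file name is hypothetical.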
[31 Jul 2006 15:54] Joerg Bruehe
Even though the fixes for bug#18023 are in the sources, the problem still occurs:
Some "im_*" tests fail, and "index_merge*" + "information_schema*" suffer.

Details:

1) One (or more) of the "im_*" tests fails,
   in 5.0.24 these are
im_daemon_life_cycle   :: bug#15934 bug#18020 bug#21331
im_life_cycle          :: bug#15934 bug#18020 bug#21333 bug#21364 bug#21365
im_options_set         :: bug#15934 bug#18020
im_options_unset       :: bug#15934 bug#18020 bug#18027 bug#21331
im_utils               :: bug#15934 bug#18020 bug#18033 bug#21366

2) All following "index_merge*" and "information_schema*" tests will fail like this:
=====
index_merge                    [ fail ]

Errors are (from /PATH/mysqltest-time) :
mysqltest: Could not open connection 'default': 2002 Can't connect to local MySQL server through socket '/PATH/master.sock' (2)
(the last lines may be the most important ones)

Killing Possible Leftover Processes
=====
Same symptom for all these tests:
index_merge
index_merge_bdb
index_merge_innodb2
index_merge_innodb
index_merge_ror
index_merge_ror_cpk
information_schema
information_schema_chmod
information_schema_db
information_schema_inno

3) The next test is "init_connect", and it passes.

Note that if the "im_*" test passes or is skipped, then the other tests will pass.
This is visible not only from the "PS" test run passing (where IM tests are skipped), but very clearly from some "max" and "cluster" builds ("ita2", "pegasos3", "rhas3-x86", "sol9x86") in the 5.0.24 status page:
When the suite is run both with and without NDB (using the same binary) and no IM test fails in only one of these runs, then the "index_merge*" and "information_schema*" tests pass in that run.

Suggestion: 
If the IM test failures are so hard to fix (races, ...),
ensure that the cleanup effect of "init_connect" is applied immediately after the "im_*" tests,
so that even an IM problem does not block the "index_merge*" and "information_schema*" tests from being run.
As it stands, the blocked tests reduce test coverage!
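
The suggestion amounts to running a cleanup step right after any failing test. A minimal sketch of that idea follows; run_suite, FakeServerState, and the callbacks are hypothetical names used for illustration and do not reflect how the real test suite script is structured.

```python
def run_suite(tests, run_test, cleanup):
    """Run tests in order, calling cleanup() after every failing test."""
    results = {}
    for name in tests:
        try:
            passed = bool(run_test(name))
        except Exception:
            passed = False  # A crashing test counts as a failure.
        results[name] = "pass" if passed else "fail"
        if not passed:
            cleanup()  # Reset state so later tests start clean.
    return results


class FakeServerState:
    """Stand-in environment: a failed IM test leaves a stale server behind."""

    def __init__(self):
        self.leftover = False

    def run_test(self, name):
        if name.startswith("im_"):
            self.leftover = True  # Simulate the failing IM test.
            return False
        return not self.leftover  # Other tests fail while the leftover exists.

    def cleanup(self):
        self.leftover = False


if __name__ == "__main__":
    state = FakeServerState()
    tests = ["im_options_unset", "index_merge", "information_schema"]
    print(run_suite(tests, state.run_test, state.cleanup))
```

With the cleanup hook in place, only the simulated IM test fails; passing a no-op cleanup instead reproduces the cascade described in this report.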
[18 Aug 2006 13:41] Alexander Nozdrin
The patch is in the main 5.0 tree, currently tagged 5.0.25.