Bug #46327 MTR2 prevents gcov data accumulation
Submitted: 21 Jul 2009 18:37 Modified: 19 Dec 2009 0:08
Reporter: Ingo Strüwing
Status: Closed
Category:Tools: MTR / mysql-test-run Severity:S7 (Test Cases)
Version:5.4, 5.1 OS:Linux
Assigned to: Bjørn Munch CPU Architecture:Any

[21 Jul 2009 18:37] Ingo Strüwing
Description:
Running a list of test cases in MTR2 does not always update the gcov counters. This makes it impossible to do coverage testing on a developer's machine.

Running the whole test suite behaves better, but sometimes some lines of code are still not counted.

After fixing MTR v1 to be able to run the tests (--language, --debug-sync-timeout), I found that MTR v1 behaves correctly. It is a reliable tool for coverage testing.

How to repeat:
Branch 5.4.
Do a gcov build.
Change a few lines in the code.
Run a test case that covers the changed lines.
Run dgcov --uncommitted

Suggested fix:
If MTR2 cannot be made to reliably update the gcov counters, please keep MTR1 up to date.
[21 Jul 2009 18:43] Ingo Strüwing
Suggested triage values:
Defect: Serious. Makes gcov testing impossible.
Workaround: partial. One can patch and use MTR1.
Impact: substantial. If all developers would do gcov testing...
[21 Jul 2009 18:44] Ingo Strüwing
Patch can be used to make MTR1 ready for use as a workaround.

Attachment: MTR1-6.0-1.diff (text/x-diff), 3.27 KiB.

[2 Sep 2009 20:16] Ingo Strüwing
On a current mysql-6.0-backup I made some experiments.
I have Ubuntu 9.04 64-bit on a quad-core Intel processor.
I tried to run MTR1 and MTR2 in different constellations.
Note that I do *not* use the --gcov command line option of MTR.
Instead, I purge gcov statistics with ~/internals/dev/dgcov/dgcov.pl -p.
I analyze the statistics with:
    mysql-test/mysql-test-lcov.pl
    firefox file://`pwd`/mysql-test/mysql-test-lcov/index.html
As coverage numbers (see below) I use the number behind "sql/backup".
(I need to patch mysql-test/mysql-test-lcov.pl because it fails on
 sql_yacc.yy on my system, but others didn't report such a problem.
 I can add the patch if required, though.)
(Note that GCOV_PREFIX= prevents a crash in gcov_exit() in mysqltest.)

Now the tests:

1. GCOV_PREFIX= MTR_VERSION=1 ./mysql-test-run.pl --force --suite=backup,backup_engines,backup_ptr --mysqld=--mysql-backup
   Many tests fail, but still:
   ~75% coverage

2. GCOV_PREFIX= MTR_PARALLEL=1 ./mysql-test-run.pl --force --suite=backup,backup_engines,backup_ptr
   All 99 tests were successful.
   ~10% coverage

3. GCOV_PREFIX= MTR_PARALLEL=1 ./mysql-test-run.pl --force
   Took 4 hours, All 1264 tests were successful.
   ~50% coverage

4. GCOV_PREFIX= MTR_PARALLEL=4 ./mysql-test-run.pl --force --mem
   Took 1 hour, Failed 1/1264 tests (rpl.rpl_killed_ddl)
   ~50% coverage
[3 Sep 2009 6:52] Philip Stoev
Bjorn, this is a 100% real problem that we need to resolve and cannot live with. Check out the PB2 coverage reports in the clustra ~bteam directory. One day, mysql-next will have 45% code coverage and mysql-next-bugfixing will have 55%. The next day, it is going to be the other way around.

MTRv1 is also unable to shut down the server cleanly and uses kill -9:

Stopping All Servers
mtr_ping_with_timeout(): At least one server is alive.
mysql-test-run: WARNING: Forcing kill of process 8735

So, one potential hack is to do it like this:

perl lib/v1/mysql-test-run.pl --start-and-exit 1st
perl lib/v1/mysql-test-run.pl --extern select --user=root --socket=var/tmp/master.sock
../client/mysqladmin --socket=var/tmp/master.sock --verbose -uroot shutdown
[3 Sep 2009 7:50] Bjørn Munch
I didn't get any further on this because I couldn't figure out how to observe any problems, or alternatively, it didn't fail for me. Thanks for the additional clarifications, I will pick this up again soon.
[3 Sep 2009 9:27] Kristian Nielsen
My guess (without looking deeply) is that this is caused by the way MTR2 tends
to stop mysqld with kill -9.

As I remember, there are several situations where kill -9 is used to stop the
mysqld server. This should be removed, as it causes several problems. One of
them that is relevant here is that this prevents gcov from writing out the
coverage data at exit, which would explain missing coverage data.

(Another problem is that it prevents memory leak checking with Valgrind).

I believe there are already some fixes along these lines in the MariaDB tree,
but would need to check in detail to see if they fix this particular problem.
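
The mechanism suspected above can be illustrated outside MySQL. What follows is a minimal Python sketch, not MTR or mysqld code: a child process registers an exit handler, which stands in for the coverage dump a gcov-instrumented binary performs at exit, and is then stopped once with SIGTERM and once with SIGKILL. The names CHILD, exit_handler_ran and the marker file are hypothetical, and the sketch assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Child program: the atexit handler stands in for gcov's dump-at-exit step.
CHILD = """
import atexit, signal, sys, time

marker = sys.argv[1]

def dump():
    with open(marker, "w") as f:
        f.write("flushed")

atexit.register(dump)
# Turn SIGTERM into a normal exit so that exit handlers get a chance to run.
signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(0))
print("ready", flush=True)
while True:
    time.sleep(0.05)
"""


def exit_handler_ran(sig):
    """Start the child, stop it with `sig`, report whether its exit handler ran."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "marker")
        with subprocess.Popen([sys.executable, "-c", CHILD, marker],
                              stdout=subprocess.PIPE, text=True) as proc:
            proc.stdout.readline()  # wait until the child has installed its handlers
            proc.send_signal(sig)
            proc.wait()
        return os.path.exists(marker)


if __name__ == "__main__":
    print("SIGTERM:", exit_handler_ran(signal.SIGTERM))
    print("SIGKILL:", exit_handler_ran(signal.SIGKILL))
```

With SIGTERM the marker file is written; with SIGKILL it is not, because the kernel ends the process before any user-space exit code can run. If mysqld is stopped this way, the same would apply to its coverage counters.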
[10 Sep 2009 11:23] Bjørn Munch
bzr commit is currently broken for me, so I quote the one-line diff, which I hope solves the problem. You may want to check it out in your environment:

--------------
=== modified file 'mysql-test/mysql-test-run.pl'
--- mysql-test/mysql-test-run.pl        2009-09-02 21:29:11 +0000
+++ mysql-test/mysql-test-run.pl        2009-09-10 11:03:39 +0000
@@ -737,6 +737,7 @@
     }
     elsif ($line eq 'BYE'){
       mtr_report("Server said BYE");
+      stop_all_servers();
       exit(0);
     }
     else {

--------------
[10 Sep 2009 11:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/82927

2836 Bjorn Munch	2009-09-10
      Bug #46327 MTR2 prevents gcov data accumulation
      Call stop_all_servers() before doing exit
[10 Sep 2009 14:10] Magnus Blåudd
Looks OK. That way the worker child should try to shut down any servers still alive, then wait for --shutdown-timeout and, if that expires, kill them with -9.

Good find Bjorn.
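
The sequence described in this comment (ask politely, wait for a timeout, force-kill only as a last resort) can be sketched in a few lines of Python. This is an illustration under assumptions, not the MTR implementation: stop_server is a hypothetical helper, and proc.terminate() stands in for whatever shutdown request the real tool sends to the server.

```python
import subprocess
import sys


def stop_server(proc, shutdown_timeout):
    """Ask `proc` to exit, wait up to `shutdown_timeout` seconds, then force-kill.

    Returns "clean" if the process went away within the timeout,
    "killed" if it had to be force-killed.
    """
    proc.terminate()  # polite request (SIGTERM on POSIX)
    try:
        proc.wait(timeout=shutdown_timeout)
        return "clean"
    except subprocess.TimeoutExpired:
        proc.kill()  # last resort: exit handlers in the child are skipped
        proc.wait()  # reap the child so no zombie is left behind
        return "killed"


if __name__ == "__main__":
    # A stand-in "server" that just sleeps.
    server = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    print(stop_server(server, shutdown_timeout=5))
```

Only the "killed" branch loses exit-time work such as coverage output, so a generous timeout trades a slower test run for more reliable data.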
[10 Sep 2009 18:36] Ingo Strüwing
I agree with the patch, but unfortunately it does not help. :-(
I get the same results as above.
[12 Sep 2009 13:12] Philip Stoev
From working with the RQG, I have noticed that one needs to actually wait for mysqld to stop. Issuing a shutdown command does not mean that the process has exited. The process can continue to run for a while, which in turn may cause it to be improperly killed by some other script, or the processing of gcov data may start prematurely.
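
As a rough illustration of the point above, a script that issues a shutdown command could poll until the server process is really gone before it touches the coverage files. The Python sketch below assumes a POSIX system and a known server pid; process_exists and wait_for_exit are hypothetical helper names, not part of MTR or the RQG.

```python
import os
import time


def process_exists(pid):
    """Return True if a process with this pid currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_for_exit(pid, timeout, poll_interval=0.1):
    """Poll until `pid` is gone; return False if it is still there after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while process_exists(pid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

A caller would start processing gcov data only after wait_for_exit() returns True. Note that an exited but not yet reaped child of the calling process still counts as existing, so for direct children waiting on the process handle is the more reliable choice.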
[14 Sep 2009 17:39] Patrick Crews
In case this helps, invoking mysql-test-run as follows will allow proper capture of gcov data:

perl lib/v1/mysql-test-run.pl --start-and-exit 1st ; \
      perl lib/v1/mysql-test-run.pl --extern select --user=root --socket=var/tmp/master.sock ; \
      ../client/mysqladmin --socket=var/tmp/master.sock --verbose -uroot shutdown

You can use sql/sql_parse.cc's dispatch_command() counts to tell if gcov data is properly captured (should be 2359 for select.test).
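
One way to automate this check is to read the execution count from gcov's annotated source output, where each line has the form "count: line number: source text". The Python sketch below is an assumption-laden illustration: execution_counts is a hypothetical helper, the SAMPLE excerpt is made up (only the count 2359 comes from the comment above), and which source line carries the function's call count can depend on the gcov version.

```python
def execution_counts(gcov_text, needle):
    """Return (line_number, count) pairs for executable source lines containing `needle`."""
    hits = []
    for line in gcov_text.splitlines():
        fields = line.split(":", 2)  # "count: lineno: source text"
        if len(fields) != 3 or needle not in fields[2]:
            continue
        lineno = fields[1].strip()
        if not lineno.isdigit() or int(lineno) == 0:
            continue  # not an annotated source line (line 0 holds header tags)
        count = fields[0].strip().rstrip("*")
        if count == "-":
            continue  # non-executable line
        # Non-numeric markers such as "#####" mean the line was never executed.
        hits.append((int(lineno), int(count) if count.isdigit() else 0))
    return hits


# Made-up excerpt in gcov's annotated-source layout, for illustration only.
SAMPLE = """\
        -:    0:Source:sql_parse.cc
        -: 1040:
     2359: 1041:bool dispatch_command(...)
    #####: 1200:    unreachable();
"""

if __name__ == "__main__":
    print(execution_counts(SAMPLE, "dispatch_command("))
```

A count of zero, or a count that does not change between runs, would suggest that the server exited without writing its coverage data.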
[30 Sep 2009 13:37] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85211

2837 Bjorn Munch	2009-09-30
      Bug #46327 MTR2 prevents gcov data accumulation
      mysqladmin fails on Linux in 6.0 without --character-sets-dir
      Also added timeout for server shutdown, hope this will solve it
[2 Oct 2009 19:09] Ingo Strüwing
Have you ever heard of McMurphy? Some days ago I deleted my old tree with which I tested the gcov issue. There wasn't progress for some time. Now I branched a new tree, and the problem is not repeatable any more. :(

Hence I cannot test whether your patch makes a difference.

Nevertheless the changes look ok. I vote for push. You may then close the bug for now. If the problem reappears, we can talk about reopening it, or creating a new one.
[6 Oct 2009 8:36] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85838

2839 Bjorn Munch	2009-10-06
      Bug #46327 MTR2 prevents gcov data accumulation
      mysqladmin fails on Linux in 6.0 without --character-sets-dir
      Also added timeout for server shutdown, hope this will solve it
[6 Oct 2009 13:19] Bjørn Munch
Pushed to 5.1-mtr, trunk-mtr, next-mr-mtr, 6.0-codebase-mtr
[22 Oct 2009 20:17] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091022201524-0efl2fzebfuuf0vk) (version source revid:bjorn.munch@sun.com-20091006090233-lcu28yngy9i2sy9k) (merge vers: 6.0.14-alpha) (pib:13)
[22 Oct 2009 20:18] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091022201318-jfvtrzd6lb07cwp5) (version source revid:bjorn.munch@sun.com-20091006085829-j8d20ow1ywmv8rgx) (merge vers: 5.4.5-beta) (pib:13)
[22 Oct 2009 23:15] Paul DuBois
Test suite change. No changelog entry needed.

Setting report to NDI pending push into 5.1.x.
[23 Oct 2009 7:33] Bugs System
Pushed into 5.1.41 (revid:bjorn.munch@sun.com-20091021073307-ummbh6668hvfxqjv) (version source revid:bjorn.munch@sun.com-20091021073307-ummbh6668hvfxqjv) (merge vers: 5.1.41) (pib:13)
[23 Oct 2009 15:11] Paul DuBois
Test suite change. No changelog entry needed.
[18 Dec 2009 10:27] Bugs System
Pushed into 5.1.41-ndb-7.1.0 (revid:jonas@mysql.com-20091218102229-64tk47xonu3dv6r6) (version source revid:jonas@mysql.com-20091218095730-26gwjidfsdw45dto) (merge vers: 5.1.41-ndb-7.1.0) (pib:15)
[18 Dec 2009 10:43] Bugs System
Pushed into 5.1.41-ndb-6.2.19 (revid:jonas@mysql.com-20091218100224-vtzr0fahhsuhjsmt) (version source revid:jonas@mysql.com-20091217101452-qwzyaig50w74xmye) (merge vers: 5.1.41-ndb-6.2.19) (pib:15)
[18 Dec 2009 10:59] Bugs System
Pushed into 5.1.41-ndb-6.3.31 (revid:jonas@mysql.com-20091218100616-75d9tek96o6ob6k0) (version source revid:jonas@mysql.com-20091217154335-290no45qdins5bwo) (merge vers: 5.1.41-ndb-6.3.31) (pib:15)
[18 Dec 2009 11:13] Bugs System
Pushed into 5.1.41-ndb-7.0.11 (revid:jonas@mysql.com-20091218101303-ga32mrnr15jsa606) (version source revid:jonas@mysql.com-20091218064304-ezreonykd9f4kelk) (merge vers: 5.1.41-ndb-7.0.11) (pib:15)