Bug #28742 mysql-test-run is very slow on "Stopping All Servers" step
Submitted: 29 May 2007 10:30 Modified: 22 Jun 2007 18:42
Reporter: Alexander Barkov Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Tests Severity:S3 (Non-critical)
Version:5.1-BK OS:Linux (Fedora Core 6)
Assigned to: Magnus Blåudd CPU Architecture:Any

[29 May 2007 10:30] Alexander Barkov
Description:
"mysql-test-run" spends about 15 seconds on the "Stopping All Servers" step.

It's very inconvenience when debugging a single test.

How to repeat:
Run:

time ./mysql-test-run testname

For example,

$time ./mysql-test-run ctype_ujis
Logging: ./mysql-test-run ctype_ujis
MySQL Version 5.1.18
Using binlog format 'mixed'
Using ndbcluster when necessary, mysqld supports it
Setting mysqld to support SSL connections
Binaries are debug compiled
Using MTR_BUILD_THREAD      = 0
Using MASTER_MYPORT         = 9306
Using MASTER_MYPORT1        = 9307
Using SLAVE_MYPORT          = 9308
Using SLAVE_MYPORT1         = 9309
Using SLAVE_MYPORT2         = 9310
Using NDBCLUSTER_PORT       = 9310
Using NDBCLUSTER_PORT_SLAVE = 9311
Using IM_PORT               = 9312
Using IM_MYSQLD1_PORT       = 9313
Using IM_MYSQLD2_PORT       = 9314
Killing Possible Leftover Processes
Removing Stale Files
Creating Directories
Installing Master Database
Installing Master Database
=======================================================
Starting Tests in the 'main' suite

TEST                           RESULT         TIME (ms)
-------------------------------------------------------

ctype_ujis                     [ pass ]            344
-------------------------------------------------------
Stopping All Servers
All 1 tests were successful.
The servers where restarted 1 times
Spent 0.344 seconds actually executing testcases

real    0m14.928s
user    0m0.228s
sys     0m0.208s

As it's seen from this output, it spend only 0.344 seconds
for the actual tests, and about 14 seconds for extra activity.
Most of the extra time is spent on  "Stopping All Servers" step.

Suggested fix:
Make "Stopping All Servers" step faster :)
[29 May 2007 11:14] Valeriy Kravchuk
Thank you for a bug report. Verified just as described with latest 5.1.20-BK on Linux:

openxs@suse:~/dbs/5.1/mysql-test> time ./mysql-test-run.pl ctype_ujis
Logging: ./mysql-test-run.pl ctype_ujis
070520 20:45:17 [Warning] Server variable data_file_path of plugin InnoDB was fo
rced to be read-only: string variable without update_func and PLUGIN_VAR_MEMALLO
C flag
070520 20:45:17 [Warning] Server variable data_home_dir of plugin InnoDB was for
ced to be read-only: string variable without update_func and PLUGIN_VAR_MEMALLOC
 flag
070520 20:45:17 [Warning] Server variable flush_method of plugin InnoDB was forc
ed to be read-only: string variable without update_func and PLUGIN_VAR_MEMALLOC
flag
070520 20:45:17 [Warning] Server variable log_arch_dir of plugin InnoDB was forc
ed to be read-only: string variable without update_func and PLUGIN_VAR_MEMALLOC
flag
070520 20:45:17 [Warning] Server variable log_group_home_dir of plugin InnoDB wa
s forced to be read-only: string variable without update_func and PLUGIN_VAR_MEM
ALLOC flag
070520 20:45:17 [Note] /home/openxs/dbs/5.1/libexec/mysqld: Shutdown complete

MySQL Version 5.1.20
Using binlog format 'mixed'
Using ndbcluster when necessary, mysqld supports it
Skipping SSL, mysqld not compiled with SSL
Binaries are debug compiled
mysql-test-run: WARNING: Could not find all required ndb binaries, all ndb tests
 will fail, use --skip-ndbcluster to skip testing it.
Using MTR_BUILD_THREAD      = 0
Using MASTER_MYPORT         = 9306
Using MASTER_MYPORT1        = 9307
Using SLAVE_MYPORT          = 9308
Using SLAVE_MYPORT1         = 9309
Using SLAVE_MYPORT2         = 9310
Using NDBCLUSTER_PORT       = 9310
Using NDBCLUSTER_PORT_SLAVE = 9311
Using IM_PORT               = 9312
Using IM_MYSQLD1_PORT       = 9313
Using IM_MYSQLD2_PORT       = 9314
Killing Possible Leftover Processes
Removing Stale Files
Creating Directories
Installing Master Database
Installing Master Database
=======================================================
Starting Tests in the 'main' suite

TEST                           RESULT         TIME (ms)
-------------------------------------------------------

ctype_ujis                     [ pass ]           2047
-------------------------------------------------------
Stopping All Servers
All 1 tests were successful.
The servers were restarted 1 times
Spent 2.047 seconds actually executing testcases

real    0m20.489s
user    0m0.388s
sys     0m1.655s

So, I had 18 seconds overhead...
[14 Jun 2007 8:53] Magnus Blåudd
Although I can agree that the shutdown is slow this actually indicates a problem with shutdown of mysqld. mysql-test-run.pl first uses "mysqladmin shutdown" and then waits for the mysqld to close it's socket. At that time the mysqld should have shutdown properly and "waitpid" will return the pid of the mysqld. In about 30% the waitpid does not return the mysqlds pid -> it hasn't exited yet. A "kill with signal loop"  will start and first use SIGTERM 10 times, before starting with signal SIGKILL which will kill it. Although it's actually overkill to do kill(SIGTERM) 10 times the trace file shows that the signal handler thread of mysqld hangs in "calling my_thread_end" and thus the mysqld does not return.

T@5    : | quit: Error messages freed
T@5    : | quit: got thread count lock
T@5    : | quit: done with cleanup
T@2    : | quit: signal_handler: calling my_thread_end()

Suggest changing the "kill with signal loop" to be more hard and resort to SIGKILL immediately(since mysqladmin has been used to shutdown gracefully a SIGTERm will be ignored anyway), but also print a warning about this so the problem is noticed.
[20 Jun 2007 9:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29169

ChangeSet@1.2531, 2007-06-20 11:47:55+02:00, msvensson@pilot.(none) +3 -0
  Bug#28742 mysql-test-run is very slow on "Stopping All Servers" step
   - Improve shutdown algorithm 
   - Wait up to 5 seconds for processes to exit after their port is free
[22 Jun 2007 18:07] Bugs System
Pushed into 5.1.20-beta
[22 Jun 2007 18:09] Bugs System
Pushed into 5.0.46
[22 Jun 2007 18:42] Paul DuBois
Test suite change. No changelog entry needed.