Bug #37944 random tests fail in pb with message "failed to start mysqld"
Submitted: 7 Jul 2008 19:16
Modified: 19 Feb 2009 15:45
Reporter: Sven Sandberg
Status: Closed
Category: MySQL Server: Tests
Severity: S7 (Test Cases)
Version: 5.1+
OS: Any
Assigned to: Magnus Blåudd
CPU Architecture: Any
Tags: 51rpl, failed to start mysqld, failed to start server, pushbuild, random, sporadic, test failure

[7 Jul 2008 19:16] Sven Sandberg
Description:
This is probably the most common error in pushbuild:

  failed to start mysqld.1

or

  failed to start mysqld.2

It is problematic because there is no reliable way to reproduce or debug it, and it occurs quite often. The failures seem to be uniformly distributed over all tests and hosts. (Among the suites, it is overrepresented in ps_stm_threadpool, but that's probably a separate problem, related to BUG#36443.)

I recently managed to reproduce it by putting a heavy I/O load on my laptop. That suggests the problem may be that mysqld is slow to start under load, and mtr times out waiting for it. Indeed, mtr gives this error message if mysqld has not created its pid file within 30 seconds of being started.
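
To illustrate the kind of wait described above, here is a minimal sketch in Python. It is not mtr's actual code (mtr is written in Perl): the function name wait_for_pid_file and the polling interval are assumptions for illustration, and only the 30-second default comes from this report.

```python
import os
import time


def wait_for_pid_file(pid_file, timeout=30.0, poll_interval=0.1):
    """Return True if pid_file exists before `timeout` seconds elapse.

    Hypothetical helper, not part of mtr. The 30-second default mirrors
    the timeout described in this report.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(pid_file):
            return True   # the server has written its pid file
        if time.monotonic() >= deadline:
            return False  # caller would report "failed to start mysqld"
        time.sleep(poll_interval)
```

On a heavily loaded host, a server that needs longer than the timeout to write its pid file is indistinguishable here from one that never starts, which matches the symptom.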

How to repeat:
E.g., the last six months in ps_row: http://tinyurl.com/5wcffj
There were 17 failures per day on average. And that's just one suite.

Suggested fix:
I suggest we increase the timeout drastically, maybe to 5 minutes or so. That should remove most spurious failures, and would have the following consequences:

- For cases when the server is slow, we avoid spurious failures.

- For cases when the server has crashed, the host sleeps for a longer time before the failure is reported.

- For cases when the server has crashed, we get much higher confidence that it really is a server crash and not a spurious timeout due to high load on the host.
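
The three consequences can be sketched as a small model, again in Python. This is a simplification for illustration only: the function startup_outcome and its labels are hypothetical, and a crashed server is modelled as one that never writes its pid file.

```python
def startup_outcome(startup_seconds, timeout):
    """Model what the test runner observes for a given timeout.

    startup_seconds: seconds until the pid file appears, or None if the
    server crashed and never writes one (modelling assumption).
    Returns (result, seconds_waited).
    """
    if startup_seconds is None:
        # Crashed server: the runner waits out the whole timeout.
        return ("failed", timeout)
    if startup_seconds <= timeout:
        return ("started", startup_seconds)
    # Slow but healthy server: reported as a failure anyway.
    return ("spurious failure", timeout)


# A server that needs 120 seconds under heavy load:
print(startup_outcome(120, 30))    # ('spurious failure', 30)
print(startup_outcome(120, 300))   # ('started', 120)
# A crashed server costs more waiting time with the longer timeout:
print(startup_outcome(None, 30))   # ('failed', 30)
print(startup_outcome(None, 300))  # ('failed', 300)
```

The 120-second startup time is an invented figure; the 30-second and 5-minute timeouts are the current and suggested values from this report.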
[7 Jul 2008 19:21] Sven Sandberg
My mysql-test/var directory after rpl_flushlog_loop failed

Attachment: var.tar.bz2 (application/x-bzip, text), 264.47 KiB.

[7 Jul 2008 19:22] Sven Sandberg
The above attachment contains the mysql-test/var directory, saved right after I triggered the error with a high workload on my laptop. The workload was the C program associated with BUG#36618.
[5 Sep 2008 12:50] Magnus Blåudd
The startup timeout has been increased, and mysqld's error log will be displayed.
[19 Feb 2009 15:45] Paul DuBois
Test suite changes. No changelog entry needed.