| Bug #11122 | Server won't always start when cold booting after a crash | ||
|---|---|---|---|
| Submitted: | 7 Jun 2005 0:22 | Modified: | 17 Oct 2008 17:04 |
| Reporter: | David Zafman | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server | Severity: | S2 (Serious) |
| Version: | 4.1.9, 5.0.60 | OS: | Linux (Linux FC3) |
| Assigned to: | Chad MILLER | CPU Architecture: | Any |
[8 Jun 2005 5:29]
Jorge del Conde
Checked the 4.1bk code. Patch makes sense
[9 Jun 2005 1:58]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/25792
[10 Jun 2005 4:57]
Jim Winstead
Fixed in 4.1.13 and 5.0.8.
[28 Jun 2005 14:29]
Jon Stephens
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.
If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information
about accessing the source trees is available at
http://www.mysql.com/doc/en/Installing_source_tree.html
Additional info:
Documented in 4.1.13 and 5.0.8 change history; closed.
[4 Jul 2008 4:46]
Sean Pringle
Hi
This overall problem is not quite fixed it seems:
If mysqld_safe itself (instead of the grep) gets the same PID as the content of a stale mysqld PID file, it aborts in the same place with the same "A mysqld process already exists" error.
Current code:
#
# If there exists an old pid file, check if the daemon is already running
# Note: The switches to 'ps' may depend on your operating system
if test -f $pid_file
then
  PID=`cat $pid_file`
  if /bin/kill -0 $PID > /dev/null 2> /dev/null
  then
    if /bin/ps p $PID | grep -v grep | grep $MYSQLD > /dev/null
    then    # The pid contains a mysqld process
      echo "A mysqld process already exists"
      echo "A mysqld process already exists at " `date` >> $err_log
      exit 1
    fi
  fi
  rm -f $pid_file
...
Above, $MYSQLD can default to just 'mysqld'. An additional 'grep -v mysqld_safe' fixes the problem as before.
This is hard to reproduce, but can be simulated by adding the following line to mysqld_safe just before the above code block:
echo -n $$ >$pid_file # simulate stale pid file
Presumably mysqld_safe getting the same PID as the previous crashed mysqld is unlikely, but if the system reboots and MySQL is configured to start at boot time, the general PID range will be similar each time, increasing the chances.
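The false positive described in this comment can be reproduced without touching a real server by feeding the same grep chain a canned line of `ps` output. This is only a sketch: the `ps_output` text below is a hypothetical sample, not output captured from the reporter's machine.

```shell
#!/bin/sh
# Hypothetical sample of what `ps p $PID` might print when the PID stored in
# the stale pid file now belongs to mysqld_safe itself.
ps_output='  PID TTY      STAT   TIME COMMAND
 2345 ?        S      0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql'

MYSQLD=mysqld

# The check as quoted above: "mysqld_safe" contains "mysqld", so it matches.
if printf '%s\n' "$ps_output" | grep -v grep | grep "$MYSQLD" > /dev/null
then current=match
else current=no-match
fi

# The same check with the additional filter suggested in this comment.
if printf '%s\n' "$ps_output" | grep -v grep | grep -v mysqld_safe | grep "$MYSQLD" > /dev/null
then filtered=match
else filtered=no-match
fi

echo "current check:            $current"
echo "with grep -v mysqld_safe: $filtered"
```

With this sample input the current check reports a match (the spurious "A mysqld process already exists" case), while the filtered check does not.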
[5 Aug 2008 15:45]
Chad MILLER
The real bug is mysql doesn't put its pid files in a location that is automatically cleaned by the system at boot. Correcting that may be hard, but we should seriously consider it.
[5 Aug 2008 16:12]
Chad MILLER
Gosh, so much will break if the path or any mysqld arguments contain the contiguous letters 'g', 'r', 'e', and 'p'. :\
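The opposite failure mentioned here can be sketched the same way. The install path below is invented for illustration; it only needs to contain the letters "grep".

```shell
#!/bin/sh
MYSQLD=mysqld

# Hypothetical ps line: a genuinely running server installed under a
# directory whose name happens to contain "grep".
ps_line=' 2345 ?  Sl  0:01 /opt/grepping/bin/mysqld --basedir=/opt/grepping'

# `grep -v grep` discards the whole line before `grep $MYSQLD` ever sees it.
if printf '%s\n' "$ps_line" | grep -v grep | grep "$MYSQLD" > /dev/null
then result=match
else result=no-match
fi

echo "running server detected: $result"
```

Here the running server goes undetected, so the quoted script would fall through to `rm -f $pid_file` and carry on starting.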
[5 Aug 2008 18:01]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/50949

2653 Chad MILLER 2008-08-05
Bug#11122: Server won't always start when cold-booting after a crash

The grep expression that finds a running "mysqld" program fails if "mysqld_safe" is running with the same PID.

Now, match "mysqld" at the end of a line or before a space only. This also has the effect of the matcher expression never matching itself, as the metacharacters don't describe themselves.

Additionally, some text to search could be truncated if very long.
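The matching rule in this commit message ("mysqld" at the end of a line or before a space) can be approximated as below. This is a sketch under assumptions: the exact expression in the committed patch is not shown in this report, so the `pattern` variable, the `matches` helper and the sample `ps` lines are all illustrative.

```shell
#!/bin/sh
# Assumed pattern: "mysqld" followed by a space or by end of line.
pattern='mysqld( |$)'

# Succeeds if the given line matches the pattern.
matches() { printf '%s\n' "$1" | grep -E "$pattern" > /dev/null; }

# Hypothetical ps lines for the three processes that could own the PID.
server=' 2345 ?  Sl  0:01 /usr/libexec/mysqld --basedir=/usr'
wrapper=' 2345 ?  S   0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql'
matcher=" 2345 ?  S   0:00 grep -E $pattern"

if matches "$server";  then r_server=match;  else r_server=no-match;  fi
if matches "$wrapper"; then r_wrapper=match; else r_wrapper=no-match; fi
if matches "$matcher"; then r_matcher=match; else r_matcher=no-match; fi

echo "mysqld:      $r_server"
echo "mysqld_safe: $r_wrapper"
echo "grep itself: $r_matcher"
```

Only the server line matches. In "mysqld_safe" the word is followed by "_", and in the grep command line it is followed by "(", which is what the message means by the metacharacters not describing themselves.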
[2 Oct 2008 16:29]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/55119

2653 Chad MILLER 2008-10-02
Bug#11122: Server won't always start when cold-booting after a crash

The grep expression that finds a running "mysqld" program fails if "mysqld_safe" is running with the same PID.

Now, excise "ps" output that has the word " grep" or "mysqld_safe" in it, to be a little more certain that the matched process is not a false positive hit. This will fail when the path to mysqld contains either of those two names, which should be acceptable.

Additionally, some text to search could be truncated if very long. Expand the number of lines "ps" emits.
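The filtering described in this second commit message might look roughly like the following. It is a sketch based only on the wording above, not a copy of the committed code; the `looks_like_mysqld` and `check` helpers and the sample lines are hypothetical.

```shell
#!/bin/sh
MYSQLD=mysqld

# Reads `ps` output on stdin. Succeeds only if some line survives both
# exclusions (" grep" and "mysqld_safe") and still mentions $MYSQLD.
looks_like_mysqld() {
  grep -v " grep" | grep -v mysqld_safe | grep -- "$MYSQLD" > /dev/null
}

# Prints "match" or "no-match" for a single line of sample ps output.
check() {
  if printf '%s\n' "$1" | looks_like_mysqld
  then echo match
  else echo no-match
  fi
}

server=$(check ' 2345 ?  Sl  0:01 /usr/libexec/mysqld --basedir=/usr')
wrapper=$(check ' 2345 ?  S   0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql')
grepper=$(check ' 2345 ?  S   0:00 grep -- mysqld')

echo "mysqld:      $server"
echo "mysqld_safe: $wrapper"
echo "grep itself: $grepper"
```

In mysqld_safe the helper would be fed by `ps` for the PID read from the pid file, with whatever switches the platform needs to avoid truncating long command lines; as the script's own comment notes, those switches depend on the operating system.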
[9 Oct 2008 17:26]
Bugs System
Pushed into 5.0.72 (revid:chad@mysql.com-20081002162552-cw77j2cpzw23qycy) (version source revid:chad@mysql.com-20081006135227-u2s7w953ysaqjhda) (pib:4)
[9 Oct 2008 17:35]
Bugs System
Pushed into 5.1.30 (revid:chad@mysql.com-20081002162552-cw77j2cpzw23qycy) (version source revid:mats@sun.com-20081008113713-2vxny72m5w1tywoi) (pib:4)
[14 Oct 2008 18:20]
Paul DuBois
Noted in 5.0.72, 5.1.30 changelogs. Resetting report to NDI pending push into 6.0.x.
[15 Oct 2008 14:54]
Paul DuBois
Correction: Noted in 5.1.29 changelog, not 5.1.30.
[17 Oct 2008 16:46]
Bugs System
Pushed into 6.0.8-alpha (revid:chad@mysql.com-20081002162552-cw77j2cpzw23qycy) (version source revid:chad@mysql.com-20081006135653-hlwefkm6dvvqm3z6) (pib:5)
[17 Oct 2008 17:04]
Paul DuBois
Noted in 6.0.8 changelog.
[28 Oct 2008 21:05]
Bugs System
Pushed into 5.1.29-ndb-6.2.17 (revid:chad@mysql.com-20081002162552-cw77j2cpzw23qycy) (version source revid:tomas.ulin@sun.com-20081028140209-u4emkk1xphi5tkfb) (pib:5)
[28 Oct 2008 22:24]
Bugs System
Pushed into 5.1.29-ndb-6.3.19 (revid:chad@mysql.com-20081002162552-cw77j2cpzw23qycy) (version source revid:tomas.ulin@sun.com-20081028194045-0353yg8cvd2c7dd1) (pib:5)
[1 Nov 2008 9:48]
Bugs System
Pushed into 5.1.29-ndb-6.4.0 (revid:chad@mysql.com-20081002162552-cw77j2cpzw23qycy) (version source revid:jonas@mysql.com-20081101082305-qx5a1bj0z7i8ueys) (pib:5)

Description:
Normal system services put their pid files in /var/run or a directory under /var/run. The /etc/rc.d/rc.sysinit script removes all pid files on a cold boot of a system. But mysql doesn't put its pid file there, nor does it provide a mechanism to remove it on a cold boot as far as I can tell. So, it is possible for the database to not start up during a machine cold boot.

The following message was seen in my /var/log/mysql/mysqld.err immediately following a system crash:

A mysqld process already exists at Sun Jun 5 22:10:32 PDT 2005

The problem is that if the pid of the "grep mysqld" happens to match the pid of the mysqld that was running before the crash, the following code will believe that mysql is already running.

#
# If there exists an old pid file, check if the daemon is already running
# Note: The switches to 'ps' may depend on your operating system
if test -f $pid_file
then
  PID=`cat $pid_file`
  if /bin/kill -0 $PID > /dev/null 2> /dev/null
  then
    if /bin/ps p $PID | grep mysqld > /dev/null
    then    # The pid contains a mysqld process
      echo "A mysqld process already exists"
      echo "A mysqld process already exists at " `date` >> $err_log
      exit 1
    fi
  fi

How to repeat:
Crash your machine in a loop until the database doesn't come up. I'm assuming that the mysql service is enabled for the default run-level. I was doing "ssh root@machine reboot -f -n" to simulate a crash. In between, check whether the database has started or not.

Suggested fix:
A kludgy fix would be to make sure that "grep" is not part of the command string, so that the grep doesn't find itself. I really think that all mysql pid files should be in /var/run, so that the normal system mechanism can properly clean them on cold boot.
--- /usr/bin/mysqld_safe 2005-01-12 19:35:05.000000000 -0800
+++ /usr/bin/mysqld_safe.new 2005-06-06 17:17:50.092664000 -0700
@@ -261,7 +261,7 @@
   PID=`cat $pid_file`
   if /bin/kill -0 $PID > /dev/null 2> /dev/null
   then
-    if /bin/ps p $PID | grep mysqld > /dev/null
+    if /bin/ps p $PID | grep mysqld | grep -v grep > /dev/null
     then    # The pid contains a mysqld process
       echo "A mysqld process already exists"
       echo "A mysqld process already exists at " `date` >> $err_log