Bug #31785 mysql.server stop does not spot server gone away
Submitted: 23 Oct 2007 11:22
Modified: 6 Aug 2009 23:49
Reporter: Axel Schwenke
Status: Closed
Category: MySQL Server
Severity: S3 (Non-critical)
Version: 5.0.52, 5.1.23
OS: Any
Assigned to: Daniel Fischer
CPU Architecture: Any

[23 Oct 2007 11:22] Axel Schwenke
Description:
If the MySQL server is killed without its pid file being removed, a subsequent 'mysql.server stop' waits 900 seconds before giving up.

This regularly bites users running a DRBD/Heartbeat cluster. They usually test the cluster by manually killing mysqld_safe and mysqld with SIGKILL. When Heartbeat then tries to shut down the 'mysql' resource, it hangs for a long time and eventually kills the mysql resource script after a timeout.

How to repeat:
Start MySQL using the mysql.server script:

~ $/usr/local/mysql/current/share/mysql/mysql.server start
Starting MySQL                                                       done

Now manually kill mysqld_safe and mysqld with kill -9 and then try to stop the MySQL server:

~ $/usr/local/mysql/current/share/mysql/mysql.server status
MySQL running (12709)                                                done
~ $kill -9 12680
~ $kill -9 12709
~ $/usr/local/mysql/current/share/mysql/mysql.server stop
Shutting down MySQL/usr/local/mysql/current/share/mysql/mysql.server: line 336: kill: (12709) - No such process
..........
(interrupted with CTRL+C)

BTW, the 'status' method of mysql.server diagnoses the situation correctly:

~ $/usr/local/mysql/current/share/mysql/mysql.server status
MySQL is not running, but PID file exists                            failed
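
The distinction that 'status' draws here (running, not running with a leftover pid file, no pid file at all) can be sketched roughly as follows. This is only an illustration under assumptions, not the code from mysql.server: the function name pid_file_state and its output strings are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of mysql.server): classify what a pid file says.
# Prints "running", "stale", or "missing".
pid_file_state() {
  pid_file=$1

  # No pid file, or an empty one: nothing to check.
  if [ ! -s "$pid_file" ]; then
    echo "missing"
    return 0
  fi

  pid=$(cat "$pid_file")

  # kill -0 delivers no signal; it only reports whether the process exists
  # and may be signalled by the caller. A process owned by another user
  # would also fail this test, so run the check with sufficient privileges.
  if kill -0 "$pid" 2>/dev/null; then
    echo "running"
  else
    echo "stale"
  fi
}
```

A caller could branch on the printed word, for example treating "stale" as the case where the pid file has to be cleaned up by hand.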

Suggested fix:
The mysql.server script should first check if a process with the pid from the pid file exists before it tries to stop it:

--- mysql.orig  2007-10-23 12:30:25.000000000 +0200
+++ mysql       2007-10-23 12:43:58.000000000 +0200
@@ -332,10 +332,17 @@
     if test -s "$pid_file"
     then
       mysqlmanager_pid=`cat $pid_file`
-      echo $echo_n "Shutting down MySQL"
-      kill $mysqlmanager_pid
-      # mysqlmanager should remove the pid_file when it exits, so wait for it.
-      wait_for_pid removed; return_value=$?
+
+      if (kill -0 $mysqlmanager_pid 2>/dev/null)
+      then
+        echo $echo_n "Shutting down MySQL"
+        kill $mysqlmanager_pid
+        # mysqlmanager should remove the pid_file when it exits, so wait for it.
+        wait_for_pid removed; return_value=$?
+      else
+        log_failure_msg "MySQL manager or server process #$mysqlmanager_pid is not running!"
+        rm $pid_file
+      fi
 
       # delete lock for RedHat / SuSE
       if test -f $lock_dir

After applying this patch:

~ $/usr/local/mysql/current/share/mysql/mysql.server stop
MySQL manager or server process #12709 is not running!               failed
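
A complementary idea is to make the wait itself give up as soon as the process is gone, instead of sitting out the full timeout. The sketch below illustrates that under assumptions: wait_for_removal is a hypothetical function, not the wait_for_pid from mysql.server, and the 900-second default merely mirrors the figure from the description above.

```shell
#!/bin/sh
# Hypothetical helper (not the wait_for_pid from mysql.server).
# Usage: wait_for_removal PID_FILE PID [TIMEOUT_SECONDS]
# Returns 0 if the pid file was removed,
#         1 if the process vanished but left the pid file behind,
#         2 if the timeout expired.
wait_for_removal() {
  pid_file=$1
  pid=$2
  timeout=${3:-900}   # assumed default, taken from the 900 seconds in the report
  elapsed=0

  while [ "$elapsed" -lt "$timeout" ]; do
    # Normal shutdown: the server removed its own pid file.
    [ -e "$pid_file" ] || return 0

    # Process gone: re-check the file once in case it was removed
    # between the two tests, then report a stale pid file.
    if ! kill -0 "$pid" 2>/dev/null; then
      [ -e "$pid_file" ] || return 0
      return 1
    fi

    sleep 1
    elapsed=$((elapsed + 1))
  done

  return 2
}
```

With a helper like this, a stop routine could distinguish a clean shutdown from a server that was already dead, and in the second case remove the pid file immediately.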
[1 Feb 2008 17:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/41576

ChangeSet@1.2574, 2008-02-01 18:39:50+01:00, df@pippilotta.erinye.com +1 -0
  BUG#31785 mysql.server stop does not spot server gone away
[2 Jul 2009 13:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/77770

2952 Daniel Fischer	2009-07-02
      merge patch for bug#31785
[23 Jul 2009 10:24] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090723102221-ps4uaphwbxzj8p0q) (version source revid:joerg@mysql.com-20090721145751-rqqnhv0kage18wfi) (merge vers: 5.4.4-alpha) (pib:11)
[30 Jul 2009 13:14] Daniel Fischer
Queued in *-build.
[4 Aug 2009 22:03] Davi Arnaut
Pushed into 5.1.38
[6 Aug 2009 23:49] Paul DuBois
Noted in 5.1.38, 5.4.4 changelogs.

If the MySQL server was killed without the PID file being removed,
attempts to stop the server with mysql.server stop waited 900 seconds
before giving up.
[12 Aug 2009 23:00] Paul DuBois
Noted in 5.4.2 changelog because next 5.4 version will be 5.4.2 and not 5.4.4.
[15 Aug 2009 2:18] Paul DuBois
Ignore previous comment about 5.4.2.
[1 Oct 2009 5:58] Bugs System
Pushed into 5.1.39-ndb-6.3.28 (revid:jonas@mysql.com-20091001055605-ap2kiaarr7p40mmv) (version source revid:jonas@mysql.com-20091001055605-ap2kiaarr7p40mmv) (merge vers: 5.1.39-ndb-6.3.28) (pib:11)
[1 Oct 2009 7:25] Bugs System
Pushed into 5.1.39-ndb-7.0.9 (revid:jonas@mysql.com-20091001072547-kv17uu06hfjhgjay) (version source revid:jonas@mysql.com-20091001071652-irejtnumzbpsbgk2) (merge vers: 5.1.39-ndb-7.0.9) (pib:11)
[1 Oct 2009 13:25] Bugs System
Pushed into 5.1.39-ndb-7.1.0 (revid:jonas@mysql.com-20091001123013-g9ob2tsyctpw6zs0) (version source revid:jonas@mysql.com-20091001123013-g9ob2tsyctpw6zs0) (merge vers: 5.1.39-ndb-7.1.0) (pib:11)
[5 Oct 2009 10:50] Bugs System
Pushed into 5.1.39-ndb-6.2.19 (revid:jonas@mysql.com-20091005103850-dwij2dojwpvf5hi6) (version source revid:jonas@mysql.com-20090930185117-bhud4ek1y0hsj1nv) (merge vers: 5.1.39-ndb-6.2.19) (pib:11)
[9 Oct 2009 1:29] Paul DuBois
The 5.4 fix has been pushed to 5.4.2.