Bug #17486 IM: race condition on exit
Submitted: 16 Feb 2006 17:10 Modified: 7 Nov 2006 18:59
Reporter: Lenz Grimmer
Status: Closed
Category: Instance Manager Severity: S3 (Non-critical)
Version: 5.1.6 OS: Linux (SUSE Linux 10.0)
Assigned to: Alexander Nozdrin CPU Architecture: Any

[16 Feb 2006 17:10] Lenz Grimmer
Description:
The IM is not capable of managing the default instance that it controls when I don't set up any special instance settings. I tried to set up a minimal IM configuration by just creating the following /etc/my.cnf file:

  [manager]
  socket=/var/lib/mysql/manager.sock

I also created a /etc/mysqlmanager.passwd file (the default location) and added one "admin" user.

This allowed me to start the IM and the default mysqld instance, and I can connect to both the server and the IM. However, the instance is displayed as "offline" and the IM seems to crash when I try to stop it:

lenz@metis:~> mysqladmin version
mysqladmin  Ver 8.41 Distrib 5.1.6-alpha, for pc-linux-gnu on i686
Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL license

Server version          5.1.6-alpha
Protocol version        10
Connection              Localhost via UNIX socket
UNIX socket             /var/lib/mysql/mysql.sock
Uptime:                 40 min 25 sec

Threads: 1  Questions: 20  Slow queries: 0  Opens: 0  Flush tables: 1  Open tables: 8  Queries per second avg: 0.008
lenz@metis:~> mysql --socket=/var/lib/mysql/manager.sock -u admin -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 29 to server version: 0.2-alpha

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> SHOW INSTANCES;
+---------------+---------+
| instance_name | status  |
+---------------+---------+
| mysqld        | offline |
+---------------+---------+
1 row in set (0.01 sec)

mysql> STOP INSTANCE mysqld;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    30
Current database: *** NONE ***

Query OK, 0 rows affected (0.01 sec)

mysql> SHOW INSTANCES;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    31
Current database: *** NONE ***

+---------------+---------+
| instance_name | status  |
+---------------+---------+
| mysqld        | offline |
+---------------+---------+
1 row in set (0.00 sec)

It would be nice to be able to manage the default instance without having to explicitly create an instance configuration for it in /etc/my.cnf. The IM should simply fall back to the compile-time default values for the various server settings (e.g. PID file, socket file, log file path names, etc).

How to repeat:
Create a minimal IM configuration as outlined above and don't create a specific instance section in my.cnf, so that the IM uses the default values to start up the default "mysqld" instance.
Check the status of this instance and try to manage it.

Suggested fix:
Make the IM use the default server settings if nothing else is provided.
[16 Feb 2006 17:24] Lenz Grimmer
The behaviour is not very deterministic - it actually seems as if the IM succeeds in shutting down the instance once, but then loses track of it. Shutting down MySQL using "/etc/init.d/mysql stop" afterwards no longer succeeds - the IM terminates, but leaves stray mysqld processes behind that are still active.
[11 Sep 2006 9:06] Alexander Nozdrin
1. Instance Manager has changed since the initial report of this bug.
It now has a "--mysqld-safe-compatible" option that is responsible for
managing the default instance. So, this part of the bug report
is fixed.

2. However, there are still two problems:
  1. Instance Manager does not shut down properly on SIGTERM:
     - A stray mysqld instance is left behind;
     - It takes too long for IM to stop. If IM is stopped from the
       mysql.server script, the script does not detect that IM stopped
       and reports an error.
  2. There seems to be a race condition on exit by SIGQUIT -- IM stops
     (stops quickly and shuts down mysqld), but it seems that the main
     thread finishes before all other threads are stopped.
[17 Oct 2006 13:02] Alexander Nozdrin
Now the only problem is the second one: the main thread does not
wait for all others to stop before exiting.
[19 Oct 2006 9:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/13951

ChangeSet@1.2351, 2006-10-19 13:12:45+04:00, anozdrin@booka.aliknet +10 -0
  Fix for BUG#17486: IM: race condition on exit.
  
  IM uses wait()/waitpid() system calls to wait for started mysqld to stop.
  The problem was that return values of these functions were ignored.
  So, IM assumed that if wait() returns, mysqld is down. However, this is
  not necessarily so if wait() was interrupted by SIGINT/SIGTERM.
  
  The fix is to call wait()/waitpid() again if the previous call was interrupted.
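
The retry described in this changeset can be sketched as follows. This is only an illustration of the general POSIX idiom, not the code from the patch: wait_for_child(), on_signal() and run_child_and_wait() are hypothetical names, and SIGUSR1 stands in for the SIGINT/SIGTERM mentioned above so that the demo is harmless to run.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not from the IM source): wait until `pid` has
   terminated, calling waitpid() again whenever a signal interrupts it.
   Returns 0 and fills *status on success, -1 on a real error. */
static int wait_for_child(pid_t pid, int *status)
{
  for (;;)
  {
    pid_t rc = waitpid(pid, status, 0);
    if (rc == pid)
      return 0;               /* the child really terminated */
    if (rc == -1 && errno == EINTR)
      continue;               /* interrupted: the child may still be running */
    return -1;                /* e.g. ECHILD: not a child of this process */
  }
}

static void on_signal(int signo) { (void) signo; }  /* no-op handler */

/* Demo: fork a child that signals the parent and then exits with
   `exit_code`. Returns the child's exit code, or -1 on failure. */
static int run_child_and_wait(int exit_code)
{
  struct sigaction sa;
  pid_t pid;
  int status = 0;

  memset(&sa, 0, sizeof sa);
  sa.sa_handler = on_signal;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;            /* no SA_RESTART, so waitpid() can fail with EINTR */
  if (sigaction(SIGUSR1, &sa, NULL) == -1)
    return -1;

  pid = fork();
  if (pid == -1)
    return -1;
  if (pid == 0)
  {
    struct timespec delay = { 0, 100 * 1000 * 1000 };  /* 100 ms */
    kill(getppid(), SIGUSR1); /* may interrupt the parent's waitpid() */
    nanosleep(&delay, NULL);  /* keep running after the signal was sent */
    _exit(exit_code);
  }

  if (wait_for_child(pid, &status) == -1 || !WIFEXITED(status))
    return -1;
  return WEXITSTATUS(status);
}
```

Without the EINTR check, the first return from waitpid() would be taken as "the child is down" even though the child in this demo keeps running for a while after it has signalled the parent.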
[23 Oct 2006 11:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/14174

ChangeSet@1.2351, 2006-10-23 15:37:01+04:00, anozdrin@booka. +10 -0
  Fix for BUG#17486: IM: race condition on exit.
  
  The problem was that IM stopped guarded instances on shutdown,
  but didn't wait for them to stop.
  
  The fix is to wait for guarded instances to stop before exiting
  from the main thread.
  
  The idea is that Instance-monitoring thread should add itself
  to Thread_registry so that it will be taken into account on shutdown.
  However, Thread_registry should not signal it on shutdown in order to
  not interrupt wait()/waitpid().
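
A minimal sketch of the idea, assuming a registry that simply counts live threads. It is not the IM's Thread_registry class: thread_registry, registry_add(), registry_remove(), registry_wait_all(), monitor_thread() and shutdown_demo() are hypothetical names, a short sleep stands in for the blocking wait()/waitpid() call, and the signalling of other registered threads is left out entirely.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

/* Hypothetical registry; names are illustrative, not the IM's API. */
struct thread_registry
{
  pthread_mutex_t lock;
  pthread_cond_t  all_done;
  int             active;            /* threads still registered */
};

static struct thread_registry registry =
  { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 };

static int instance_stopped = 0;     /* guarded by registry.lock */

static void registry_add(void)
{
  pthread_mutex_lock(&registry.lock);
  registry.active++;
  pthread_mutex_unlock(&registry.lock);
}

static void registry_remove(void)
{
  pthread_mutex_lock(&registry.lock);
  if (--registry.active == 0)
    pthread_cond_broadcast(&registry.all_done);
  pthread_mutex_unlock(&registry.lock);
}

/* Called by the main thread just before it exits. */
static void registry_wait_all(void)
{
  pthread_mutex_lock(&registry.lock);
  while (registry.active > 0)
    pthread_cond_wait(&registry.all_done, &registry.lock);
  pthread_mutex_unlock(&registry.lock);
}

/* Stand-in for the instance-monitoring thread: the sleep plays the role
   of the blocking wait for mysqld to stop. */
static void *monitor_thread(void *arg)
{
  struct timespec delay = { 0, 50 * 1000 * 1000 };   /* 50 ms */
  (void) arg;
  nanosleep(&delay, NULL);
  pthread_mutex_lock(&registry.lock);
  instance_stopped = 1;
  pthread_mutex_unlock(&registry.lock);
  registry_remove();
  return NULL;
}

/* Returns 1 if the main thread saw the instance as stopped, -1 on error. */
static int shutdown_demo(void)
{
  pthread_t tid;
  int stopped;

  /* In this sketch the creator registers on the thread's behalf, so the
     wait below cannot run before the registration has happened. */
  registry_add();
  if (pthread_create(&tid, NULL, monitor_thread, NULL) != 0)
  {
    registry_remove();
    return -1;
  }
  pthread_detach(tid);

  registry_wait_all();               /* main thread blocks here */

  pthread_mutex_lock(&registry.lock);
  stopped = instance_stopped;
  pthread_mutex_unlock(&registry.lock);
  return stopped;
}
```

Because the monitoring thread is counted but never signalled in this sketch, its blocking wait runs to completion and the main thread only proceeds once the instance is known to be down.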
[23 Oct 2006 12:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/14180

ChangeSet@1.2351, 2006-10-23 16:47:33+04:00, anozdrin@booka. +11 -0
  Fix for BUG#17486: IM: race condition on exit.
  
  The problem was that IM stopped guarded instances on shutdown,
  but didn't wait for them to stop.
  
  The fix is to wait for guarded instances to stop before exiting
  from the main thread.
  
  The idea is that Instance-monitoring thread should add itself
  to Thread_registry so that it will be taken into account on shutdown.
  However, Thread_registry should not signal it on shutdown in order to
  not interrupt wait()/waitpid().
  ---
  IM polishing: log more information in log.
[23 Oct 2006 12:49] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/14181

ChangeSet@1.2351, 2006-10-23 16:48:18+04:00, anozdrin@booka. +10 -0
  Fix for BUG#17486: IM: race condition on exit.
  
  The problem was that IM stopped guarded instances on shutdown,
  but didn't wait for them to stop.
  
  The fix is to wait for guarded instances to stop before exiting
  from the main thread.
  
  The idea is that Instance-monitoring thread should add itself
  to Thread_registry so that it will be taken into account on shutdown.
  However, Thread_registry should not signal it on shutdown in order to
  not interrupt wait()/waitpid().
  ---
  IM polishing: log more information in log.
[24 Oct 2006 14:22] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/14266

ChangeSet@1.2324, 2006-10-24 18:23:16+04:00, anozdrin@alik. +10 -0
  Fix for BUG#17486: IM: race condition on exit.
  
  The problem was that IM stopped guarded instances on shutdown,
  but didn't wait for them to stop.
  
  The fix is to wait for guarded instances to stop before exiting
  from the main thread.
  
  The idea is that Instance-monitoring thread should add itself
  to Thread_registry so that it will be taken into account on shutdown.
  However, Thread_registry should not signal it on shutdown in order to
  not interrupt wait()/waitpid().
[25 Oct 2006 9:41] Konstantin Osipov
Reviewed over email and IRC.
[2 Nov 2006 14:32] Dmitry Lenev
Fixed in 5.1.13
[7 Nov 2006 18:59] Paul DuBois
Noted in 5.1.13 changelog.

At shutdown, Instance Manager told guarded server instances to
stop, but did not wait until they actually stopped.