Bug #5001 | mysqld_safe kills all mysqlds when only one dies (under multiple installations) | ||
---|---|---|---|
Submitted: | 11 Aug 2004 20:20 | Modified: | 2 Sep 2004 0:19 |
Reporter: | Jason Sydes | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server | Severity: | S2 (Serious) |
Version: | 4.0.20 | OS: | Linux (linux) |
Assigned to: | Bugs System | CPU Architecture: | Any |
[11 Aug 2004 20:20]
Jason Sydes
[12 Aug 2004 3:39]
Matthew Lord
This is still the case in 4.1.3. Maybe the safest way would be to store the socket file and use fuser?
[12 Aug 2004 9:25]
Brian Aker
fuser is not portable.
[13 Aug 2004 18:53]
Jason Sydes
But doesn't the pid file always appear on the command line? Isn't that the most portable way to ensure that you're killing the right processes? I suppose you could accidentally kill another process, say `less /tmp/mysql6.pid', but that seems a relatively harmless consequence....
[14 Aug 2004 22:47]
Sergei Golubchik
fuser is portable enough, the killing code is protected by @IS_LINUX@ anyway. at least all our linux hosts (and many non-linux ones, *bsd excluded) have fuser.
[17 Aug 2004 23:44]
Patrick Galbraith
I cannot get the failure to occur. I have started 4 servers via mysqld_multi, killed one of the servers yet all the rest of the remaining servers continue to run.
[18 Aug 2004 0:07]
Jason Sydes
Really? What is the "ps" line look like inside your mysqld_safe? Actually, when you kill one of the mysqld processes, does that server restart? Or does it just keep running? When I killed mine, it would restart that entire server, but, also, it would kill all the mysql processes, including those of the other servers. Are you seeing the following line in the error log of the "killed" server: "Number of processes now running: blah" with "blah" replaced by some number (probably around 10 or 20 in your test case, though for me it was generally several hundred). You might see that line multiple times too. That line is spit out by mysqld_safe. If you're not seeing that line, you're not going to see the same behavior I saw. If you are seeing that line, are you seeing that line in the other servers' error logs as well? If so, then that's the bug that I'm reporting. If you are seeing that line, and you're only seeing it the error log of the "killed" server, and you're **not** seeing it in the error logs of the other servers, then I am truly perplexed (I'll explain myself after hearing what you're seeing). When you do a `SHOW STATUS' or `mysqladmin stat' on the "killed" server, does that show you that it had been restarted (ie does the uptime show a low number)? How about for the other servers? Thanks for your attention to this Patrick! Jason
[18 Aug 2004 0:59]
Patrick Galbraith
Jason: when I kill one process of a running server ( 1 out of 3 servers), I see this in its log: 040817 15:47:50 mysqld restarted 040817 15:47:50 Warning: Asked for 196608 thread stack, but got 126976 040817 15:47:50 Can't start server: Bind on TCP/IP port: Address already in use 040817 15:47:50 Do you already have another mysqld server running on port: 3308 ? 040817 15:47:50 Aborting 040817 15:47:50 /usr/local/mysql/bin/mysqld: Shutdown Complete 040817 15:47:50 mysqld ended As far as processes, the other servers don't get touched. The shutdown server remains unstarted, and the others stay running, same exact PIDs before and after. I would suggest starting your servers with mysqld_multi and using mysqld_multi settings in my.cnf. Hope this helps;)
[22 Aug 2004 22:16]
Sergei Golubchik
Patrick, the reported bug is in *mysqld_safe* Thus to repeat it you need to use mysqld_safe as in mysqld_multi --example or with --mysqld=mysqld_safe. And then it's perfectly repeatable
[25 Aug 2004 22:30]
Patrick Galbraith
It works as intended, even with mysqld=/path/to/safe_mysqld (both 4.0.20 and 4.1.3): rsna:/usr/local # /usr/local/mysql/bin/mysqld_multi --user=mysql --mysqld=/usr/local/mysql/bin/mysqld_safe start krsna:/usr/local # ps aux|grep msyql root 6053 0.0 0.1 1764 568 pts/6 R 13:15 0:00 grep msyql krsna:/usr/local # ps aux|grep mysql root 25261 0.0 0.0 2092 4 pts/5 S Aug23 0:00 mysql mysql root 5955 0.2 0.2 2676 1200 pts/6 S 13:15 0:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --socket=/tmp/mysql.sock2 --port=3307 --pid-file=/usr/local/mysql/data2/hostname.pid2 --datadir=/usr/local/mysql/data2 root 5958 0.3 0.2 2676 1200 pts/6 S 13:15 0:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --socket=/tmp/mysql.sock3 --port=3308 --pid-file=/usr/local/mysql/data3/hostname.pid3 --datadir=/usr/local/mysql/data3 root 5961 0.3 0.2 2676 1200 pts/6 S 13:15 0:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --socket=/tmp/mysql.sock4 --port=3309 --pid-file=/usr/local/mysql/data4/hostname.pid4 --datadir=/usr/local/mysql/data4 mysql 6032 1.4 2.0 30304 10548 pts/6 D 13:15 0:00 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data4 --user=mysql --log=/usr/local/mysql/data4/krsna.log --pid-file=/usr/local/mysql/data4/hostname.pid4 --skip-locking --port=3309 --socket=/tmp/mysql.sock4 mysql 6034 1.4 2.0 30304 10548 pts/6 D 13:15 0:00 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data2 --user=mysql --log=/usr/local/mysql/data2/krsna.log --pid-file=/usr/local/mysql/data2/hostname.pid2 --skip-locking --port=3307 --socket=/tmp/mysql.sock2 mysql 6036 1.5 2.0 30304 10548 pts/6 D 13:15 0:00 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data3 --user=mysql --log=/usr/local/mysql/data3/krsna.log --pid-file=/usr/local/mysql/data3/hostname.pid3 --skip-locking --port=3308 --socket=/tmp/mysql.sock3 krsna:/usr/local # kill -9 6036 krsna:/usr/local # mysql -S /tmp/mysql.sock4 Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2 to server version: 4.0.20-standard-log Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> \q Bye krsna:/usr/local # mysql -S /tmp/mysql.sock2 Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2 to server version: 4.0.20-standard-log Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> \q Bye krsna:/usr/local # mysql -S /tmp/mysql.sock3 ERROR 2002: Can't connect to local MySQL server through socket '/tmp/mysql.sock3' (2) krsna:/usr/local # ps aux|grep mysql root 25261 0.0 0.0 2092 4 pts/5 S Aug23 0:00 mysql mysql root 5955 0.0 0.2 2676 1200 pts/6 S 13:15 0:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --socket=/tmp/mysql.sock2 --port=3307 --pid-file=/usr/local/mysql/data2/hostname.pid2 --datadir=/usr/local/mysql/data2 root 5961 0.0 0.2 2676 1200 pts/6 S 13:15 0:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --socket=/tmp/mysql.sock4 --port=3309 --pid-file=/usr/local/mysql/data4/hostname.pid4 --datadir=/usr/local/mysql/data4 mysql 6032 0.1 2.0 30352 10644 pts/6 S 13:15 0:00 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data4 --user=mysql --log=/usr/local/mysql/data4/krsna.log --pid-file=/usr/local/mysql/data4/hostname.pid4 --skip-locking --port=3309 --socket=/tmp/mysql.sock4 mysql 6034 0.1 2.0 30352 10644 pts/6 S 13:15 0:00 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data2 --user=mysql --log=/usr/local/mysql/data2/krsna.log --pid-file=/usr/local/mysql/data2/hostname.pid2 --skip-locking --port=3307 --socket=/tmp/mysql.sock2 krsna:/usr/local # tail -7 /usr/local/mysql/data2/krsna.err 040817 17:29:09 mysqld ended 040825 13:15:23 mysqld started 040825 13:15:23 Warning: Asked for 196608 thread stack, but got 126976 040825 13:15:30 InnoDB: Started /usr/local/mysql/bin/mysqld: ready for connections. Version: '4.0.20-standard-log' socket: '/tmp/mysql.sock2' port: 3307 rsna:/usr/local # tail -10 /usr/local/mysql/data3/krsna.err 040825 13:16:33 mysqld restarted 040825 13:16:33 Warning: Asked for 196608 thread stack, but got 126976 040825 13:16:33 Can't start server: Bind on TCP/IP port: Address already in use 040825 13:16:33 Do you already have another mysqld server running on port: 3308 ? 040825 13:16:33 Aborting 040825 13:16:33 /usr/local/mysql/bin/mysqld: Shutdown Complete 040825 13:16:33 mysqld ended krsna:/usr/local # tail -7 /usr/local/mysql/data4/krsna.err 040817 17:29:08 mysqld ended 040825 13:15:23 mysqld started 040825 13:15:23 Warning: Asked for 196608 thread stack, but got 126976 040825 13:15:30 InnoDB: Started /usr/local/mysql/bin/mysqld: ready for connections. Version: '4.0.20-standard-log' socket: '/tmp/mysql.sock4' port: 3309 So, you kill -9 a threat of server #3 (we'll call it #3 in this case), and mysqld_safe tries to restart, but I assume the port is still in use because it's in the process of dying, so it gives up, and also doesn't try to kill any other mysqld process on any other port. My my.conf file is this: [mysqld_multi] mysqld = /usr/local/mysql/bin/mysqld_safe mysqladmin = /usr/local/mysql/bin/mysqladmin [mysqld2] socket = /tmp/mysql.sock2 port = 3307 pid-file = /usr/local/mysql/data2/hostname.pid2 datadir = /usr/local/mysql/data2 [mysqld3] socket = /tmp/mysql.sock3 port = 3308 pid-file = /usr/local/mysql/data3/hostname.pid3 datadir = /usr/local/mysql/data3 [mysqld4] socket = /tmp/mysql.sock4 port = 3309 pid-file = /usr/local/mysql/data4/hostname.pid4 datadir = /usr/local/mysql/data4
[2 Sep 2004 0:19]
Patrick Galbraith
With help from Serg, we found that there was an error in the regular expression that caused safe_mysqld to kill itself and not each mysqld thread: PROC=`ps xa | grep $ledir/$MYSQLD | grep -v "grep" | sed -n '$p'` changed to PROC=`ps xa | grep "$ledir/$MYSQLD\>" | grep -v "grep" | sed -n '$p'` This allowed it to manifest the original error that this bug has been logged for, which I've fixed by having logic test to see if $mysql_tcp_port set, and grepping to see if the port is in the ps list: if test -n "$mysql_tcp_port" then numofproces=`ps xa | grep -v "grep" | grep $ledir/$MYSQLD| grep -c "port=$mysql_tcp_port"` else numofproces=`ps xa | grep -v "grep" | grep -c $ledir/$MYSQLD` fi echo -e "\nNumber of processes running now: $numofproces" | tee -a $err_log I=1 while test "$I" -le "$numofproces" do if test -n "$mysql_tcp_port" then PROC=`ps xa | grep "$ledir/$MYSQLD\>" | grep -v "grep" | grep "port=$mysql_tcp_port" | sed -n '$p'` else PROC=`ps xa | grep "$ledir/$MYSQLD\>" | grep -v "grep" | sed -n '$p'` fi NOTE: mysqld_multi always starts mysqld_safe with a port set, so this is a good means of checking this