Bug #50453 Workbench 5.2.11b does not report server status correctly
Submitted: 19 Jan 2010 19:44 Modified: 29 Jan 2010 14:17
Reporter: Omar Zakaria
Status: Closed
Category:MySQL Workbench Severity:S2 (Serious)
Version:5.2.11b OS:MacOS (10.6.2)
Assigned to: Maksym Yehorov CPU Architecture:Any

[19 Jan 2010 19:44] Omar Zakaria
Description:
Workbench reports the MySQL server on a remote machine as being down, even though I'm able to run queries against it, both from Workbench itself and from the CLI.

While setting the server up using the wizard, I noticed the following in the logs (I've pruned some whitespace for ease of reading and added the dashes to demarcate the output):

--- 

Connecting to apasrv02 
Loading keys from /Users/ozakaria/.ssh/id_rsa 

Connected via ssh to apasrv02 
connected. 

SSH.exec_cmd( uname - Linux 0 ) 

OK, Operating System is 'Linux' 

Checking command 'ps -C mysqld -o pid=' 

SSH.exec_cmd( ps -C mysqld -o pid= - 11868 0 ) 
Server detected as stopped 
Check if /etc/my.cnf exists 

---- 

apasrv02 is where the MySQL database lives. It's a RedHat EL4 machine running MySQL 5.0.19. When I run "ps -C mysqld -o pid=" on it, I get 11868 back, which I've confirmed is indeed the PID of the mysqld process. However, as noted above, this seems to translate to "stopped" instead of "running."
Furthermore, if I instead use the MySQL Database Admin tool, I'm able to connect to the server without problems. I'm not really sure what's going on here. In the console, I can see the following output:

[0x0-0x8ab8ab].com.sun.MySQLWorkbench[21593]	SSH.exec_cmd( /bin/bash -c "ps -C mysqld -o pid=" - None ) 

And when I run the command on the server, again, I get the correct pid (11868).

How to repeat:
1) Download and install Workbench 5.2.11b for OS X 10.6.2
2) Configure Workbench to administer a remote server over SSH
3) Attempt to open the administration interface. 

Suggested fix:
At the moment, my workaround is to edit the is_running() method in wb_admin_control_be.py so that it always returns True.

My is_running() method looks like this:

    def is_running(self, silent = False):
        ret = False
        serverInfo = self.settings.serverInfo
        script = serverInfo["sys.mysqld.status"]
        result = self.execute(script, 0)

        return True
[20 Jan 2010 8:35] Johannes Taxacher
Hi Omar,

In the administration profile, on the "System Profile" tab, you will find a text field for the setting called "Check MySQL Status". This setting holds a command that WB executes on the remote server to determine the status of the MySQL server in question. For Linux servers, we fill it by default with "ps -C mysqld -o pid=".
Please check whether that command, executed manually over ssh or in a terminal session on your DB server, returns any output (the output in that case should be the process ID, but for detection the actual content of the output doesn't matter). If that command doesn't produce any output, you'll have to adapt it to your situation or replace it with any script/command that returns output when the server is running and no output when the server daemon is down.
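The detection rule described above (any output means running, no output means down) can be sketched in a few lines of Python. This is only an illustration, not Workbench's actual code: the name status_from_output is hypothetical, and treating a missing result (None) as "not running" is an assumption.

```python
def status_from_output(output):
    """Return True if the status command printed anything besides whitespace.

    Hypothetical helper, not part of Workbench. It mirrors the rule that the
    content of the output does not matter, only whether there is any.
    Assumption: a missing result (None) counts as "not running".
    """
    return output is not None and output.strip() != ""


print(status_from_output(" 11868\n"))  # a PID was printed -> True
print(status_from_output(""))          # no output         -> False
print(status_from_output(None))        # no result at all  -> False
```

Under this rule, a status command that never returns looks the same to the caller as a server that is down.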
[20 Jan 2010 18:25] Omar Zakaria
Ah, I'm sorry, I mentioned this in the forums but perhaps not in the bug report: when I run "ps -C mysqld -o pid=" on the remote machine, it does indeed correctly return the PID for the mysql server instance.
[21 Jan 2010 8:43] Valeriy Kravchuk
Please check with a newer version, 5.2.14, and let us know the results.
[22 Jan 2010 1:13] Omar Zakaria
The problem still exists after the upgrade.

I think I've traced it a little better now. The problem seems to be that the script times out when trying to read the result of the command sent over SSH (line 190: ssh_session.recv...). I increased the timeout from 5 to 100, but that didn't seem to have an effect.

After some debugging and playing around with paramiko itself, it seems the problem is that the command being sent to the remote server isn't just

ps -C mysqld -o pid=

It's actually

/bin/bash -c 'ps -C mysqld -o pid='

The "/bin/bash -c" is added by the wrap_command() method in wb_admin_control_be.py. For some reason, without this prefix, the ps command works just fine and I can administer the remote server (i.e. removing the call to wrap_command() solves my problem completely). With this prefix, however, the connection times out.

Even from the CLI, running 

ssh ozakaria@apasrv02 "/bin/bash -c 'ps -C mysqld -o pid='" 

results in a timeout. 

Quite a bizarre effect.

(As a side note, the "/bin/bash -c" wrap seems to have done something strange to my ssh keys. The key I normally use to log into apasrv02 now requires a passphrase when previously it did not; when generating the key, I explicitly set the passphrase to be empty. This may be a clue: I suspect the command is timing out because it's trying to authenticate using the DSA key I have, which for some reason does not work.)

Hope this helps.
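For anyone who wants to script the comparison between the plain and the wrapped command, here is a minimal sketch, assuming Python 3 and a command-line ssh client. The helper name run_with_timeout is hypothetical and is not part of Workbench or paramiko. The demo uses local stand-in processes so it runs without a remote host.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv and return (timed_out, stdout_text).

    Hypothetical helper for illustration. subprocess.run() kills the child
    process itself when the timeout expires.
    """
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return True, ""
    return False, proc.stdout.strip()


if __name__ == "__main__":
    # Local stand-ins: one child that answers at once, one that hangs.
    quick = [sys.executable, "-c", "print(11868)"]
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(quick, 10))  # (False, '11868')
    print(run_with_timeout(hung, 1))    # (True, '')
```

To test a real host, the argument list would be something like ["ssh", "ozakaria@apasrv02", "/bin/bash -c 'ps -C mysqld -o pid='"], compared against the same call without the "/bin/bash -c" wrapper.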
[22 Jan 2010 10:36] Susanne Ebrecht
Omar,

do you agree with me that this is more related to SSH than to being a Workbench bug?
[22 Jan 2010 17:40] Maksym Yehorov
Omar,
I tried to reproduce this but could not. I tested from Ubuntu to Fedora 12 over ssh, using a key with no password. I can start and stop the server, and the status is queried as expected.

Can you give us some more info?
Can you connect to the remote box via ssh using keys?
When Workbench asks for password, does it ask for password to unlock keys or for password to the remote box?
[22 Jan 2010 22:37] Omar Zakaria
Susanne,
I do not agree that it's an ssh error. The original problem (disregarding the shenanigans with my ssh keys) is that Workbench builds a command to check the server's status and then executes the command over SSH, but the command never returns and the request then times out. The original command is 'ps -C mysqld -o pid=', which gets turned into "/bin/bash -c 'ps -C mysqld -o pid='". If I remove the code that adds the '/bin/bash' wrapper, everything works just fine.

Like I said, the "/bin/bash -c" prefix gets added by the wrap_command() method. Removing the call to wrap_command() before invoking exec_cmd() (I think it's line 694 in wb_admin_control_be.py) solves that particular problem. Actually, a better solution would be to edit the wrap_command method so that it looks like this: 

def wrap_command(self, cmd, sudo):
      wcmd = None

      if not self.is_windows():
        if sudo:
	  wcmd = "sudo " + wcmd
      else:
        wcmd = "cmd.exe /C \"" + cmd + "\""

      return wcmd

(Notice there is no line adding "/bin/bash" etc. to the front of the inputted command.)

As to __why__ prepending the command with "/bin/bash" fails, I am not certain. I've run the command from the command line like so:

ozakari@horn:> ssh ozakaria@apasrv02 "/bin/bash -c 'ps -C mysqld -o pid='"

And then opened another SSH session to apasrv02 and ran 

ozakaria@apasrv02:> ps ax | grep 'mysqld -o pid'

Which returned 

11489 ?        Ss     0:00 ksh -c /bin/bash -c 'ps -C mysqld -o pid='
11490 ?        S      0:00 /bin/bash -c ps -C mysqld -o pid=
11512 pts/0    S+     0:00 grep mysqld -o pid

I do not think that's what the command was intended to do. Notice that it's spawned a ksh instance which then spawned a bash instance. When I run the same ps command a minute later, I get the same result. Clearly, the processes are hung. This seems to happen on every RHEL4 and RHEL5 machine I try this on.

My solution, as I proposed above in my change to wrap_command, works just fine. Furthermore, since an SSH command session launches the user's shell (otherwise you wouldn't be able to use it as a command session!), there is no reason I can think of to wrap any command in a '/bin/bash -c' invocation. The solution should therefore be universal.

Maksym,
Yes, I can ssh to the remote box using my key. I can't seem to recreate the key authentication problem I was having earlier. I'm not sure what was causing it. If it happens again, I'll pay more attention to what I did.

All,
Thank you for your patience. I realize this sort of bug is frustrating to deal with.
[22 Jan 2010 22:45] Omar Zakaria
Ack, my bad. The corrected wrap_command() code should be:

def wrap_command(self, cmd, sudo):
    wcmd = None

    if not self.is_windows():
        # No "/bin/bash -c" prefix: the SSH session already runs the user's shell.
        if sudo:
            wcmd = "sudo " + cmd
        else:
            wcmd = cmd
    else:
        wcmd = "cmd.exe /C \"" + cmd + "\""

    return wcmd
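
To show what the corrected logic produces for each case, here is a standalone sketch. It is not the Workbench source: rewriting the method as a free function and passing is_windows in as a parameter are assumptions made so that the example runs on its own, and "echo ok" is just a placeholder command.

```python
def wrap_command(cmd, sudo, is_windows):
    """Standalone version of the corrected logic, for illustration only."""
    if is_windows:
        # Windows commands are still wrapped in cmd.exe, as before.
        return 'cmd.exe /C "' + cmd + '"'
    if sudo:
        return "sudo " + cmd
    # No "/bin/bash -c" prefix on non-Windows hosts.
    return cmd


status_cmd = "ps -C mysqld -o pid="
print(wrap_command(status_cmd, False, False))  # ps -C mysqld -o pid=
print(wrap_command(status_cmd, True, False))   # sudo ps -C mysqld -o pid=
print(wrap_command("echo ok", False, True))    # cmd.exe /C "echo ok"
```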
[25 Jan 2010 14:09] Maksym Yehorov
Omar,

We used the /bin/bash prefix to resolve paths; otherwise ps was not found. Thanks to your bug report we rechecked that approach, and it works both with and without the /bin/bash prefix because we spawn a shell in the remote session. We'll do some extended testing, but the prefix will likely go away. Thanks for such a detailed bug report.

Fixed in revno 5023
[28 Jan 2010 16:31] Johannes Taxacher
The fix will be included in 5.2.15.
[28 Jan 2010 18:16] Omar Zakaria
Excellent! Thank you very much, guys!

I'm happy to be of help.
[29 Jan 2010 14:17] Tony Bedford
An entry has been added to the 5.2.15 changelog:

MySQL Workbench reported the remote server as being down, in the Database Server Status section of the Administrator, even though the server was in fact running, and queries could be successfully run against the database using MySQL Workbench.