Bug #53829 Server administration loop: Server is running, Server is stopped, Server is runn
Submitted: 20 May 2010 0:41 Modified: 6 Jun 2011 17:13
Reporter: Jeremy Bell
Status: Closed
Category: MySQL Workbench: Administration Severity: S2 (Serious)
Version: 5.2.33 OS: Windows (Windows 7 x64)
Assigned to: Maksym Yehorov CPU Architecture:Any
Tags: loop, server is running, server is stopped

[20 May 2010 0:41] Jeremy Bell
Description:
When connecting to remote server using standard TCP/IP, paired with SSH based administration using SSH key file, the administration status continually drops in and out of connectivity to the server. Here is a snippet from the Startup Message Log:

-----------------------------------------

2010-05-20 12:09:10 - Checked server status: Server is stopped.
2010-05-20 12:09:13 - Server is running
2010-05-20 12:09:13 - Server is stopped
2010-05-20 12:09:13 - Server is running
2010-05-20 12:09:14 - Server is stopped
2010-05-20 12:09:21 - Server is running
2010-05-20 12:09:24 - Server is stopped
2010-05-20 12:09:27 - Server is running
2010-05-20 12:09:37 - Server is stopped
2010-05-20 12:09:40 - Server is running
2010-05-20 12:09:47 - Server is stopped
2010-05-20 12:09:50 - Server is running
...

-----------------------------------------

This continues indefinitely, making it impossible to use the administration section of Workbench effectively.
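
To put a number on the flapping, the log lines above can be parsed and the state changes counted. The following is only a minimal sketch: the function names parse_status_log and count_flaps are made up for illustration, and it assumes every log line has the "timestamp - message" shape shown in the snippet.

```python
from datetime import datetime

def parse_status_log(lines):
    """Return (timestamp, state) pairs from Startup Message Log lines."""
    events = []
    for line in lines:
        if " - " not in line:
            continue  # separators, "..." and blank lines
        stamp, message = line.split(" - ", 1)
        try:
            when = datetime.strptime(stamp.strip(), "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped line
        text = message.lower()
        if "server is running" in text:
            events.append((when, "running"))
        elif "server is stopped" in text:
            events.append((when, "stopped"))
    return events

def count_flaps(events):
    """Count how often the reported state differs from the previous one."""
    return sum(1 for a, b in zip(events, events[1:]) if a[1] != b[1])

# First five lines of the log snippet above.
sample = [
    "2010-05-20 12:09:10 - Checked server status: Server is stopped.",
    "2010-05-20 12:09:13 - Server is running",
    "2010-05-20 12:09:13 - Server is stopped",
    "2010-05-20 12:09:13 - Server is running",
    "2010-05-20 12:09:14 - Server is stopped",
]
events = parse_status_log(sample)
span = (events[-1][0] - events[0][0]).total_seconds()
print(count_flaps(events), "state changes in", span, "seconds")
```

A server that really restarted this often would leave traces in its error log, so a high count over a short span points at the status check rather than at mysqld.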

How to repeat:
Use the administration panel to connect to a remote server using standard TCP/IP, paired with SSH based administration using an SSH key file. After doing this, you should see the reported server status alternate between running and stopped.
[21 May 2010 9:45] Susanne Ebrecht
Which OS is your server running?

Is there more than one instance running on the machine?
[21 May 2010 11:34] Jeremy Bell
The server is running Fedora Core 8.

Here are the two mysql processes running. Both of them appear when I start the server. I'm pretty sure there's just one main "instance" of MySQL running though. If there's more than one, it's not on purpose.

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     21631  0.0  0.0   2600  1056 ?        S    May18   0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --pid-file=/var/lib/mysql/domU-12-31-39-00-6C-04.pid
mysql    21767 14.2 35.3 1373016 631600 ?      Sl   May18 663:25 /usr/sbin/mysqld --basedir=/ --datadir=/var/lib/mysql --user=mysql --log-error=/var/lib/mysql/domU-12-31-39-00-6C-04.err --pid-file=/var/lib/mysql/domU-12-31-39-00-6C-04.pid
[25 May 2010 12:14] Susanne Ebrecht
Please execute the following command on your server:

ps -C mysqld -o pid=

Do you get more than one result?
[25 May 2010 12:23] Susanne Ebrecht
Are you using an IPv6 network?
[25 May 2010 14:09] Jeremy Bell
Executing that command returned a single pid, 21767 in this instance. If there's anything else I can do to help, I'd be happy to.
[25 May 2010 14:11] Jeremy Bell
I connect to my server using ipv4. I'm not sure if ipv6 is supported. It's Amazon's EC2 where the server is located, so you may be able to find that info in their documentation. It's a virtual server, if that helps.
[17 Jun 2010 13:56] Mike Lischke
I have seen this myself, but not with the last two releases, so this might have been fixed already.
[17 Jun 2010 14:13] Alfredo Kojima
Can you please confirm whether 5.2.22 solves the issue as Mike suggests?
[17 Jun 2010 21:58] Jeremy Bell
I no longer get the repetitive messages, but I believe the underlying problem still exists.

The monitor charts still drop in and out. For a couple of seconds, they display the nice, blue timelines scrolling across, then they display "No Data" and the blue charts fade away, then a couple of seconds later they appear again, and so on. This used to happen in sync with the messages about the server being stopped, running, stopped, etc.

It's as if the symptom of the problem was fixed (the repetitive messages) but the underlying problem clearly still exists.
[23 Jun 2010 10:52] Susanne Ebrecht
Which server version are you using?
[23 Jun 2010 10:55] Jeremy Bell
Server version: 5.1.46
[23 Jun 2010 11:53] Maksym Yehorov
Jeremy,

How fast is your ssh connection? Can it experience some slowdowns or stalls?
Not that the problem is in the connection itself; rather, a slow connection may trigger that behaviour.
[23 Jun 2010 15:09] MySQL Verification Team
Could you please try version 5.2.24. Thanks in advance.
[23 Jun 2010 15:52] Jeremy Bell
Nothing has changed in relation to this problem in 5.2.24.

My SSH connection is fine. When working with an SSH terminal, everything is smooth. In Workbench, "No data" appears on the graphs and then disappears at very consistent intervals. It's not as if the connection is randomly dropping in and out.
[21 Jul 2010 1:01] Roel Van de Paar
Jeremy, having both a mysqld_safe and mysqld is fine. One is a wrapper. 

Please send your full error log (.err) and your my.cnf config file. First thing to eliminate is if your server is really crashing or not. If you want you can add these files in a private message.

Also, can you start a continuous ping through the tunnel and see if it always gets a reply (run it for an hour or so)? You may also have to increase the packet size of the ping to get it to fail (see the ping options), but if it fails, it may indicate that the connection is unstable.

However, if that would be the case, I am not sure yet as to how much that warrants displaying "no data".

It would also be great if you could have a quick test of 5.2.25.
[21 Jul 2010 1:46] Jeremy Bell
my.cnf

Attachment: my.cnf (application/octet-stream, text), 4.67 KiB.

[21 Jul 2010 1:51] Jeremy Bell
Yes, it still happens in 5.2.25. I've uploaded the my.cnf file. There was nothing inside the .err, or .err-old files.

It sounds unlikely that the server is crashing. We run a fair amount of traffic 24 hours a day through our site. E.g. right now there's 1.5-3.0MB/s running through the server according to the graphs.

I'll run the ping now.
[21 Jul 2010 1:57] Jeremy Bell
Just realised I don't know how to ping the server through an SSH tunnel. I know how to ping the server using cmd.exe, but then that's using a different connection method since I have to change the EC2 firewall to allow pings through, whereas SSH always works. Forgive me for asking, as I've had a quick google, but how do I ping the SSH tunnel itself?
[21 Jul 2010 2:20] Roel Van de Paar
Jeremy, please send the output from:

mysql> \s

Run at your mysql prompt.

Regarding the ping, I am not sure I follow this: "that's using a different connection method since I have to change the EC2 firewall to allow pings through, whereas SSH always works".

Please ping the server through the ssh tunnel. For instance, if you were on workstation A, and the server was named server B, and the tunnel looked like this: workstation A > SSH tunnel > server B, then the ping would simply be initiated from workstation A to the IP or DNS name of server B. Basically, it is doing exactly the same as connecting to a remote server with any other software. If an SSH tunnel is already established (and if you can use other software to access the server, like the mysql client for instance), there should be no need to open a firewall as the SSH tunnel already goes "through it".
[21 Jul 2010 2:45] Jeremy Bell
Ok, pinging. When I open cmd.exe and ping the server, which is in the Amazon EC2 cloud (and I'm in NZ), it doesn't work. I have to go to the EC2 control panel and add an ICMP rule. I don't, however, have to do this to connect via SSH (PuTTY) or via MySQL Workbench. That's what leads me to believe they're two different connection types.

So... I'm still confused as to what you want me to actually do. A) Open the firewall, and then I can use cmd.exe and run a ping command to the remote server IP address, or B) Open a connection to the remote server using putty, and run a ping command that way, but then since I'm already connected to the server through putty, I would just use 127.0.0.1 right?
[21 Jul 2010 3:52] Roel Van de Paar
Jeremy, I discussed the same with a colleague.

My bad; ping is ICMP, SSH tunnels carry TCP (by default), and you can't (easily) push ICMP through a tunnel.

We're looking at another way to properly test your tunnel. In the meantime, could you SCP a large file through it to get the throughput speed?

From the \s output it's clear the server is running stably, so that's excluded now.
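
Since ICMP ping is blocked here, one possible stand-in is to time plain TCP connections to a port that is already open. This is a rough sketch only: tcp_probe and probe_repeatedly are illustrative names, not part of any MySQL or Workbench tooling, and the choice of port (for example the SSH port, commonly 22) is an assumption. It measures connection setup time, not throughput.

```python
import socket
import time

def tcp_probe(host, port, timeout=3.0):
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return None

def probe_repeatedly(host, port, count=5, interval=1.0):
    """Probe `count` times, pausing `interval` seconds between attempts."""
    results = []
    for i in range(count):
        results.append(tcp_probe(host, port))
        if i < count - 1:
            time.sleep(interval)
    return results
```

Calling probe_repeatedly with the server's address and SSH port over a few minutes, then looking for None entries or large spikes, would give a rough picture of connection stability. Note that probing a locally forwarded port (such as a PuTTY port mapping) mostly measures the local listener, so the probe is more meaningful when aimed at the server directly.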
[21 Jul 2010 5:43] Roel Van de Paar
Jeremy, could you clarify what way you are connecting to your server:

A. Directly/EC2 firewall exclusion for your IP:3306
B. Via Putty/port mapping local:someport->server:3306
C. Via WB SSH's feature

Also, it looks like there are some tools for testing ssh on Linux, but not so for Windows. 

One thing you could do, is something like this:

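------ go.cmd (create in your mysql bin directory on your workstation)
@echo off
cls
echo Running... Press CTRL+C to interrupt, then type: type out.txt to view the results (times should match up well)
:loop
rem Separator between samples
echo ---- >> out.txt
rem Workstation time (the line containing "current time")
echo.|time|find "cur" >> out.txt
rem Server time as reported by MySQL (the line containing "current_time()")
mysql -uroot -e"select current_time()\G" | find "cur" >> out.txt
rem Wait 5 seconds; timeout ships with Windows 7, sleep does not
timeout /t 5 /nobreak > nul
goto loop
------ 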

If the time is different on your workstation, set it to the same time as the server first (seconds must match to get an easy to read output)

And run this for about 5 minutes *while you can see the issue happening*. Then check out.txt. You really should see very small differences (0-1 second, maybe an occasional 2 seconds) between the times.
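
Reading the differences off out.txt by eye gets tedious over 5 minutes of samples, so a short script can compute them. This is a sketch under assumptions: the out.txt layout (a "The current time is: HH:MM:SS.cc" line followed by a "current_time(): HH:MM:SS" line per sample) is assumed from what the script above would write on an English-language Windows, the sample data below is invented, and runs that cross midnight are not handled.

```python
import re

TIME_RE = re.compile(r"(\d{1,2}):(\d{2}):(\d{2})")

def to_seconds(line):
    """Return seconds since midnight for the first HH:MM:SS in the line."""
    match = TIME_RE.search(line)
    if match is None:
        return None
    hours, minutes, seconds = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds

def time_differences(lines):
    """Return server-minus-workstation differences in seconds, one per sample."""
    diffs = []
    local = None
    for line in lines:
        lower = line.strip().lower()
        if lower.startswith("the current time"):
            local = to_seconds(line)
        elif lower.startswith("current_time") and local is not None:
            server = to_seconds(line)
            if server is not None:
                diffs.append(server - local)
            local = None  # wait for the next workstation line
    return diffs

# Invented sample in the assumed out.txt layout.
sample = """----
The current time is: 12:09:10.45
current_time(): 12:09:11
----
The current time is: 12:09:16.51
current_time(): 12:09:20
""".splitlines()

diffs = time_differences(sample)
print(diffs, "max:", max(diffs))
```

To use it on a real run, replace the sample with the lines read from out.txt, for example open("out.txt").read().splitlines().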
[29 Jul 2010 8:56] Jeremy Bell
I uploaded a 18.7MB file using pscp. It started off at around 250kB/s, and then settled at around 28kB/s for the duration of the transfer. I didn't time it, but it took around 10 minutes.

How do I usually connect to our server?
A) Sometimes via PuTTY/SSH depending on what I'm doing. There is a permanent rule allowing SSH from any IP since we use private-key files.
B) But for MySQL it's mostly using WB, using the TCP over SSH feature with a private key.
C) Occasionally, I use TCP (no SSH) in WB. We have a dynamic rule in our firewall which opens up TCP port 3306 for whichever IP I last signed-in to our game from so that if I ever need a TCP connection it's open wherever I am. This particular bug happens with both TCP and SSH.

Finally, I ran your cmd script for 5 minutes, after syncing both the server and my workstation with a mutual time server. As you said, most times were 1 second apart, some were 2, and there was one 3-second and one 4-second difference. Does that tell you anything? The whole time, I was getting a pretty consistent pattern: scrolling chart, no data, scrolling chart, no data, scrolling chart, etc.
[29 Jul 2010 9:00] Jeremy Bell
I decided to try using the Admin panel using a normal TCP connection (no SSH), but still using SSH for the server control obviously. And after the charts come on and start scrolling, they disappear with "no data" appearing, and then I get a C++ error. This happens every time I connect using a TCP only connection. I'll upload a screenshot shortly.
[29 Jul 2010 9:02] Jeremy Bell
Error a few seconds after opening Admin panel using TCP only connection.

Attachment: wb.png (image/png, text), 62.26 KiB.

[29 Jul 2010 10:06] Jeremy Bell
If it helps...

I tried using a different connection by disabling my broadband and tethering the 3G connection from my iPhone, same problem.

I also installed WB 5.2.25 on a different PC, same problem.
[10 May 2011 16:46] MySQL Verification Team
Could you please try version 5.2.33. Thanks.
[11 May 2011 0:37] Jeremy Bell
This is now resolved in 5.2.33
[6 Jun 2011 17:13] Armando Lopez Valencia
Closing this defect as per reported comment "[11 May 2:37] Jeremy Bell
This is now resolved in 5.2.33"