Bug #31109 client on Vista does not report 'lost connection'
Submitted: 20 Sep 2007 12:58 Modified: 3 Aug 2009 15:11
Reporter: Peter Laursen (Basic Quality Contributor) Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server: C API (client library) Severity:S3 (Non-critical)
Version:any OS:Microsoft Windows (Vista)
Assigned to: Vladislav Vaintroub CPU Architecture:Any
Tags: qc
Triage: D3 (Medium)

[20 Sep 2007 12:58] Peter Laursen
Description:
I think we at Webyog have bumped into MySQL client code issue with Vista.

We can reproduce the issue through command line 

1)       connect  through commandline to a MySQL server which is present in remote machine.

2)       Execute “select * ” query to a Large table (100.000+ rows so that it takes time to render resultset)

3)       While it is executing, disable the network connection (disconnect cable for instance).

In XP it is *working fine*   It is coming out with an error “Lost connection During query”

But in Vista it not coming out.. neither it throws up an error nor it stops executing.

It also affects our SQLyog program compiled with libmysql.dll (and probably other programs like Query Browser etc. too)

How to repeat:
start bying a heavy pair of scissors and continue as described above :-)

Suggested fix:
no idea!
[20 Sep 2007 16:11] Miguel Solorzano
Thank you for the bug report. Verified on Windows Vista 64-bit.
[18 Dec 2007 10:24] Peter Laursen
I tried with the Service Pack 1 (RC) for Vista.
http://www.microsoft.com/downloads/details.aspx?familyid=2FA54665-7716-47D9-B9B7-5CFDF7119...

Now it seems to work OK!
[18 Dec 2007 10:27] Peter Laursen
Proper error message appearing in SQLyog

Attachment: waslost.gif (image/gif, text), 17.97 KiB.

[19 Dec 2007 22:40] Iggy Galarza
Thank you for the update.  Marking 'Not a Bug' since this was resolved by an OS update.
[27 May 2008 12:30] Peter Laursen
Reopening.  Problem is back again!

Also this http://bugs.mysql.com/bug.php?id=36994 was reported now!

System: Vista 32 bit Servicepack1.
[24 Jul 2009 0:17] Vladislav Vaintroub
I think this one qualifies as OS bug rather than mysql bug. The client stucks in recv()/WSARecv() and I do not see that we do something wrong in TCP/IP.

Googling for a solution/workaround did not brought much. There are 3 hints I that could be useful.
a) set SO_KEEPALIVE option on the socket. We do that already, does not help.
b) use select() prior to recv(). Tried, does not bring anything, select() hangs in the same way as recv().
c) use timeout on socket reads. This one is actually works and possible in mysql client library user via mysql_options with  MYSQL_OPT_READ_TIMEOUT.

I propose c) as a workaround and closing this as "Won't fix".
[24 Jul 2009 2:05] Vladislav Vaintroub
FYI, Win2008 R2 (and most probably Win7 too) does not  seem to exhibit the hanging behavior and reports "Lost connection... during query"  at the very same moment I disable the network adapter (which was my test case)
[24 Jul 2009 9:25] Peter Laursen
I do not agree. Not unless all possbilities have been tried.
First of all: have you been in dialogue with Microsoft about this?

If you support multiple platform you will need to find solutions or workarounds for 'platform issues'. It is exactly the same as when you develop a web application where you wil have to find solutions or workarounds for 'browser issues'.
[24 Jul 2009 17:47] Vladislav Vaintroub
filed a bug for winsock problem.
https://connect.microsoft.com/WNDP/feedback/ViewFeedback.aspx?FeedbackID=476939
[24 Jul 2009 17:48] Vladislav Vaintroub
The workaround is available and I described it previously : use read timeout parameter in mysql client library.
[24 Jul 2009 18:15] Peter Laursen
OK .. finally! But shouldn't it have been 20 months ago?
[24 Jul 2009 18:20] Peter Laursen
but we will check read timeot in mysql_options.
[24 Jul 2009 19:53] Peter Laursen
Well .. also on Windows you may create a 'mysql' user and assign what privileges you want to him, I think. Actually Cygwin SSHD 1.7 does something like that (I think user is named 'privileged server'). Cygwin 1.5 SSHD service won't start on Windows after Vistas SP1/SP2 (not sure) but 1.7 runs fine on my Win7 system

I agree that porting Unix program to Windows has some problems - in particular after Vista+.  MySQL is not the only one. Most notably Apache has exactly the same problems on Windows (SYSTEM privileges, configuration stored in 'Program Files').

Basically I believe Unix vendors providing Windows ports have not understood that with recent Windows fine grained privileges control is possible - almost like on Unix.  MySQL understanding of Windows is based on W98 and NT4.

However this bug report was primarily about the confusion occuring if UAC is ON and configuration changes have no effect if not configuration is edited with explicit admin privileges.  For that reason I think that my.ini should be in same position as /datadir.
[24 Jul 2009 19:54] Peter Laursen
oops .. please ignore my last post.  I psoted the wrong place.
[24 Jul 2009 22:39] Miguel Solorzano
@Peter,

Hi, You wrote:

"[24 Jul 20:20] Peter Laursen 
but we will check read timeot in mysql_options."

Any feedback about results?. Thanks in advance.
[25 Jul 2009 7:43] Peter Laursen
I am not a programmer myself.  I forwarded to the SQLyog developer team that they should look into it. They will probably do by beginning of next week.

We use the connect_options API already so if it is so simple that we can add a parameter and thus set read timeout in the client to 5 seconds or something like that it will probably be a usable workaround.
[30 Jul 2009 9:48] Peter Laursen
Yes it seems to work. 

With MYSQL_OPT_READ_TIMEOUT = 2 sec 'lost connection is reported by the client 5+ seconds after connection was physically lost on Vista.

I was afraid that executing long-running queries would not return result with the setting.  But result come OK.
[30 Jul 2009 10:55] Vladislav Vaintroub
Not sure that timeout of 2 could be generally used. For _long_ running queries like "select sleep(10)" it is of course not big enough.

After some research I think that my prior claim that not detecting network disconnects would be an OS/software error is wrong. It only shows my lack of understanding how TCP works;) See for example discussion here http://forums.sun.com/thread.jspa?threadID=539297&start=0&tstart=0
[3 Aug 2009 15:11] Vladislav Vaintroub
Closing the bug with "Won't fix".
Reasons:
1) there is already an acceptable workaround with read timeout.
2) we cannot influence how TCP works (TCP implementations are not required to detect disconnects immediately).
3) I opened a feature request for C API "Bug#46531 support keepalive timeout option ". This could help detecting broken connections earlier.