Bug #4361 unclosed sockets?
Submitted: 1 Jul 2004 11:30 Modified: 13 Aug 2004 1:26
Reporter: David Turnbull
Status: Not a Bug
Category: MySQL Server Severity: S3 (Non-critical)
Version: 4.0.12 OS: Linux (RH Linux 7.3)
Assigned to: CPU Architecture: Any

[1 Jul 2004 11:30] David Turnbull
Description:
"client" is a RH 7.3 box with mysql 4.0.12 on it, kernel 2.4.20. Its mysqld is active with an average of 1.6 queries per second for the last 2 days.

Every now and then, all new connections from "client" to "server" (an RH 7.3 box running mysql 4.0.18, kernel 2.4.20, 2.4 queries/sec) fail with "ERROR: 2013 Lost connection to MySQL server during query" after trying to connect for 10 seconds or so.

Other machines can connect fine, so I am inclined to believe it's a problem on this machine only, and then only with mysql connecting out.

netstat -an | grep 3306 shows hundreds (200-300) of sockets, mostly in the state LAST_ACK.
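
To turn a listing like this into numbers, the netstat output can be tallied by TCP state. The following is only a sketch: the function name count_states is made up for illustration, and it assumes the usual Linux "netstat -an" column order (proto, Recv-Q, Send-Q, local address, foreign address, state).

```python
from collections import Counter


def count_states(netstat_output, port=3306):
    """Tally TCP states for sockets whose local or remote port matches."""
    counts = Counter()
    suffix = ":%d" % port
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto, Recv-Q, Send-Q, local, remote, state.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if local.endswith(suffix) or remote.endswith(suffix):
            counts[state] += 1
    return counts
```

Feeding it the text captured from "netstat -an" gives one count per state, which makes it easy to watch whether the LAST_ACK total grows or drains over time.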

I am not sure if all these sockets are the cause or effect of some problem that seems to block outgoing mysql connections to that other host.

Both machines are configured thusly:

[mysqld]
set-variable=max_allowed_packet=16M
set-variable=thread_stack=256k
skip-name-resolve
wait_timeout=150
thread_cache_size=40

Looking at the processlist on "client" and "server" with mysqladmin shows only 5-10 threads, and they have been cached so that only 50-100 have ever been created.

How to repeat:
Connect from one machine with mysql 4.0.12 to another with 4.0.18, do some queries, mysql_close(); exit, repeat a few hundred thousand times.
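
The loop described above might look roughly like the sketch below. It is not the reporter's actual client code: hammer and demo_connect are hypothetical names, the connection is assumed to follow the Python DB-API 2.0 shape (cursor(), execute(), close()), and sqlite3 appears only as a stand-in so the sketch runs without a MySQL server.

```python
import sqlite3


def hammer(connect, iterations, query="SELECT 1"):
    """Open a connection, run one query, close it, and repeat."""
    completed = 0
    for _ in range(iterations):
        conn = connect()
        try:
            cur = conn.cursor()
            cur.execute(query)
            cur.fetchall()
            cur.close()
        finally:
            # Counterpart of mysql_close(): always release the socket.
            conn.close()
        completed += 1
    return completed


def demo_connect():
    """Stand-in connection factory; swap in a real MySQL driver's connect."""
    return sqlite3.connect(":memory:")
```

To exercise a real server, pass a factory that opens a connection to "server" in place of demo_connect and raise iterations to the few hundred thousand mentioned above.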

Suggested fix:
What has worked is shutting mysql down on "server", waiting for all the sockets on "client" to disappear, then waiting 5 minutes or so, after which it starts working again.
[15 Jul 2004 3:44] David Turnbull
Still happens irregularly.

on "client":
tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN
tcp        1     44 client:3306       server:53338      CLOSING
tcp        0     44 client:3306       server:54811      FIN_WAIT1
tcp        1     44 client:3306       server:52891      CLOSING
tcp        1     44 client:3306       server:53849      CLOSING
tcp        0     23 client:46133      server:3306       LAST_ACK
tcp        0     23 client:46164      server:3306       LAST_ACK
tcp        0     23 client:46167      server:3306       LAST_ACK
tcp        0     23 client:46197      server:3306       LAST_ACK
tcp        0     23 client:46210      server:3306       LAST_ACK
tcp        0     23 client:46218      server:3306       LAST_ACK
tcp        0     23 client:46233      server:3306       LAST_ACK
tcp        0     23 client:46258      server:3306       LAST_ACK
tcp        0     23 client:46264      server:3306       LAST_ACK
tcp        0     23 client:46275      server:3306       LAST_ACK
tcp        0     23 client:46281      server:3306       LAST_ACK
tcp        0     23 client:46288      server:3306       LAST_ACK
tcp        0     23 client:46324      server:3306       LAST_ACK
tcp        0     23 client:46326      server:3306       LAST_ACK
tcp        0     23 client:46340      server:3306       LAST_ACK
tcp        0     23 client:46343      server:3306       LAST_ACK
tcp        0     23 client:46368      server:3306       LAST_ACK
tcp        0     23 client:46389      server:3306       LAST_ACK
tcp        0     23 client:46403      server:3306       LAST_ACK
tcp        0     23 client:46420      server:3306       LAST_ACK
tcp        0     23 client:46439      server:3306       LAST_ACK
tcp        0     23 client:46434      server:3306       LAST_ACK

And so on for about 600 lines.
There are only 3-10 threads on both boxes.
[15 Jul 2004 3:58] David Turnbull
Updated status to non-critical and low priority.
It seems to be a network-congestion-related problem.
[13 Aug 2004 1:26] Hartmut Holzgraefe
Sockets in FIN_WAIT or LAST_ACK states indicate network
problems. These states are beyond an application's control.
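
As background for the statement above: in the TCP state machine (RFC 793), FIN_WAIT1, CLOSING and LAST_ACK all mean the local side has already sent its FIN and is waiting for the peer to acknowledge it, so the application has finished with the socket and only the kernel can retire it. A small sketch of that check follows. The names AWAITING_ACK_OF_OWN_FIN and stuck_on_peer_ack are invented for illustration, and it assumes Send-Q is read the way the Linux netstat man page describes it (bytes not yet acknowledged by the remote host).

```python
# States in which this host has sent its FIN and the peer's ACK for it
# has not arrived yet (per the RFC 793 state machine).
AWAITING_ACK_OF_OWN_FIN = {"FIN_WAIT1", "CLOSING", "LAST_ACK"}


def stuck_on_peer_ack(state, send_q):
    """True if the socket is closing and still has unacknowledged bytes."""
    return state in AWAITING_ACK_OF_OWN_FIN and send_q > 0
```

Every LAST_ACK, CLOSING and FIN_WAIT1 line in the listing above has a nonzero Send-Q (23 or 44), so all of them would satisfy this check.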