Bug #52633 When building with -DNO_ALARM on Linux, wait_timeout is effective 10x longer
Submitted: 6 Apr 2010 20:49 Modified: 30 Jun 2011 22:15
Reporter: Ryan Mack Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S2 (Serious)
Version:5.1.45, 5.1 bzr OS:Linux (CentOS 5)
Assigned to: Paul DuBois CPU Architecture:Any
Tags: Contribution

[6 Apr 2010 20:49] Ryan Mack
Description:
In order to avoid mutex contention on the alarm queue, we switched to compiling on Linux with -DNO_ALARM.  This changes the behavior of the socket code to use SO_RCVTIMEO/SO_SNDTIMEO to drop idle connections after wait_timeout seconds.  In my initial tests it appeared that specifying a wait_timeout value of 2 seconds led to connections timing out after 20 seconds.  This looks to be a result of socket timeouts being treated as an error that should be retried up to the retry limit, in this case 10 times.

How to repeat:
Compile with -DNO_ALARM.  Set wait_timeout (or interactive_timeout) to 2 seconds.  Connect from two connections and watch the process list on one of them.  Note the other thread does not have its idle connection dropped for 20 seconds.

Suggested fix:
Change the socket IO retry logic to not treat SOCKET_EAGAIN as a retry-able error when compiling with -DNO_ALARM.  Here's the diff we're running with now:

--- a/sql/net_serv.cc
+++ b/sql/net_serv.cc
@@ -859,8 +859,24 @@ my_real_read(NET *net, size_t *complen)
            }
          }
 #endif /* (!defined(__WIN__) || defined(MYSQL_SERVER) */
-         if (thr_alarm_in_use(&alarmed) && !thr_got_alarm(&alarmed) &&
-             interrupted)
+         if (thr_alarm_in_use(&alarmed) &&
+#ifndef NO_ALARM
+             !thr_got_alarm(&alarmed) && interrupted
+#else
+              /* In the original behavior 'interrupted' is true if the socket
+                 error was EAGAIN, EINTR, or EWOULDBLOCK.  If this was the
+                 result of a timeout we would avoid retrying because
+                 thr_got_alarm was set.  When compiling with -DNO_ALARM,
+                 the_got_alarm is always false so we must treat EAGAIN and
+                 EWOULDBLOCK as an indication of a socket timeout and not
+                 retry.  We only retry in the case of EINTR, which may have
+                 been caused by a user-generated signal.  KILL CONNECTION is
+                 supported by compiling with -DSIGNAL_WITH_VIO_CLOSE which will
+                 interrupt the read by closing the socket instead of setting
+                 the now nonexistent timer to expire. */
+             vio_errno(net->vio) == SOCKET_EINTR
+#endif
+            )
          {                                     /* Probably in MIT threads */
            if (retry_count++ < net->retry_count)
              continue;
[7 Apr 2010 9:45] Sveta Smirnova
Thank you for the report.

Verified as described, although in my case it is 11x
[9 Apr 2010 17:17] Omer Barnir
This is the intended behavior: must fail once then it starts the retry countdown. (that's why 11x on the last comment)
[9 Apr 2010 17:28] Ryan Mack
I'm unclear why this isn't a bug.  When compiling the current code with -DNO_ALARM if I set wait_timeout to 10 seconds, the connections are kept open for 110 seconds.  If we are relying on the timeouts to disconnect idle connections after the specified time, I don't think it makes sense to retry the read when it times out.  An alternative fix to the one I provided would be to scale the timeout value down by a factor of 11.

Thank you, Ryan
[19 Apr 2010 20:14] Ryan Mack
Reopening because I still think this is a bug as described in my last comment.

Thank you,

Ryan
[23 Apr 2010 0:05] Mark Callaghan
The existing behavior is already confusing. Retrying the read multiple times means the per-read timeout is a multiple of the timeout specified by the user.
http://mysqlha.blogspot.com/2010/03/can-protocol-be-gpl.html
This confusing behavior (retry 3 times, timeout is 3X what you set) is documented and apparently here to stay.

Retrying 11 times is not documented. It is also silly. We should fix this now as MySQL 5.5 builds with -DNO_ALARM by default.
[23 Apr 2010 8:05] Sveta Smirnova
Set bug status to "Documentation", because I got explanation why server behaves in such a way.
[23 Apr 2010 14:57] Ryan Mack
Would you mind explaining why?

Thanks, Ryan
[23 Apr 2010 16:17] Mark Callaghan
This problem might have been fixed in 5.5.4-m3.

In 5.5.4, vio_should_retry() is defined as:

my_bool
vio_should_retry(Vio * vio)
{
  int en = socket_errno;
  /*
    man 2 read write
      EAGAIN or EWOULDBLOCK when a socket is a non-blocking mode means
      that the read/write would block.
    man 7 socket
      EAGAIN or EWOULDBLOCK when a socket is in a blocking mode means
      that the corresponding receiving or sending timeout was reached.
  */
  return  en == SOCKET_EINTR ||
          (!vio_is_blocking(vio) &&
            (en == SOCKET_EAGAIN || en == SOCKET_EWOULDBLOCK));
}

And in 5.1.45 it is

my_bool
vio_should_retry(Vio * vio __attribute__((unused)))
{
  int en = socket_errno;
  return (en == SOCKET_EAGAIN || en == SOCKET_EINTR ||
          en == SOCKET_EWOULDBLOCK);
}

------------------

This has a big effect on the start of loop in my_real_read. For the code below, when the client-side read timeout expires in 5.5.4-m3, vio_should_retry() returns 0 when there is a read timeout and vio_read() returned -1.

But in MySQL 5.1.45, vio_should_retry() returns -1 on a timeout and we then run into the code that Ryan changed to fix things. 

Maybe we need to fix vio_should_retry in 5.1?

    pos = net->buff + net->where_b;             /* net->packet -4 */
    for (i=0 ; i < 2 ; i++)
    { 
      while (remain > 0)
      { 
        /* First read is done with non blocking mode */
        if ((long) (length= vio_read(net->vio, pos, remain)) <= 0L)
        { 
          my_bool interrupted = vio_should_retry(net->vio);
[26 Apr 2010 20:20] Omer Barnir
Issue is fixed in 5.5. Will not be fixed in 5.1 as can have side effects
[30 Jun 2011 22:15] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

Added this explanation to the net_retry_count description for the 5.0/5.1 manuals (5.5 and up are not affected) and referred to it from the net_read_timeout and net_write_timeout descriptions:

On Linux, the NO_ALARM build flag (-DNO_ALARM) modifies how the 
binary treats both net_read_timeout and net_write_timeout. With this
flag enabled, neither timer cancels the current statement until after
the failing connection has been waited on an additional
net_retry_count times. This means that the effective timeout value
becomes (timeout setting) * (net_retry_count+1).