| Bug #37267 | connect() EINPROGRESS failures mishandled in client library | ||
|---|---|---|---|
| Submitted: | 8 Jun 2008 1:41 | Modified: | 4 May 2010 13:16 | 
| Reporter: | Kurtis Rader | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: C API (client library) | Severity: | S3 (Non-critical) | 
| Version: | 5.0.51b, 5.0, 5.1, 6.0 bzr | OS: | Linux | 
| Assigned to: | Ingo Strüwing | CPU Architecture: | Any | 
| Tags: | non-blocking connect einprogress poll | ||
   [11 Jun 2008 12:03]
   Susanne Ebrecht        
  Many thanks for writing a bug report. Please, provide your C code.
   [11 Jun 2008 18:31]
   Kurtis Rader        
  The original problem was observed using the Python bindings and involves confidential code. But based on strace output the following program simulates the the behavior of the MySQL client library when compiled with "#define BROKEN 1". When BROKEN is set to zero the suggested fix is used. Simply save as test.c, "make test", "./test". When recreating the problem you should see "read: Connection refused" printed which indicates the read() call returned -1 with errno == ECONNREFUSED. Note the poll() results in that case which show POLLIN is set but so is POLLHUP and POLLERR.
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <memory.h>
#include <errno.h>
#include <sys/fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/poll.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define TIMEOUT_SEC 5
#define BROKEN 1
int client() {
        int fd, flags, res;
        in_addr_t ip_addr;
        struct sockaddr_in sock_addr;
        struct pollfd ufds;
        memset(&sock_addr, sizeof(sock_addr), 0);
        sock_addr.sin_family = AF_INET;
        sock_addr.sin_port = htons(65123);
        ip_addr = inet_addr("127.0.0.1");
        memcpy(&sock_addr.sin_addr, &ip_addr, sizeof(ip_addr));
        usleep(100000);
        fd = socket(AF_INET, SOCK_STREAM, 0);
        printf("client socket fd=%d\n", fd);
connect_again:
        flags = fcntl(fd, F_GETFL, 0);
        fcntl(fd, F_SETFL, flags | O_NONBLOCK);
        printf("client fd set nonblocking\n");
        res = connect(fd, (struct sockaddr *)&sock_addr, sizeof(sock_addr));
        if (res == 0) {
                printf("connect succeeded\n");
                return;
        }
        if (errno == ECONNREFUSED) {
                usleep(100000);
                goto connect_again;
        }
        if (errno != EINPROGRESS) {
                perror("connect");
                return;
        }
        printf("client fd set blocking\n");
        fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
        ufds.fd = fd;
#if BROKEN
        ufds.events = POLLIN | POLLPRI;
#else
        ufds.events = POLLOUT;
#endif
        printf("client connect in progress -- polling\n");
        res = poll(&ufds, 1, TIMEOUT_SEC * 1000);
        printf("poll() res=%d  revents=0x%x  "
               "POLLHUP=0x%x POLLERR=0x%x POLLIN=0x%x  POLLOUT=0x%x\n",
                res, ufds.revents,
                ufds.revents & POLLHUP, ufds.revents & POLLERR,
                ufds.revents & POLLIN, ufds.revents & POLLOUT);
        if (res == 0) {
                printf("client poll() timed out\n");
                return;
        }
#if BROKEN
        if (!(ufds.revents & POLLIN)) return;
#else
        if (ufds.revents & POLLHUP) goto connect_again;
        if (!(ufds.revents & POLLOUT)) {
                printf("client poll() did not return POLLOUT as expected\n");
                return;
        }
#endif
        printf("client calling read\n");
        res = read(fd, &flags, 1);
        if (res == -1) {
                perror("read");
                return;
        }
        printf("client read=%d\n", res);
        printf("client returning\n");
}
int server() {
        int fd, new_fd, opt;
        socklen_t addrlen;
        in_addr_t ip_addr;
        struct sockaddr_in sock_addr;
        memset(&sock_addr, sizeof(sock_addr), 0);
        sock_addr.sin_family = AF_INET;
        sock_addr.sin_port = htons(65123);
        ip_addr = inet_addr("127.0.0.1");
        memcpy(&sock_addr.sin_addr, &ip_addr, sizeof(ip_addr));
        fd = socket(AF_INET, SOCK_STREAM, 0);
        printf("server socket fd=%d\n", fd);
        opt = 1;
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
        if (bind(fd, (struct sockaddr *)&sock_addr, sizeof(sock_addr)) != 0) {
                perror("bind");
                return;
        }
        printf("server bound -- sleeping for 1 second\n");
        sleep(1);
        printf("server calling listen\n");
        if (listen(fd, 1)) {
                perror("listen");
                return;
        }
        printf("server calling accept\n");
        addrlen = sizeof(sock_addr);
        new_fd = accept(fd, (struct sockaddr *)&sock_addr, &addrlen);
        if (new_fd == -1) {
                perror("accept");
                return;
        }
        printf("server accepted connection\n");
        write(new_fd, "x", 1);
        printf("server returning\n");
}
int main() {
        if (fork()) {
                server();
        }
        else {
                client();
        }
        exit(0);
}
 
   [23 Sep 2008 7:58]
   Sveta Smirnova        
  Thank you for the feedback. Verified using code analysis and example provided.
   [15 Sep 2009 7:53]
   Ingo Strüwing        
  I stumbled over this when the test rpl_ssl1 failed on my Ubuntu machine with IPV6 and IPV6 enabled. The slave tried to connect over IPV6 first, which failed and should have tried over IPV4 thereafter. However, since wait_for_data() returned success, the second attempt was not done as mysql_real_connect() assumed, the first connection was successful. This was 100% repeatable on my machine. I have a patch for it, which I will publish soon.
   [15 Sep 2009 13:06]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/83276 3605 Ingo Struewing 2009-09-15 Bug#37267 - connect() EINPROGRESS failures mishandled in client library We cann connect() in a non-blocking mode to be able to specify a non-standard timeout. The problem was that we did not fetch the status from the non-blocking connect(). We assumed that poll() would not return a POLLIN flag if the connect failed. But on some platforms this is not true. After a successful poll() we do now retrieve the status value from connect() with getsockopt(...SO_ERROR...). Now we do know if (and how) the connect failed. The test case for my investigation was rpl.rlp_ssl1 on an Ubuntu 9.04 x86_64 machine. Both, IPV4 and IPV6 were active. 'localhost' resolved first for IPV6 and then for IPV4. The connection over IPV6 was blocked. rpl.rlp_ssl1 timed out as it did not notice the failed connect(). The first read() failed, which was interpreted as a master crash and the connection was tried to reestablish with the same result until the retry limit was reached. With the fix, the connect() problem is immediately recognized, and the connect() is retried on the second resolution for 'localhost', which is successful. @ libmysqld/libmysqld.c Bug#37267 - connect() EINPROGRESS failures mishandled in client library Changed a DBUG print string to distinguish the two mysql_real_connect() implementations in DBUG traces. @ mysql-test/include/wait_for_slave_param.inc Bug#37267 - connect() EINPROGRESS failures mishandled in client library Made timeout value available in error message. @ mysql-test/suite/rpl/r/rpl_get_master_version_and_clock.result Bug#37267 - connect() EINPROGRESS failures mishandled in client library Fixed test result. Connect error is now detected as CR_CONN_HOST_ERROR (2003) instead of CR_SERVER_LOST (2013). @ sql-common/client.c Bug#37267 - connect() EINPROGRESS failures mishandled in client library Added retrieval of the error code from the non-blocking connect(). Added DBUG. Added comment.
   [15 Sep 2009 19:08]
   Jim Winstead        
  one small problem: in my_connect() you added a DBUG_RETURN to the windows/netware path that doesn't have a corresponding DBUG_ENTER.
   [15 Sep 2009 20:03]
   Ingo Strüwing        
  Ouch. Yes. Thanks. I will fix it.
   [28 Sep 2009 16:54]
   Vladislav Vaintroub        
  Looks fine - good to push after fixing windows/netware DBUG_RETURN
   [29 Sep 2009 15:37]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/85068 3605 Ingo Struewing 2009-09-29 Bug#37267 - connect() EINPROGRESS failures mishandled in client library We cann connect() in a non-blocking mode to be able to specify a non-standard timeout. The problem was that we did not fetch the status from the non-blocking connect(). We assumed that poll() would not return a POLLIN flag if the connect failed. But on some platforms this is not true. After a successful poll() we do now retrieve the status value from connect() with getsockopt(...SO_ERROR...). Now we do know if (and how) the connect failed. The test case for my investigation was rpl.rlp_ssl1 on an Ubuntu 9.04 x86_64 machine. Both, IPV4 and IPV6 were active. 'localhost' resolved first for IPV6 and then for IPV4. The connection over IPV6 was blocked. rpl.rlp_ssl1 timed out as it did not notice the failed connect(). The first read() failed, which was interpreted as a master crash and the connection was tried to reestablish with the same result until the retry limit was reached. With the fix, the connect() problem is immediately recognized, and the connect() is retried on the second resolution for 'localhost', which is successful. @ libmysqld/libmysqld.c Bug#37267 - connect() EINPROGRESS failures mishandled in client library Changed a DBUG print string to distinguish the two mysql_real_connect() implementations in DBUG traces. @ mysql-test/include/wait_for_slave_param.inc Bug#37267 - connect() EINPROGRESS failures mishandled in client library Made timeout value available in error message. @ mysql-test/suite/rpl/r/rpl_get_master_version_and_clock.result Bug#37267 - connect() EINPROGRESS failures mishandled in client library Fixed test result. Connect error is now detected as CR_CONN_HOST_ERROR (2003) instead of CR_SERVER_LOST (2013). @ sql-common/client.c Bug#37267 - connect() EINPROGRESS failures mishandled in client library Added retrieval of the error code from the non-blocking connect(). Added DBUG. Added comment.
   [29 Sep 2009 15:48]
   Ingo Strüwing        
  Both reviewers agreed - setting this to "patch approved". Note that this patch will not go to any production version or release canditate. The risk of this patch to break existing applications has been estimated as high.
   [30 Sep 2009 10:28]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/85172 2880 Ingo Struewing 2009-09-30 Bug#37267 - connect() EINPROGRESS failures mishandled in client library We cann connect() in a non-blocking mode to be able to specify a non-standard timeout. The problem was that we did not fetch the status from the non-blocking connect(). We assumed that poll() would not return a POLLIN flag if the connect failed. But on some platforms this is not true. After a successful poll() we do now retrieve the status value from connect() with getsockopt(...SO_ERROR...). Now we do know if (and how) the connect failed. The test case for my investigation was rpl.rlp_ssl1 on an Ubuntu 9.04 x86_64 machine. Both, IPV4 and IPV6 were active. 'localhost' resolved first for IPV6 and then for IPV4. The connection over IPV6 was blocked. rpl.rlp_ssl1 timed out as it did not notice the failed connect(). The first read() failed, which was interpreted as a master crash and the connection was tried to reestablish with the same result until the retry limit was reached. With the fix, the connect() problem is immediately recognized, and the connect() is retried on the second resolution for 'localhost', which is successful. @ libmysqld/libmysqld.c Bug#37267 - connect() EINPROGRESS failures mishandled in client library Changed a DBUG print string to distinguish the two mysql_real_connect() implementations in DBUG traces. @ mysql-test/include/wait_for_slave_param.inc Bug#37267 - connect() EINPROGRESS failures mishandled in client library Made timeout value available in error message. @ mysql-test/suite/rpl/r/rpl_get_master_version_and_clock.result Bug#37267 - connect() EINPROGRESS failures mishandled in client library Fixed test result. Connect error is now detected as CR_CONN_HOST_ERROR (2003) instead of CR_SERVER_LOST (2013). @ sql-common/client.c Bug#37267 - connect() EINPROGRESS failures mishandled in client library Added retrieval of the error code from the non-blocking connect(). Added DBUG. Added comment.
   [30 Sep 2009 12:28]
   Ingo Strüwing        
  Queued to mysql-next-mr-bugfixing and mysql-6.0-backup-merge.
   [2 Oct 2009 17:20]
   Ingo Strüwing        
  Also merged to mysql-6.0-codebase-bugfixing.
   [9 Oct 2009 8:47]
   Bugs System        
  Pushed into 6.0.14-alpha (revid:alik@ibmvm-20091009083208-0o0f0i9w1sq3c1kn) (version source revid:ingo.struewing@sun.com-20091001194257-a6xrw9u0ccx24qgn) (merge vers: 6.0.14-alpha) (pib:12)
   [13 Oct 2009 13:19]
   Paul DuBois        
  Noted in 6.0.14 changelog. The client library mishandled EINPROGRESS errors for connections in nonblocking mode. Setting report to NDI pending push into 5.5.x.
   [16 Oct 2009 6:38]
   Tonci Grgin        
  Bug#47451 was marked as duplicate of this report.
   [25 Oct 2009 13:38]
   Bugs System        
  Pushed into 6.0.14-alpha (revid:alik@sun.com-20091025133616-ca4inerav4vpdnaz) (version source revid:ingo.struewing@sun.com-20090930110144-qnqbtc3kfsr1200h) (merge vers: 6.0.14-alpha) (pib:13)
   [25 Oct 2009 20:54]
   Paul DuBois        
  Setting report to NDI pending push into 5.5.x.
   [12 Nov 2009 8:21]
   Bugs System        
  Pushed into 5.5.0-beta (revid:alik@sun.com-20091110093229-0bh5hix780cyeicl) (version source revid:alik@ibmvm-20091009130916-0ijstmmz3efzx20g) (merge vers: 5.5.0-beta) (pib:13)
   [12 Nov 2009 18:46]
   Paul DuBois        
  Noted in 5.5.0 changelog.
   [15 Apr 2010 11:38]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/105724
   [29 Apr 2010 9:43]
   Bugs System        
  Pushed into 5.1.44-ndb-7.0.15 (revid:magnus.blaudd@sun.com-20100429073554-0wswepmdyx9mibt2) (version source revid:magnus.blaudd@sun.com-20100429073554-0wswepmdyx9mibt2) (merge vers: 5.1.44-ndb-7.0.15) (pib:16)
   [4 May 2010 13:16]
   Jon Stephens        
  Also documented in the NDB-7.0.15 and NDB-7.1.4 changelogs. Amended changelog entry to read as follows:
        The MySQL client library mishandled EINPROGRESS errors for 
        connections in nonblocking mode. This could lead to replication 
        failures on hosts capable of resolving both IPv4 and IPv6 network 
        addresses, while trying to resolve localhost.
Closed.
SEEALSO BUG#44344.
 
Description: The my_connect() and wait_for_data() functions do not correctly handle non-blocking connect() calls that occur in the window between the bind() and listen() calls on the destination end-point. When that race occurs the connect() calls returns -1 with errno == EINPROGRESS. The wait_for_data() function then polls the socket for readability. Unfortunately the socket is in an error state so pollfd.revents has POLLERR, POLLHUP and POLLIN set. Since wait_for_data() only looks for POLLIN it reports success and the subsequent read() on the socket fails with errno == ECONNREFUSED. How to repeat: Use SSH to tunnel the connection. Wrap the ssh call with a script similar to this one: #!/bin/sh sleep 1 exec /usr/bin/ssh "$@" Then check whether ssh is ready to tunnel the MySQL connection by rapidly calling connect() on the local ssh end-point. When it returns success immediately call the MySQL connection function. With this setup I can hit the race greater than 99% of the time on my workstation. Without the artificial ssh startup delay and tight polling this race happens infrequently since ssh typically reaches the listen() call on the local end-point of the tunnel before the MySQL connect is attempted. But on a heavily loaded system the failed connections happen often enough (around 0.5% of the time) to be annoying. Suggested fix: There are a couple of problems. First is the use of pollfd.events == POLLIN. The correct flag is POLLOUT. The select() version of wait_for_data() gets this correct by passing NULL for the readfds parameter, and a bitmask containing the socket fd for the writefds parameter. The other problem is that this is a very low probability race condition which can be correctly handled by simply closing the socket and going back to the top of my_connect() and issuing the connect() again. But note that to be 100% correct you need to catch ECONNREFUSED on the second thru n'th attempts because that will be returned until the target calls listen(). Alternatively, it may be adequate to insert a short sleep in the expectation that by the time the second connect() is performed the end-point will have transitioned to the LISTEN state.