Bug #48929 Error in Accept() if using many file descriptors
Submitted: 20 Nov 2009 7:26 Modified: 7 Mar 2010 20:07
Reporter: Peter Zaitsev (Basic Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: General Severity:S3 (Non-critical)
Version:all OS:Linux
Assigned to: Magne Mæhre CPU Architecture:Any
Tags: qc

[20 Nov 2009 7:26] Peter Zaitsev
Description:
If running with innodb_file_per_table,  innodb_open_files=20000  (some large number) large number of tables and when crash recovery is performed (so Innodb opens these files on startup)  MySQL may be flooding error log with messages like this:

091119 23:03:34 [ERROR] Error in accept: Resource temporarily unavailable
091119 23:03:34 [ERROR] Error in accept: Resource temporarily unavailable
091119 23:03:34 [ERROR] Error in accept: Resource temporarily unavailable
091119 23:03:34 [ERROR] Error in accept: Resource temporarily unavailable 

Strace of the process shows the following:

How to repeat:
[percona@test9 msb_5_4_2]$ strace -f -p 19229
Process 19229 attached with 23 threads - interrupt to quit
[pid 19286] rt_sigtimedwait([HUP QUIT ALRM TERM TSTP],  <unfinished ...>
[pid 19285] futex(0x165962ec, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 19284] select(0, NULL, NULL, NULL, {0, 381000} <unfinished ...>
[pid 19283] select(0, NULL, NULL, NULL, {0, 551000} <unfinished ...>
[pid 19248] futex(0x1781193c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 19247] futex(0x178118bc, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 19246] futex(0x1781183c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 19245] futex(0x178117bc, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 19244] futex(0x1781173c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 19243] futex(0x178116bc, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 19242] futex(0x1781163c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 19241] futex(0x178115bc, FUTEX_WAIT_PRIVATE, 319, NULL <unfinished ...>
[pid 19240] futex(0x1781153c, FUTEX_WAIT_PRIVATE, 3, NULL <unfinished ...>
[pid 19239] futex(0x178114bc, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...>
[pid 19238] futex(0x1781143c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 19237] futex(0x178113bc, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 19236] futex(0x1781133c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 19235] futex(0x178112bc, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...>
[pid 19234] futex(0x1781123c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 19233] futex(0x178111bc, FUTEX_WAIT_PRIVATE, 207, NULL <unfinished ...>
[pid 19232] futex(0x1781113c, FUTEX_WAIT_PRIVATE, 165, NULL <unfinished ...>
[pid 19231] futex(0x178110bc, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] fcntl(16392, F_SETFL, O_RDWR) = 0
[pid 19229] fcntl(16392, F_SETFL, O_RDWR) = 0
[pid 19229] select(16394, [1025 1040 1042 1044 1050 1051 1053 1056 1057 1058 1059 1060 1061 1062 1080 1153 1169 1171 1172 1174 1175 1176 1177 1178 1181 1184 1185 1186 1187 1188 1189 1190 1208 1380 1414 1415 1416 1420 1421 1422 1426 1427 1429 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1479 1480 1484 1485 1486 1490 1491 1493 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1536 1670 1671 1672 1676 1677 1678 1682 1683 1685 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 ....

Suggested fix:
I have a feeling it has something to do with using relatively high number for socket file descriptor with select and accept. 

This is CentOS 5.3
[20 Nov 2009 8:32] Kristian Nielsen
The problem here is that one of the server sockets ends up with file descriptor 16393, which is bigger than FD_SETSIZE. This causes the code to pass random memory into the select() call.

The fix would seem to be to use poll() instead of select() in handle_connections_sockets() in sql/mysqld.cc.
[23 Nov 2009 10:47] Valeriy Kravchuk
I failed to repeat. I've started server, 5.1.42-debug (after setting ulimit -n 40000) as follows:

suse:/home2/openxs/dbs/5.1 # bin/mysqld_safe --user=openxs --innodb_file_per_table --innodb_open_files=20000 &

then I executed the following shell script to create 20000 InnoDB tables:

openxs@suse:/home2/openxs/dbs/5.1> i=0; while [ $i -le 20000 ]; do echo $i; bin/mysql -uroot test -e "create table tii$i(c1 int primary key, c2 int) engine=InnoDB"; let i=i+1; done 

and after it finished killed -9 both mysqld and mysqld_safe. Upon restart I've got only these messages in the error log:

100228 15:36:48 mysqld_safe Starting mysqld daemon with databases from /home2/openxs/dbs/5.1/var
100228 15:36:48 [Note] Plugin 'FEDERATED' is disabled.
100228 15:36:48 [Note] Plugin 'ndbcluster' is disabled.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
100228 15:36:49  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
100228 15:39:02  InnoDB: Started; log sequence number 0 66300224
100228 15:39:03 [Note] Event Scheduler: Loaded 0 events
100228 15:39:03 [Note] /home2/openxs/dbs/5.1/libexec/mysqld: ready for connections.
Version: '5.1.42-debug'  socket: '/tmp/mysql.sock'  port: 3306  Source distribution
[23 Nov 2009 14:42] Sveta Smirnova
Peter,

thank you for the report. Could you please upload full error log file?
[23 Nov 2009 15:13] Kristian Nielsen
It is pretty clear from the code that there is a problem:

pthread_handler_t handle_connections_sockets(void *arg __attribute__((unused)))
...
  uint max_used_connection= (uint) (max(ip_sock,unix_sock)+1);
...
  FD_SET(ip_sock,&clientFDs);
  FD_SET(unix_sock,&clientFDs);
...
  if (select((int) max_used_connection,&readFDs,0,0,0) < 0)
...

There is no check for if ip_sock or unix_sock are >= FD_SETSIZE. So if they
are, the FD_SET() will overflow the buffer, and select() will be passed
undefined stack memory.

The easiest way to reproduce is probably to add something like this at the
start of network_init() (in sql/mysqld.cc):

    for (int i= 0; i < 2000; i++)
      (void)open("/dev/null", O_RDONLY);

This will force the sockets to get file descriptors outside of the range of
FD_SETSIZE (which is 1024 on Linux).
[23 Nov 2009 15:24] Lenz Grimmer
Blog post related to this bug report: http://www.mysqlperformanceblog.com/2009/11/20/rare-evil-mysql-bug/
[24 Nov 2009 7:25] Sveta Smirnova
Kristian,

thank you for the hint! I can not repeat problem on my side though.

Peter,

please also upload your configuration file.
[25 Nov 2009 6:52] Sveta Smirnova
Kristian,

have you tried your patch? I only can see error: "091125  9:39:19 [ERROR] Error in accept: Too many open files" and not original problem.
[25 Nov 2009 7:52] Kristian Nielsen
Sounds like you ran mysqld with a limit on the number of open file handles allowed, which will prevent the problem from showing.

On my system the default is 1024, so you need something like

    ulimit -n 20000

before starting the mysqld.

(But I did not try the patch).
[25 Nov 2009 11:31] Sveta Smirnova
Kristian,

thank you for the update. Problem still not repeatable for me.
[26 Nov 2009 12:32] Andrii Nikitin
From 'man fd_set':
"
An fd_set is a fixed size buffer. Executing FD_CLR() or FD_SET() with a value of fd that is negative or is equal to or larger than FD_SETSIZE will result in undefined behavior. 
"

<Undefined behavior> doesn't guarantee errors will be shown.
Sveta, could you in the testcase just put following assert into handle_connections_sockets before SD_SET:

(ip_sock < FD_SETSIZE && unix_sock < FD_SETSIZE)

and confirm that no invalid values which cause undefined behavior go to FD_SET call?
[27 Nov 2009 9:18] Sveta Smirnova
Andrii,

thank you for the hint.

I modify sql/mysqld.cc as follows:

$bzr diff
=== modified file 'sql/mysqld.cc'
--- sql/mysqld.cc       2009-11-03 00:52:57 +0000
+++ sql/mysqld.cc       2009-11-27 09:13:35 +0000
@@ -5074,6 +5074,7 @@
     for (uint retry=0; retry < MAX_ACCEPT_RETRY; retry++)
     {
       size_socket length=sizeof(struct sockaddr_in);
+         sock=1025;
       new_sock = accept(sock, my_reinterpret_cast(struct sockaddr *) (&cAddr),
                        &length);
 #ifdef __NETWARE__ 

MySQL error log spammed with

...
4893 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4894 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4895 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4896 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4897 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4898 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4899 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4900 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4901 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4902 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4903 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4904 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4905 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4906 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4907 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4908 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4909 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
4910 091127 12:08:39 [ERROR] Error in accept: Bad file descriptor
....

I believe if resource is temporarily not available due to some reason it would repeat trying to use not available resource again and again as it does with this case. So bug can be considered verified although original problem has not been repeated.
[10 Jan 2010 16:00] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/96472

2957 Magne Mahre	2010-01-10
      Bug#48929 Error in Accept() if using many file descriptors
      
      In POSIX systems, the file descriptor set used in the select(2)
      system call is represented by a bit vector, of size FD_SETSIZE.
      When select(2) is used on file/socket descriptors with a value
      that is beyond the this size, unpredictable errors may occur.
      
      In this case, the error happens when there are a large number
      of tables that need repair.  These tables are opened before
      the sockets for incoming connections are acquired, resulting
      in these sockets getting descriptor id which is higher than
      FD_SETSIZE.
      
      Replacing the call to select(2) with poll(2) fixes the problem,
      as poll takes an array of the wanted descriptors, instead of
      a bit vector.  
      
      MS Windows has a different implementation of 'select', and is not
      affected by this bug.  
     @ configure.in
        Added a test for the <poll.h> file
     @ sql/mysqld.cc
        Restructured some of the code to reduce the number of
        #ifdef's.  Removed some HP/UX 10-specific code.
[3 Feb 2010 22:17] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/99173

2981 Magne Mahre	2010-02-03
      Bug#48929 Error in Accept() if using many file descriptors
      
      In POSIX systems, the file descriptor set used in the select(2)
      system call is represented by a bit vector of size FD_SETSIZE.
      When select(2) is used on file/socket descriptors with a value
      that is beyond this size, unpredictable errors may occur.
      
      In this case, the error happens when there are a large number
      of tables that need repair.  These tables are opened before
      the sockets for incoming connections are acquired, resulting
      in these sockets getting descriptor id which is higher than
      FD_SETSIZE.
      
      Replacing the call to select(2) with poll(2) fixes the problem,
      as poll takes an array of the wanted descriptors, instead of
      a bit vector.  
      
      MS Windows has a different implementation of 'select', and is not
      affected by this bug.  
     @ configure.in
        Added a test for the <poll.h> file
     @ sql/mysqld.cc
        Restructured some of the code to reduce the number of
        #ifdef's.  Removed some HP/UX 10-specific code.
[14 Feb 2010 18:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/100308

2962 Magne Mahre	2010-02-14
      Bug#48929 Error in Accept() if using many file descriptors
      
      In POSIX systems, the file descriptor set used in the select(2)
      system call is represented by a bit vector of size FD_SETSIZE.
      When select(2) is used on file/socket descriptors with a value
      that is beyond this size, unpredictable errors may occur.
      
      In this case, the error happens when there are a large number
      of tables that need repair.  These tables are opened before
      the sockets for incoming connections are acquired, resulting
      in these sockets getting descriptor id which is higher than
      FD_SETSIZE.
      
      Replacing the call to select(2) with poll(2) fixes the problem,
      as poll takes an array of the wanted descriptors, instead of
      a bit vector.  
      
      MS Windows has a different implementation of 'select', and is not
      affected by this bug.  
     @ configure.in
        Added a test for the <poll.h> file
     @ sql/mysqld.cc
        Restructured some of the code to reduce the number of
        #ifdef's.  Removed some HP/UX 10-specific code.
[25 Feb 2010 19:47] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100225194420-p60r4u90tszj8q2x) (version source revid:magne.mahre@sun.com-20100214183042-8kst82v1rm3887e6) (merge vers: 6.0.14-alpha) (pib:16)
[25 Feb 2010 19:49] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100225194305-h49uyjrlfl3mwo60) (version source revid:magne.mahre@sun.com-20100214181258-n0wu2x683akayt3h) (pib:16)
[6 Mar 2010 10:58] Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:alik@sun.com-20100225195857-farb6yvy8x06bylj) (merge vers: 5.5.99-m3) (pib:16)
[7 Mar 2010 20:07] Paul Dubois
Noted in 5.5.3, 6.0.14 changelogs.

On POSIX systems, calls to select() with a file descriptor set larger
than FD_SETSIZE resulted in unpredictable I/O errors; for example,
when a large number of tables required repair.