Bug #34303 Client using NDBAPI crashes if client application uses many connections
Submitted: 5 Feb 2008 11:16 Modified: 2 Jun 2010 14:40
Reporter: Lukasz Osipiuk
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: mysql-5.1-telco-6.3
OS: Linux
Assigned to: Magnus Blåudd
CPU Architecture: Any
Tags: 5.1.23-ndb-6.3.7, Contribution

[5 Feb 2008 11:16] Lukasz Osipiuk
Description:
Description copied from a mailing list post in this thread:
http://lists.mysql.com/internals/35283

We are using libndbclient.so (telco branch: 5.1.23-ndb-6.3.7, compiled
under Linux 2.6 amd64) in an application that creates many file
descriptors (i.e. TCP connections). The client application crashes
(SIGSEGV with stack corruption) when we shut down one of the cluster
nodes, even though the cluster as a whole remains operational.

After analyzing the libndbclient code, we found that it uses select()
in the loop that tries to re-establish the connection to the node that
was shut down (e.g. the code in
storage/ndb/src/common/util/SocketClient.cpp).

As soon as the number of file descriptors used by our application
exceeds 1024, the Cluster client library starts making invalid calls
to select(). The file descriptor it gets from the socket() call is
>= 1024 (FD_SETSIZE), and calling select() with such an fd crashes the
client app (doc: http://linux.die.net/man/2/select).
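
For illustration, the failure mode described above comes down to a missing range check: per the select(2) man page, FD_SET() on a descriptor that is negative or >= FD_SETSIZE is undefined behavior, and with an fd_set on the stack that typically means writing past its end. A minimal sketch of such a check follows. The helper names fd_fits_in_fd_set and checked_fd_set are hypothetical and do not come from libndbclient.

```cpp
#include <sys/select.h>

// Hypothetical helper: true only if fd can legally be stored in an fd_set.
static bool fd_fits_in_fd_set(int fd) {
    return fd >= 0 && fd < FD_SETSIZE;
}

// Hypothetical helper: add fd to *set only when it is in range.
// Returns false, leaving *set untouched, instead of letting FD_SET()
// write outside the fd_set.
static bool checked_fd_set(int fd, fd_set* set) {
    if (!fd_fits_in_fd_set(fd)) {
        return false;
    }
    FD_SET(fd, set);
    return true;
}
```

A caller that gets false back cannot use select() for that descriptor at all and has to fall back to another mechanism such as poll().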

How to repeat:
Start an NDB cluster with one node down.
Start a client application which
* connects to the cluster
* opens many (>1024) sockets/files

The application should crash soon.
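
The "opens many sockets/files" step can be sketched roughly as below. This is an assumption about how a test client might occupy the low descriptor numbers, not the reporter's actual application: it opens /dev/null repeatedly until the kernel hands back a descriptor >= FD_SETSIZE, after which any later socket() call in the same process (including those made inside the client library) also gets a high descriptor. The function names are hypothetical, and the NDB API connection itself is left out.

```cpp
#include <fcntl.h>
#include <sys/resource.h>
#include <sys/select.h>
#include <unistd.h>
#include <vector>

// Try to make sure the soft RLIMIT_NOFILE is at least `wanted`.
// Returns false if the hard limit is too low or the call fails.
static bool raise_fd_limit(rlim_t wanted) {
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) return false;
    if (lim.rlim_cur >= wanted) return true;   // already high enough
    if (lim.rlim_max < wanted) return false;   // hard limit too low
    lim.rlim_cur = wanted;
    return setrlimit(RLIMIT_NOFILE, &lim) == 0;
}

// Open /dev/null until a descriptor numbered >= FD_SETSIZE comes back.
// Every descriptor is kept in `held` so the caller can close it later.
// Returns the first such descriptor, or -1 if open() fails first
// (for example because the process limit was reached).
static int occupy_fds_below_setsize(std::vector<int>& held) {
    for (;;) {
        int fd = open("/dev/null", O_RDONLY);
        if (fd < 0) return -1;
        held.push_back(fd);
        if (fd >= FD_SETSIZE) return fd;
    }
}
```

A test client would call raise_fd_limit(FD_SETSIZE + 16), then occupy_fds_below_setsize(), and only then trigger the reconnect path by connecting to the cluster with one node down.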
[5 Feb 2008 11:17] Lukasz Osipiuk
Dirty patch: use poll() instead of select() if HAVE_POLL is defined.

Attachment: poll.patch (text/x-patch), 18.86 KiB.

[8 Jun 2009 22:55] liz drachnik
Hello Lukasz - 

In order for us to continue reviewing your contribution to MySQL, we need you to review and sign the Sun|MySQL contributor agreement (the "SCA").

The process is explained here: 
http://forge.mysql.com/wiki/Sun_Contributor_Agreement

Getting a signed/approved SCA on file will help us facilitate this contribution and others in the future.

Thank you!

Liz Drachnik  - Program Manager - MySQL
[2 Oct 2009 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[1 Jun 2010 10:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109696
[1 Jun 2010 12:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109730
[1 Jun 2010 13:26] Bugs System
Pushed into 5.1.44-ndb-7.0.16 (revid:magnus.blaudd@sun.com-20100601131454-0ah9xwr9dz8tu9vv) (version source revid:magnus.blaudd@sun.com-20100601125721-e2zldsuiucpj89w3) (merge vers: 5.1.44-ndb-7.0.16) (pib:16)
[2 Jun 2010 9:38] Magnus Blåudd
Pushed to 6.3.35, 7.0.16 and 7.1.5

The patch prefers 'poll' over 'select' to avoid problems when using fds numbered FD_SETSIZE or higher. It also fixes a problem on Windows when using more than 64 sockets.
[2 Jun 2010 10:24] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109921

3208 Martin Skold	2010-06-02 [merge]
      Merge
      added:
        storage/ndb/include/portlib/ndb_socket_poller.h
      modified:
        mysql-test/t/ctype_cp932_binlog_stm.test
        storage/ndb/include/portlib/NdbTCP.h
        storage/ndb/include/transporter/TransporterRegistry.hpp
        storage/ndb/include/util/SocketClient.hpp
        storage/ndb/include/util/SocketServer.hpp
        storage/ndb/src/common/transporter/TCP_Transporter.cpp
        storage/ndb/src/common/transporter/TCP_Transporter.hpp
        storage/ndb/src/common/transporter/Transporter.cpp
        storage/ndb/src/common/transporter/TransporterRegistry.cpp
        storage/ndb/src/common/util/SocketClient.cpp
        storage/ndb/src/common/util/SocketServer.cpp
        storage/ndb/src/common/util/socket_io.cpp
        storage/ndb/src/kernel/blocks/dbtup/DbtupTrigger.cpp
        storage/ndb/src/kernel/vm/SimulatedBlock.cpp
        storage/ndb/src/kernel/vm/VMSignal.hpp
        storage/ndb/src/mgmapi/mgmapi.cpp
[2 Jun 2010 14:40] Jon Stephens
Documented bugfix in the NDB-6.3.35, 7.0.16, and 7.1.5 changelogs as follows:

      An excessive number of client connections, such that more than 1024 
      file descriptors, sockets, or both were open, caused NDB API 
      applications to crash.

Closed.