Bug #7928 out of connection objects
Submitted: 15 Jan 2005 15:25 Modified: 20 Jan 2005 11:45
Reporter: frank homann
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine  Severity: S3 (Non-critical)
Version: 4.1.9  OS: Linux (linux-2.6.10)
Assigned to: Jonas Oreland  CPU Architecture: Any

[15 Jan 2005 15:25] frank homann
Description:
we are testing a 3-node MySQL cluster:

Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2 (not connected, accepting connect from 10.0.91.119)
id=3    @10.0.91.120  (Version: 4.1.9, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.91.118  (Version: 4.1.9)

[mysqld(API)]   3 node(s)
id=4    @10.0.91.120  (Version: 4.1.9)
id=5    @10.0.91.119  (Version: 4.1.9)
id=6    @10.0.91.118  (Version: 4.1.9)

with a phpadsnew installation and after ~300k banner impressions the cluster stops working with this error:

ERROR 1297 (HY000): Got temporary error 4006 'Connect failure - out of connection objects (increase MaxNoOfConcurrentTransactions)' from ndbcluster

and it's recoverable only by restarting the ndbd nodes...

during the tests there are ~400 queries/second on every node and the server load stays below 5.

we've tried increasing both MaxNoOfConcurrentOperations and MaxNoOfConcurrentTransactions up to 1258291 and 157286 respectively (above these values ndbd doesn't come up)

as suggested by Mikael Ronström (mikael@mysql.com) we have even tried to run ndbd with the --skip-ndb-optimized-node-selection command line option, but with the same result...

even setting TransactionInactiveTimeout = 5000 doesn't solve the issue...

thank you!

How to repeat:
install phpadsnew with a mysql cluster as backend and stress it with ApacheBench

configuration:

[NDBD DEFAULT]
NoOfReplicas = 2
[MYSQLD DEFAULT]
[NDB_MGMD DEFAULT]
[TCP DEFAULT]
# Management Server
[NDB_MGMD]
HostName = 10.0.91.118            # the IP of THIS SERVER
# Storage Engines
[NDBD]
HostName = 10.0.91.119            # the IP of the FIRST SERVER
DataDir = /var/lib/mysql
MaxNoOfConcurrentOperations = 1258291
MaxNoOfConcurrentTransactions = 157286
TransactionInactiveTimeout = 5000
[NDBD]
HostName = 10.0.91.120            # the IP of the SECOND SERVER
DataDir = /var/lib/mysql
MaxNoOfConcurrentOperations = 1258291
MaxNoOfConcurrentTransactions = 157286
TransactionInactiveTimeout = 5000
# 3 MySQL Clients
[MYSQLD]
[MYSQLD]
[MYSQLD]
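
The configuration above repeats the same limits in each [NDBD] section. As a small sanity check, the following Python sketch reads a config.ini-style text and reports any parameter whose value differs between the data node sections. The helper names `parse_sections` and `mismatched_params` are made up for this illustration and are not part of any MySQL tool; the parsing is deliberately simplified (it treats `#` as the only comment marker), so take it as a rough sketch rather than a faithful reader of the real file format.

```python
def parse_sections(text):
    """Parse config.ini-style text into an ordered list of (section, params).

    Repeated section names such as [NDBD] are kept as separate entries,
    which the standard configparser module does not do by default.
    """
    sections = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1].strip().upper(), {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def mismatched_params(sections, name, keys):
    """Return the keys whose values differ between sections called `name`."""
    nodes = [params for sec, params in sections if sec == name]
    return [k for k in keys if len({n.get(k) for n in nodes}) > 1]


# Excerpt of the two data node sections from the report.
CONFIG = """
[NDBD]
HostName = 10.0.91.119            # the IP of the FIRST SERVER
MaxNoOfConcurrentOperations = 1258291
MaxNoOfConcurrentTransactions = 157286
[NDBD]
HostName = 10.0.91.120            # the IP of the SECOND SERVER
MaxNoOfConcurrentOperations = 1258291
MaxNoOfConcurrentTransactions = 157286
"""

LIMITS = ["MaxNoOfConcurrentOperations", "MaxNoOfConcurrentTransactions"]
print(mismatched_params(parse_sections(CONFIG), "NDBD", LIMITS))  # prints []
```

An empty list means both data nodes carry the same limits. Values shared by all data nodes can also be set once under [NDBD DEFAULT], which avoids this kind of drift.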
[15 Jan 2005 15:31] frank homann
sorry, the ndb_mgm output submitted above shows a disconnected ndbd node, but i always verify that i get output like this:

Connected to Management Server at: 10.0.91.118:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @10.0.91.119  (Version: 4.1.9, Nodegroup: 0, Master)
id=3    @10.0.91.120  (Version: 4.1.9, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.91.118  (Version: 4.1.9)

[mysqld(API)]   3 node(s)
id=4    @10.0.91.118  (Version: 4.1.9)
id=5    @10.0.91.119  (Version: 4.1.9)
id=6    @10.0.91.120  (Version: 4.1.9)

before starting any test...
[17 Jan 2005 20:56] Jonas Oreland
Hi,

Could you please provide the queries, together with the data, that trigger this bug?

Or is it possible to log in to your site and look for the error there?
[17 Jan 2005 21:01] Martin Skold
Very little info to go on, not really a
reproducible test case. Until we have
that we will try and do a code analysis
to see if there is any possible scenario
where a correctly behaving application
can encounter a resource leak.

Note that a misbehaving application
that does not explicitly commit or abort
transactions will leave them allocated
and might use up all connection resources.
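
To illustrate the point about explicitly ending transactions, here is a minimal Python sketch of a wrapper that always finishes a transaction with either a commit or a rollback. It uses the standard library `sqlite3` module purely as a runnable stand-in; it does not talk to NDB, and the `banners` table, `transaction` helper, and `record_impression` function are hypothetical names chosen for this example, not taken from phpAdsNew.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Yield a cursor and guarantee the transaction ends in COMMIT or ROLLBACK."""
    cur = conn.cursor()
    try:
        yield cur
        conn.commit()    # success path: end the transaction explicitly
    except Exception:
        conn.rollback()  # failure path: abort instead of leaving it open
        raise
    finally:
        cur.close()


def record_impression(conn, banner_id):
    # Hypothetical workload: count one banner impression per call.
    with transaction(conn) as cur:
        cur.execute(
            "UPDATE banners SET impressions = impressions + 1 WHERE id = ?",
            (banner_id,),
        )


def make_demo_db():
    # In-memory stand-in database with a single banner row.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE banners (id INTEGER PRIMARY KEY, impressions INTEGER)")
    conn.execute("INSERT INTO banners VALUES (1, 0)")
    conn.commit()
    return conn
```

A caller would run `conn = make_demo_db()` followed by `record_impression(conn, 1)`; afterwards `conn.in_transaction` is false, whether the update succeeded or raised. The same commit-or-rollback discipline is what keeps an application from holding transaction records open on the server side.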
[20 Jan 2005 10:38] Jonas Oreland
I got a testcase from a different bug report.
[20 Jan 2005 10:51] frank homann
hi!

sorry for keeping you waiting...

i can't give you an account on our cluster servers because they've gone into production (yes, our clients are in a great hurry)
if you want, i can recreate the test environment on a couple of spare servers and give you an account on them...
i noticed that the bug has changed status from "need feedback" to "patch pending", so should i assume that you have already found the problem behind this issue?

best regards
[20 Jan 2005 11:45] Jonas Oreland
Yes I found and fixed the problem (at least one with similar symptoms)
ChangeSet@1.2148.1.1

It would be nice if you also could retest.
And if it doesn't work, please add a testcase.

/Jonas
[20 Jan 2005 11:49] frank homann
sorry for the probably dumb question, but where can i find the patch?
[18 Nov 2005 5:52] jun wu
i hit a similar bug in MySQL Cluster 5.0.15. i think this is basic functionality, but it can't be used.