Bug #36194 Eliminate pinging with connection pooling to get massive performance improvement
Submitted: 18 Apr 2008 3:15 Modified: 11 Nov 11:08
Reporter: Maxim Mass
Status: To be fixed later
Category:Connector/Net Severity:S5 (Performance)
Version: OS:Any
Assigned to: Target Version:
Triage: D2 (Serious)

[18 Apr 2008 3:15] Maxim Mass
Description:
When connection pooling is enabled, the connector performs a full round-trip ping to the
server for every request. This is done to ensure that the server is still running and
that the checked out driver is still able to communicate with the server. Unfortunately
this ping often times doubles the total query execution time  for simple or query-cached
queries.

How to repeat:
A simple app that hits a database in a loop with connection pooling enabled.
Ensure that mysql.data.dll is compiled with symbols and attach a profiler. You'll see
that pinging takes significant % of the query time.

Suggested fix:
Remove pinging and instead handle fatal socket exceptions in pooled connections by
removing the connection from the pool and retrying on a new connection. If the socket
exception still happens then bubble it out to the client.

I will attach a patch that we've been using successfully with MindTouch Deki Wiki for the
past 2 months shortly.
[18 Apr 2008 20:24] Maxim Mass
Patch for 5.1.4 to safely remove pinging

Attachment: mysql.data.dll 5.14 no pinging with connection pool.txt (text/plain), 6.11 KiB.

[18 Apr 2008 20:30] Maxim Mass
Please note that the attached patch may depend on bug #34001. There may be a cleaner way
to accomplish this but this approach has worked for us and our users without issue. You
can grab our connector with this and other patches from
https://svn.mindtouch.com/source/public/dekiwiki/trunk/src/redist/MySql.Data.dll to
compare performance and stability.

Thank you,
Max
[21 Apr 2008 10:27] Tonci Grgin
Hi Maxim and thanks for (both) report(s).

I can verify them with no problem but:
> Remove pinging and instead handle fatal socket exceptions in pooled connections by
removing the connection from the pool and retrying on a new connection. If the socket
exception still happens then bubble it out to the client.

AFAIK, socket exception will take some time to raise and we might hit some other problems
doing things this way. What if load-balancing comes in picture? What if you have mess in
host table and so on... This is just from the top of my head...

Let's see what Reggie has to say about this problem.
[21 Apr 2008 21:29] Maxim Mass
Tonci, thanks for the response and I do look forward to hearing Reggie's thoughts on this
as well. I do understand your concerns as this is a relatively fundamental change but I
do believe the performance gains warrant seeing this through to a working solution. 

I am fairly confident that removing this ping and replacing it with better error handling
will not impact how people use connection pooling as the behavior and intent is still the
same. Although I haven't tried the scenario, I don't believe that setups with a load
balancer between the connector and the server would be affected. The connector is
behaving the same way except that instead of trying its pinging to see if a connection
from the pool is valid, it assumes it's valid. AFAIK, the ping process has the same
potential long timeout and other misc socket issues as instead just trying to send the
query in the first place.

Microsoft's SQL server connector seems to behave this way as well but they took the
additional step of doing background health checking of their pool.

From: http://msdn2.microsoft.com/en-us/library/8xx3tyca.aspx
"If a connection exists to a server that has disappeared, this connection can be drawn
from the pool even if the connection pooler has not detected the severed connection and
marked it as invalid. This is the case because the overhead of checking that the
connection is still valid would eliminate the benefits of having a pooler by causing
another round trip to the server to occur. When this occurs, the first attempt to use the
connection will detect that the connection has been severed, and an exception is thrown."

Again, looking forward to hearing your thoughts on this.
[21 Apr 2008 22:09] Reggie Burnett
Maxim

Thanks for the patch!  I had already planned to do something very similar for the 5.2.2
beta.  Having your patch just makes it easier!
[21 Apr 2008 22:59] Reggie Burnett
It's not quite as simple as this patch.  There are some scenarios that must be considered
such as the following:

SetDatabase does not use the ExecuteXXX methods and is the first query called on a newly
returned pooled connection so if the pooler returns a busted connection this will bomb.

Second, it's important that a connection that is inside a current transaction not
silently continue with a re-established connection because the user will think that
nothing happened when in fact their transaction is now gone.

The better solution is to identify the few points inside NativeDriver.cs where a broken
connection can be safely re-established and do that.  THis will work for pooled and
non-pooled connections (which is a plus).  This change is going into 5.2.2 beta
[22 Apr 2008 10:10] Tonci Grgin
Reggie, "To be fixed later" is ok?
[11 Jun 2008 4:33] Poul Bak
Most public avaiable webhosts has limited or even disabled 'Ping from Wan'. This includes
MySql servers.
Mine has *SIGH*. Disabling pooling helped, but an alternative to ping would be better.
[17 Jun 2008 23:03] Poul Bak
Ok, I have solved my problems, they were actually not related to ping!
[13 Nov 2008 2:00] d di
Comments from a Connector/NET user:
 - option to allow disable of ping() on pool fetch: sounds good (we do our own ping
anyway)
 - crashing later like Microsoft does: sounds good (we need to handle crashes anyway)
 - automatically issuing queries on another connection if the current one fails: GOD NO!
PLEASE DON'T BREAK STUFF LIKE THAT.

I'm not going to go into all the issues with that last point.  I'll just mention that
enterprise applications that are dependent on a specific response time (including
connection and command timeouts) are going to break horribly.

Okay, just one more, anything that depends on temporary tables, user variables, or
whatever state will break too.

Anyway, it's something you can do in your own connector class that inherits from the main
Connector/NET, for your own purposes.  You can publish your wrapper class too if you'd
like.  But it's very much a feature you do not want in a stable consumer product.

.2€
[15 Nov 2008 22:02] d di
More issues with automatic re-issue on different connection:
 - transactions break when part of a transaction is re-executed on a new, auto-commit
connection.
 - timeouts are currently implemented client side, meaning UPDATEs, INSERTs etc would
often be executed twice or more.

Anyway, enough about that.  The problems mentioned by Maxim are real enough.  Time to be
constructive for a brief moment :-).

I've tried to incorporate solutions for all of these problems in a new piece of
connection pooling logic.  Shameless plug: can be found attached to issue #40684.
[18 Nov 2008 0:29] Maxim Mass
Hey David, I'm glad there's renewed interest in this issue. I haven't put any time or
thought into it since filing this bug 6 months ago. We've been running and distributing a
custom build of the connector with our software (open source MindTouch Deki) with this and
other patches and haven't had any problems reported with this approach of removing pinging
and handling errors by retrying on a new connection. I can see how this may not be a full
solution to the problem but it addressed our needs. We don't use transactions or assume
that multiple queries must be executed using the same connection driver. 
This patch likely didn't address all the issues relating to retrying a query on a new
connection but it did further abstract the client code from the connection pooling
implementation. It looks like you did a great job of capturing all the connection pooling
issues in bug #40684 and I'd be very happy to see it integrated in. Once I get some time
I'll take a closer look at it and provide some feedback.