Bug #37104 Synchronization issues on connection under high volume
Submitted: 30 May 2008 16:12 Modified: 1 Sep 22:10
Reporter: Christos Pavlides
Status: Closed
Category:Connector/Net Severity:S1 (Critical)
Version:5.3.0.0 OS:Any
Assigned to: Target Version:
Triage: D2 (Serious)

[30 May 2008 16:12] Christos Pavlides
Description:
In StoredProcedure.cs the Close function is doing some tricks on the connection string
properties to allow for output parameters. Although I am not 100% sure I think if you are
using connection pooling, the same connection (with the same connection string params) is
part of the connection pool, which can get released at any time between the end of the
using statement (using (MySqlDataReader reader = cmd.ExecuteReader())) and the finally.
Although the possibility of this happening is low, under heavy load (100+ threads hitting
the database in my case) this can happen and it will lead to errors as described in the
repeat steps.
                
P.S. I consider this a very critical issue since if it happens while calling a main
procedure with transactions it will cause the system to stop functioning until the lock
is released. By the way if an exception is raised in the connector as in this case
shouldn't the transaction be rolled back or at least kill the connection to force the db
to roll it back? This is actually a more serious issue...

How to repeat:
Under a recent volume test on our application server I have noticed that under certain
circumstances the database server would start throwing errors about locked tables ('Lock
wait timeout exceeded; try restarting transaction'). In order to recover I had to kill
all the connections or restart the database server and application server. (Probably
killing the right connection would also solve the problem, but anyway)

In our error log just before the locks, there was always a strange error about a
"Parameter xxx must be defined.". My guess was that something was happening on the
connection that was leaving it in an ustable state. 
The description explains what this might be. My assumption is that during the time that
it takes from the end of the using to the finally where the parameter is reset to the
original value, another procedure comes in that tries to get its outgoing parameters. At
that point the parameter is reset in the finally and the ougoing parameters of the
current procedure cannot be read.    
The place where the output parameters are checked is:
protected virtual bool ShouldIgnoreMissingParameter(string parameterName)
        {
            return Connection.Settings.AllowUserVariables ||
                (parameterName.Length > 1 &&
                (parameterName[1] == '`' || parameterName[1] == '\''));
        }
(In the Statement.cs file)

and the function that I think has a problem is:

public override void Close()
		{
            base.Close();

			if (outSelect.Length == 0) return;

            bool allowUserVar = Connection.Settings.AllowUserVariables;
            Connection.Settings.AllowUserVariables = true;
            try
            {
                MySqlCommand cmd = new MySqlCommand("SELECT " + outSelect, Connection);
                using (MySqlDataReader reader = cmd.ExecuteReader())
                {
                    // since MySQL likes to return user variables as strings
                    // we reset the types of the readers internal value objects
                    // this will allow those value objects to parse the string based
                    // return values
                    for (int i = 0; i < reader.FieldCount; i++)
                    {
                        string fieldName = reader.GetName(i);
                        fieldName = fieldName.Remove(0, hash.Length + 1);
                        MySqlParameter parameter =
Parameters.GetParameterFlexible(fieldName, true);
                        reader.values[i] =
MySqlField.GetIMySqlValue(parameter.MySqlDbType);
                    }

                    if (reader.Read())
                    {
                        for (int i = 0; i < reader.FieldCount; i++)
                        {
                            string fieldName = reader.GetName(i);
                            fieldName = fieldName.Remove(0, hash.Length + 1);
                            MySqlParameter parameter =
Parameters.GetParameterFlexible(fieldName, true);
                            parameter.Value = reader.GetValue(i);
                        }
                    }
                }
            }
            finally
            {
                Connection.Settings.AllowUserVariables = allowUserVar;
            }
		}

(in the StoredProcedure.cs).

Stack Trace of the initial exception:

MySql.Data.MySqlClient.MySqlException: Parameter '@2144714040velocitycheckvalue' must be
defined.
   at MySql.Data.MySqlClient.Statement.SerializeParameter(MySqlParameterCollection
parameters, MySqlPacket packet, String parmName) in
G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\Statement.cs:line
240
   at MySql.Data.MySqlClient.Statement.InternalBindParameters(String sql,
MySqlParameterCollection parameters, MySqlPacket packet) in
G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\Statement.cs:line
182
   at MySql.Data.MySqlClient.Statement.BindParameters() in
G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\Statement.cs:line
105
   at MySql.Data.MySqlClient.Statement.Execute() in
G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\Statement.cs:line
82
   at MySql.Data.MySqlClient.PreparableStatement.Execute() in
G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\PreparableStatement.cs:line
120
   at MySql.Data.MySqlClient.MySqlCommand.ExecuteReader(CommandBehavior behavior) in
G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\command.cs:line
431
   at MySql.Data.MySqlClient.MySqlCommand.ExecuteReader() in
G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\command.cs:line
330
   at MySql.Data.MySqlClient.StoredProcedure.Close() in
G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\StoredProcedure.cs:line
222
   at MySql.Data.MySqlClient.MySqlCommand.Close() in
G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\command.cs:line
319
   at MySql.Data.MySqlClient.MySqlDataReader.Close() in
G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\datareader.cs:line
182
   at MySql.Data.MySqlClient.MySqlCommand.ExecuteNonQuery() in
G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\command.cs:line
309
   at SubSonic.MySqlDataProvider.ExecuteQuery(QueryCommand qry) in 

Suggested fix:
If my assumptions are correct my suggestion is to never change a shared object such as
the connection, but use a property on the command which is almost always local to the
statement that runs at any given point. This property has to read its initial value from
the connection string, except this case which should always be true.
[30 May 2008 16:19] Tonci Grgin
Hi Christos and thanks for your report.

Can you please do the following:
  o) attach small, self-sufficient test case I can use to repeat the problem
  o) provide small database dump with all necessary data needed for testing
  o) add info on environment, MySQL server ...
[30 May 2008 16:41] Christos Pavlides
Hi Tonci,
I am using MySql 5.1.24 and InnoDb tables, and calling a stored procedure that does a lot
of transactional work and has two out params. The calling application is a multithreaded
application server that might have a large number of threads trying to call the same
procedure.
I would be happy to provide you with any other information that you might need, but in
this case it is quite difficult (due to the complexity of systems). The good news are
though that I do not think there is a reason to send you the actual data since it is
really irrelevant to the subject. In addition it would be easier to replicate the issue
if you put a Sleep call just after the using statement and send a few transactions that
will try to use the same connection.
This will delay the reseting of the connection parameter and will make it clear where the
problem is.
I hope I provided you with enough information to investigate this further. If you really
need an application to replicate this and/or actual db data tell me and I will try to do
my best.
[30 May 2008 17:55] Christos Pavlides
Hi Tonci,
if its of any help I tried my suggestion to use a private property on the command object
instead of changing the connection parameter and from initial tests it seems that it is
working fine. I do not get the error: parameter xxx must be defined. But I will have to
do more tests to verify it. It would be very helpful if you could investigate my
assumptions and comment on them (If I am on the right path, or totally of topic...).
I have created a patch, but I did it in a hurry, just to test my theory, so it might not
be the best way to do it.
[30 May 2008 17:56] Christos Pavlides
My Patch

Attachment: SynchIssue.patch (text/plain), 3.09 KiB.

[30 May 2008 21:06] Reggie Burnett
Christos

I don't follow what you are saying here:

"Although I am not 100% sure I think if you are
using connection pooling, the same connection (with the same connection string params) is
part of the connection pool, which can get released at any time between the end of the
using statement (using (MySqlDataReader reader = cmd.ExecuteReader())) and the finally."

The connection should not be released back to the pool until you close the connectoin and
that should not happen until you have closed the reader which will close the command so I
don't see how the connection can be released back to the pool prior to
storedprocedure.close() can be called.
[2 Jun 2008 10:21] Christos Pavlides
Hi Reggie,
you are absolutely correct, after analyzing the code a little better it seems that both
the actual SP call and the retrieval of the output parameters happen on the same
connection which is not closed, therefore my initial analysis was totally off. 
The problem now is why then am I getting the errors about "the parameter must be
defined"? Also after I made that patch I am not getting that error any more, although due
to the nature of multithreading applications this might be just pure luck (or lack of..).

Anyway I think the only way this error can happen in my situation is if the
Connection.Settings.AllowUserVariables is set to false when it should be true. And the
only way this can happen is if two threads change the same connection. I am going to
guess again here (please forgive me) and say that if it is impossible to happen at the
retrieval of the output parameters (.Close on an SP) then it might be happening when the
initial connection is retrieved from the pool (which might be possible since I didn't
anywhere a lock on the connection pool when the connection is retrieved).
Please investigate this some more, I am doing the same on my side, trying to find an
easier way to replicate this.

Thanks Tonci and Reggie I really appreciate all your help with this.
[2 Jun 2008 12:29] Tonci Grgin
Christos, I am not sure what to do with this report now... Things might be thread-safe but
necessarily shareable across threads (which is what I presume is happening). Close?
[2 Jun 2008 13:35] Christos Pavlides
Hi Tonci,
well I do not understand exactly what do you mean, and maybe I did a little too much
guestimating myself on this which put you on the wrong tracks. 
The main point of this issue is that without my patch the connector fails with the error
about the parameter must be defined under some load. This means that a thread is changing
the connection object that is being used by another thread, which shouldn't be happening.
We have to fix this, but to properly understand where this is happening, I need some help
and guidance from you guys.
As you say some things are common and shareable between threads, and I totally agree with
you. But when we have "shared" objects we have to pay some special attention when
modifying these objects in order make sure that two threads modifying the same object do
so in turns. In this case there might be some situations where two threads try to get a
connection and they get the same one. If this is a valid case then we have to find a way
to prevent this. 

Basically what I am trying to say here is that I am 90% sure that there is a thread
synchronization issue with the driver, but these kind of issues are very hard to track
down since in order to simulate the scenario you have to run a multithreaded application
and be able to track what is going on with the shared objects. What I am asking is, since
you guys wrote the connector and support it you have more knowledge about what these
shared objects are, where they are used and how they should be used. Although I can try
to find the problem myself, and I am trying for the last couple of days it is consirable
more difficult for me since I have to dig into the code and understand why certain
operations are being done and why.

I wrote too much again, I hope you understood my point. If you can help me out on this I
would really appreciate it, if not I will try to figure it on my one and if I do find a
solution I will post a patch.

Thanks again
[2 Jun 2008 13:46] Tonci Grgin
Christos, I get the point. It will be hard to pin this one down but will try to make test
case...
[2 Jun 2008 16:08] Christos Pavlides
Continuing on my quest to find the weirdness in my problem I hit another exception of the
connector:

System.NullReferenceException: Object reference not set to an instance of an object.
   at MySql.Data.MySqlClient.MySqlPool.CheckoutConnection() in
...\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\MySqlPool.cs:line 130
   at MySql.Data.MySqlClient.MySqlPool.GetPooledConnection() in
...\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\MySqlPool.cs:line 158
   at MySql.Data.MySqlClient.MySqlPool.TryToGetDriver() in
...\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\MySqlPool.cs:line 238
   at MySql.Data.MySqlClient.MySqlPool.GetConnection() in
...\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\MySqlPool.cs:line 251
   at MySql.Data.MySqlClient.MySqlConnection.Open() in
...\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\Connection.cs:line 476
[3 Jun 2008 14:51] Christos Pavlides
Ok guys, this is my final comment on this... I believe I found the place where this is
happening. I reviewed this particular part of the code before and I keep coming back to
it, because I believe it is not thread safe and it explains why I am getting these
exceptions.
The CheckoutConnection() in MySqlPool is doing this:
Driver driver = (Driver)idlePool.Dequeue();
Without any locks anywhere, even from the MSDN docs it says: A Queue<(Of <(T>)>) can
support multiple readers concurrently, as long as the collection is not modified. Even
so, enumerating through a collection is intrinsically not a thread-safe procedure. To
guarantee thread safety during enumeration, you can lock the collection during the entire
enumeration. To allow the collection to be accessed by multiple threads for reading and
writing, you must implement your own synchronization.

Therefore I think that by not properly locking the queues holding the connections there
is a possibility that two threads will get the same driver (my first issue with the
parameter not found thingy), there is also a possibily that some threads might fail to
get a driver since the check on the count of the queue happens in another function before
the CheckoutConnection() (my second type of exceptions where the driver dequeued is null).
From the time the counter is checked to the time the queue is dequeued there a posibility
that another thread will dequeue the last remaining threads which will result in the null
expection I got a couple of times.

Anyway I would appreciate your help with this. I will try to create a simple test that
will demonstrate this.
[3 Jun 2008 16:41] Reggie Burnett
I just emailed you our latest mysqlpool.cs which had that issue fixed recently.  Can you
test it (without your other patch) and tell me if it fixes your issue?
[4 Jun 2008 14:45] Christos Pavlides
Ok guys I think I found the reason for this wierdness... After days of testing and trying
to simulate the problem using multithreading applications unit tests etc, I realised that
the MySqlConnection has a property ConnectionString which uses caching to cache previous
settings. When a new connection is created and the settings are found in the cache then
the cached settings are set to the current connection settings:
settings = newsettings;
which is the the main cause of the connector misbehaving. The two variables are objects
which are copied by reference which means that when you open 100 connections and you
change the setting.AllowUserVariables then it will change for all connections, which is
what is causing some of the problems in my case. (The other half of my problems are
solved using Reggie's changes by locking the pools.) 

Therefore I have two posible solutions:
1) Make the MySqlConnectionStringBuilder Clonable and instead of settings = newsettings
do settings = newsettings.Clone(), but since this is a huge object it will take forever.
2) Use my initial patch and make sure the settings of a connection are never changed
directly (like the AllowUserVariables).

Please comment on my findings. 

Thanks for all your help
[4 Jun 2008 15:31] Tonci Grgin
Christos, thank you very much for your efforts and analysis. I do remember this was a
problem before too, when we had ever-growing connection string. I believe there is a good
reason for this implementation and will leave final ruling to Reggie.

Thanks again for your contribution and your interest in MySQL!
[6 Jun 2008 20:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/47545
[6 Jun 2008 20:21] Reggie Burnett
Can you check this patch on your code and see if it fixes the issue.  I haven't reproduced
the problem here.
[9 Jun 2008 8:10] Christos Pavlides
On first glance it seems to work. Thanks. But I will try to run my test cases to see if we
have missed something. 
By the way I am not sure if you received my e-mails but it seems there is one small
problem related to threading. This time it is a thread sync issue. If you want me to open
a new issue I will, but here is the short story:
GetDefaultCollation and GetMaxLength call the db to get a set of parameters and cache
them in two static dictionaries in function InitCollections. These functions are not
thread safe since if many threads try to call a procedure for the first time they will
all try to insert the same keys in these collections and a lot of duplicate key
exceptions will be thrown.
[13 Jun 2008 10:08] Christos Pavlides
Hi Reggie,
the fix seems to be working properly, but did you check the other issue I mentioned on my
last comment?
[18 Jun 2008 8:31] Christos Pavlides
Hi Reggie/Tonci,
I can see that the status of this issue is still "patch pending". When will this be
implemented? Also can you tell me if you are going to look at the other issue I mention
with the GetDefaultColation? or should I open a new issue for this?

Thanks again
[22 Oct 2008 2:03] Jared Sullivan
Hi Christos,  Open up another bug regarding thread safe and link to this report.
[22 Oct 2008 8:29] Christos Pavlides
What do you mean?
[22 Oct 2008 8:33] Jared Sullivan
You mentioned that GetDefaultColation proc is not thread safe and may be another bug in
its own right.
[22 Oct 2008 8:43] Tonci Grgin
Christos, two things:
  o Jared is right in that two problems should not be reported in one bug report.
  o I will ping Reggie regarding status of patch.
[22 Oct 2008 9:32] Christos Pavlides
ok opened a new issue Bug #40231
[1 Sep 22:10] Reggie Burnett
This patch was pushed to 5.2 months ago.  Not sure why this got stuck in patch pending
status