Bug #37104 Synchronization issues on connection under high volume
Submitted: 30 May 2008 14:12 Modified: 1 Sep 2009 20:10
Reporter: Christos Pavlides Email Updates:
Status: Closed Impact on me:
None 
Category:Connector / NET Severity:S1 (Critical)
Version:5.3.0.0 OS:Any
Assigned to: CPU Architecture:Any
Triage: D2 (Serious)

[30 May 2008 14:12] Christos Pavlides
Description:
In StoredProcedure.cs the Close function is doing some tricks on the connection string properties to allow for output parameters. Although I am not 100% sure I think if you are using connection pooling, the same connection (with the same connection string params) is part of the connection pool, which can get released at any time between the end of the using statement (using (MySqlDataReader reader = cmd.ExecuteReader())) and the finally. Although the possibility of this happening is low, under heavy load (100+ threads hitting the database in my case) this can happen and it will lead to errors as described in the repeat steps.
                
P.S. I consider this a very critical issue since if it happens while calling a main procedure with transactions it will cause the system to stop functioning until the lock is released. By the way if an exception is raised in the connector as in this case shouldn't the transaction be rolled back or at least kill the connection to force the db to roll it back? This is actually a more serious issue...

How to repeat:
Under a recent volume test on our application server I have noticed that under certain circumstances the database server would start throwing errors about locked tables ('Lock wait timeout exceeded; try restarting transaction'). In order to recover I had to kill all the connections or restart the database server and application server. (Probably killing the right connection would also solve the problem, but anyway)

In our error log just before the locks, there was always a strange error about a "Parameter xxx must be defined.". My guess was that something was happening on the connection that was leaving it in an ustable state. 
The description explains what this might be. My assumption is that during the time that it takes from the end of the using to the finally where the parameter is reset to the original value, another procedure comes in that tries to get its outgoing parameters. At that point the parameter is reset in the finally and the ougoing parameters of the current procedure cannot be read.    
The place where the output parameters are checked is:
protected virtual bool ShouldIgnoreMissingParameter(string parameterName)
        {
            return Connection.Settings.AllowUserVariables ||
                (parameterName.Length > 1 &&
                (parameterName[1] == '`' || parameterName[1] == '\''));
        }
(In the Statement.cs file)

and the function that I think has a problem is:

public override void Close()
		{
            base.Close();

			if (outSelect.Length == 0) return;

            bool allowUserVar = Connection.Settings.AllowUserVariables;
            Connection.Settings.AllowUserVariables = true;
            try
            {
                MySqlCommand cmd = new MySqlCommand("SELECT " + outSelect, Connection);
                using (MySqlDataReader reader = cmd.ExecuteReader())
                {
                    // since MySQL likes to return user variables as strings
                    // we reset the types of the readers internal value objects
                    // this will allow those value objects to parse the string based
                    // return values
                    for (int i = 0; i < reader.FieldCount; i++)
                    {
                        string fieldName = reader.GetName(i);
                        fieldName = fieldName.Remove(0, hash.Length + 1);
                        MySqlParameter parameter = Parameters.GetParameterFlexible(fieldName, true);
                        reader.values[i] = MySqlField.GetIMySqlValue(parameter.MySqlDbType);
                    }

                    if (reader.Read())
                    {
                        for (int i = 0; i < reader.FieldCount; i++)
                        {
                            string fieldName = reader.GetName(i);
                            fieldName = fieldName.Remove(0, hash.Length + 1);
                            MySqlParameter parameter = Parameters.GetParameterFlexible(fieldName, true);
                            parameter.Value = reader.GetValue(i);
                        }
                    }
                }
            }
            finally
            {
                Connection.Settings.AllowUserVariables = allowUserVar;
            }
		}

(in the StoredProcedure.cs).

Stack Trace of the initial exception:

MySql.Data.MySqlClient.MySqlException: Parameter '@2144714040velocitycheckvalue' must be defined.
   at MySql.Data.MySqlClient.Statement.SerializeParameter(MySqlParameterCollection parameters, MySqlPacket packet, String parmName) in G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\Statement.cs:line 240
   at MySql.Data.MySqlClient.Statement.InternalBindParameters(String sql, MySqlParameterCollection parameters, MySqlPacket packet) in G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\Statement.cs:line 182
   at MySql.Data.MySqlClient.Statement.BindParameters() in G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\Statement.cs:line 105
   at MySql.Data.MySqlClient.Statement.Execute() in G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\Statement.cs:line 82
   at MySql.Data.MySqlClient.PreparableStatement.Execute() in G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\PreparableStatement.cs:line 120
   at MySql.Data.MySqlClient.MySqlCommand.ExecuteReader(CommandBehavior behavior) in G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\command.cs:line 431
   at MySql.Data.MySqlClient.MySqlCommand.ExecuteReader() in G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\command.cs:line 330
   at MySql.Data.MySqlClient.StoredProcedure.Close() in G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\StoredProcedure.cs:line 222
   at MySql.Data.MySqlClient.MySqlCommand.Close() in G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\command.cs:line 319
   at MySql.Data.MySqlClient.MySqlDataReader.Close() in G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\datareader.cs:line 182
   at MySql.Data.MySqlClient.MySqlCommand.ExecuteNonQuery() in G:\Projects\IMSP\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\command.cs:line 309
   at SubSonic.MySqlDataProvider.ExecuteQuery(QueryCommand qry) in 

Suggested fix:
If my assumptions are correct my suggestion is to never change a shared object such as the connection, but use a property on the command which is almost always local to the statement that runs at any given point. This property has to read its initial value from the connection string, except this case which should always be true.
[30 May 2008 14:19] Tonci Grgin
Hi Christos and thanks for your report.

Can you please do the following:
  o) attach small, self-sufficient test case I can use to repeat the problem
  o) provide small database dump with all necessary data needed for testing
  o) add info on environment, MySQL server ...
[30 May 2008 14:41] Christos Pavlides
Hi Tonci,
I am using MySql 5.1.24 and InnoDb tables, and calling a stored procedure that does a lot of transactional work and has two out params. The calling application is a multithreaded application server that might have a large number of threads trying to call the same procedure.
I would be happy to provide you with any other information that you might need, but in this case it is quite difficult (due to the complexity of systems). The good news are though that I do not think there is a reason to send you the actual data since it is really irrelevant to the subject. In addition it would be easier to replicate the issue if you put a Sleep call just after the using statement and send a few transactions that will try to use the same connection.
This will delay the reseting of the connection parameter and will make it clear where the problem is.
I hope I provided you with enough information to investigate this further. If you really need an application to replicate this and/or actual db data tell me and I will try to do my best.
[30 May 2008 15:55] Christos Pavlides
Hi Tonci,
if its of any help I tried my suggestion to use a private property on the command object instead of changing the connection parameter and from initial tests it seems that it is working fine. I do not get the error: parameter xxx must be defined. But I will have to do more tests to verify it. It would be very helpful if you could investigate my assumptions and comment on them (If I am on the right path, or totally of topic...).
I have created a patch, but I did it in a hurry, just to test my theory, so it might not be the best way to do it.
[30 May 2008 15:56] Christos Pavlides
My Patch

Attachment: SynchIssue.patch (text/plain), 3.09 KiB.

[30 May 2008 19:06] Reggie Burnett
Christos

I don't follow what you are saying here:

"Although I am not 100% sure I think if you are
using connection pooling, the same connection (with the same connection string params) is part of the connection pool, which can get released at any time between the end of the using statement (using (MySqlDataReader reader = cmd.ExecuteReader())) and the finally."

The connection should not be released back to the pool until you close the connectoin and that should not happen until you have closed the reader which will close the command so I don't see how the connection can be released back to the pool prior to storedprocedure.close() can be called.
[2 Jun 2008 8:21] Christos Pavlides
Hi Reggie,
you are absolutely correct, after analyzing the code a little better it seems that both the actual SP call and the retrieval of the output parameters happen on the same connection which is not closed, therefore my initial analysis was totally off. 
The problem now is why then am I getting the errors about "the parameter must be defined"? Also after I made that patch I am not getting that error any more, although due to the nature of multithreading applications this might be just pure luck (or lack of..). 
Anyway I think the only way this error can happen in my situation is if the Connection.Settings.AllowUserVariables is set to false when it should be true. And the only way this can happen is if two threads change the same connection. I am going to guess again here (please forgive me) and say that if it is impossible to happen at the retrieval of the output parameters (.Close on an SP) then it might be happening when the initial connection is retrieved from the pool (which might be possible since I didn't anywhere a lock on the connection pool when the connection is retrieved).
Please investigate this some more, I am doing the same on my side, trying to find an easier way to replicate this.

Thanks Tonci and Reggie I really appreciate all your help with this.
[2 Jun 2008 10:29] Tonci Grgin
Christos, I am not sure what to do with this report now... Things might be thread-safe but necessarily shareable across threads (which is what I presume is happening). Close?
[2 Jun 2008 11:35] Christos Pavlides
Hi Tonci,
well I do not understand exactly what do you mean, and maybe I did a little too much guestimating myself on this which put you on the wrong tracks. 
The main point of this issue is that without my patch the connector fails with the error about the parameter must be defined under some load. This means that a thread is changing the connection object that is being used by another thread, which shouldn't be happening. We have to fix this, but to properly understand where this is happening, I need some help and guidance from you guys.
As you say some things are common and shareable between threads, and I totally agree with you. But when we have "shared" objects we have to pay some special attention when modifying these objects in order make sure that two threads modifying the same object do so in turns. In this case there might be some situations where two threads try to get a connection and they get the same one. If this is a valid case then we have to find a way to prevent this. 

Basically what I am trying to say here is that I am 90% sure that there is a thread synchronization issue with the driver, but these kind of issues are very hard to track down since in order to simulate the scenario you have to run a multithreaded application and be able to track what is going on with the shared objects. What I am asking is, since you guys wrote the connector and support it you have more knowledge about what these shared objects are, where they are used and how they should be used. Although I can try to find the problem myself, and I am trying for the last couple of days it is consirable more difficult for me since I have to dig into the code and understand why certain operations are being done and why.

I wrote too much again, I hope you understood my point. If you can help me out on this I would really appreciate it, if not I will try to figure it on my one and if I do find a solution I will post a patch.

Thanks again
[2 Jun 2008 11:46] Tonci Grgin
Christos, I get the point. It will be hard to pin this one down but will try to make test case...
[2 Jun 2008 14:08] Christos Pavlides
Continuing on my quest to find the weirdness in my problem I hit another exception of the connector:

System.NullReferenceException: Object reference not set to an instance of an object.
   at MySql.Data.MySqlClient.MySqlPool.CheckoutConnection() in ...\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\MySqlPool.cs:line 130
   at MySql.Data.MySqlClient.MySqlPool.GetPooledConnection() in ...\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\MySqlPool.cs:line 158
   at MySql.Data.MySqlClient.MySqlPool.TryToGetDriver() in ...\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\MySqlPool.cs:line 238
   at MySql.Data.MySqlClient.MySqlPool.GetConnection() in ...\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\MySqlPool.cs:line 251
   at MySql.Data.MySqlClient.MySqlConnection.Open() in ...\Mysql-connector-net-Trunk\MySql.Data\Provider\Source\Connection.cs:line 476
[3 Jun 2008 12:51] Christos Pavlides
Ok guys, this is my final comment on this... I believe I found the place where this is happening. I reviewed this particular part of the code before and I keep coming back to it, because I believe it is not thread safe and it explains why I am getting these exceptions.
The CheckoutConnection() in MySqlPool is doing this:
Driver driver = (Driver)idlePool.Dequeue();
Without any locks anywhere, even from the MSDN docs it says: A Queue<(Of <(T>)>) can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. To guarantee thread safety during enumeration, you can lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.

Therefore I think that by not properly locking the queues holding the connections there is a possibility that two threads will get the same driver (my first issue with the parameter not found thingy), there is also a possibily that some threads might fail to get a driver since the check on the count of the queue happens in another function before the CheckoutConnection() (my second type of exceptions where the driver dequeued is null). From the time the counter is checked to the time the queue is dequeued there a posibility that another thread will dequeue the last remaining threads which will result in the null expection I got a couple of times.

Anyway I would appreciate your help with this. I will try to create a simple test that will demonstrate this.
[3 Jun 2008 14:41] Reggie Burnett
I just emailed you our latest mysqlpool.cs which had that issue fixed recently.  Can you test it (without your other patch) and tell me if it fixes your issue?
[4 Jun 2008 12:45] Christos Pavlides
Ok guys I think I found the reason for this wierdness... After days of testing and trying to simulate the problem using multithreading applications unit tests etc, I realised that the MySqlConnection has a property ConnectionString which uses caching to cache previous settings. When a new connection is created and the settings are found in the cache then the cached settings are set to the current connection settings:
settings = newsettings;
which is the the main cause of the connector misbehaving. The two variables are objects which are copied by reference which means that when you open 100 connections and you change the setting.AllowUserVariables then it will change for all connections, which is what is causing some of the problems in my case. (The other half of my problems are solved using Reggie's changes by locking the pools.) 

Therefore I have two posible solutions:
1) Make the MySqlConnectionStringBuilder Clonable and instead of settings = newsettings do settings = newsettings.Clone(), but since this is a huge object it will take forever.
2) Use my initial patch and make sure the settings of a connection are never changed directly (like the AllowUserVariables).

Please comment on my findings. 

Thanks for all your help
[4 Jun 2008 13:31] Tonci Grgin
Christos, thank you very much for your efforts and analysis. I do remember this was a problem before too, when we had ever-growing connection string. I believe there is a good reason for this implementation and will leave final ruling to Reggie.

Thanks again for your contribution and your interest in MySQL!
[6 Jun 2008 18:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/47545
[6 Jun 2008 18:21] Reggie Burnett
Can you check this patch on your code and see if it fixes the issue.  I haven't reproduced the problem here.
[9 Jun 2008 6:10] Christos Pavlides
On first glance it seems to work. Thanks. But I will try to run my test cases to see if we have missed something. 
By the way I am not sure if you received my e-mails but it seems there is one small problem related to threading. This time it is a thread sync issue. If you want me to open a new issue I will, but here is the short story:
GetDefaultCollation and GetMaxLength call the db to get a set of parameters and cache them in two static dictionaries in function InitCollections. These functions are not thread safe since if many threads try to call a procedure for the first time they will all try to insert the same keys in these collections and a lot of duplicate key exceptions will be thrown.
[13 Jun 2008 8:08] Christos Pavlides
Hi Reggie,
the fix seems to be working properly, but did you check the other issue I mentioned on my last comment?
[18 Jun 2008 6:31] Christos Pavlides
Hi Reggie/Tonci,
I can see that the status of this issue is still "patch pending". When will this be implemented? Also can you tell me if you are going to look at the other issue I mention with the GetDefaultColation? or should I open a new issue for this?

Thanks again
[22 Oct 2008 0:03] Jared S
Hi Christos,  Open up another bug regarding thread safe and link to this report.
[22 Oct 2008 6:29] Christos Pavlides
What do you mean?
[22 Oct 2008 6:33] Jared S
You mentioned that GetDefaultColation proc is not thread safe and may be another bug in its own right.
[22 Oct 2008 6:43] Tonci Grgin
Christos, two things:
  o Jared is right in that two problems should not be reported in one bug report.
  o I will ping Reggie regarding status of patch.
[22 Oct 2008 7:32] Christos Pavlides
ok opened a new issue Bug #40231
[1 Sep 2009 20:10] Reggie Burnett
This patch was pushed to 5.2 months ago.  Not sure why this got stuck in patch pending status