Bug #29289 MySqlDataReader does not close when .Read() throws a Deadlock Exception
Submitted: 21 Jun 2007 22:22 Modified: 27 Nov 2007 14:57
Reporter: Dan Bress Email Updates:
Status: Can't repeat Impact on me:
None 
Category:Connector / NET Severity:S2 (Serious)
Version:1.0.9 OS:Windows
Assigned to: CPU Architecture:Any
Tags: close, hang, MySqlDataReader

[21 Jun 2007 22:22] Dan Bress
Description:
When the data reader throws an exception(Deadlock found when trying to get lock; try restarting transaction|MySql.Data.MySqlClient.MySqlException) I am unable to close my DataReader, when using the direct mysql connector(non-odbc).

I do the following
create a connection
"select * from table for update"
try
{
  while (reader.Read())
  {
    read data
  }
}
finally
{
  close reader
  close connection
}

when i close the reader it hangs.  here is is the stack trace
mysql.data.dll!MySql.Data.MySqlClient.PacketReader.ReadHeader() Line 96 + 0xb bytes C#
mysql.data.dll!MySql.Data.MySqlClient.PacketReader.OpenPacket() Line 132 C#
mysql.data.dll!MySql.Data.MySqlClient.NativeDriver.OpenDataRow(int fieldCount = 0, bool isBinary = false) Line 464 C#
mysql.data.dll!MySql.Data.MySqlClient.CommandResult.Consume() Line 222 + 0xe bytes C#
mysql.data.dll!MySql.Data.MySqlClient.MySqlDataReader.Close() Line 133 C#

How to repeat:
continually do a select for update from a table from program 1
continually do an insert on duplicate key update on the same table from program 2
wait until your read throws an exception, and you are then unable to close the data reader.

Suggested fix:
It appears as though a dead lock happens and the server says "I can't send you any data to read"

mean while when you close the reader it thinks "I still have data left, I should read it" (CommandResult.cs:220) but there is nothing left to read, so it blocks indefinitely.  Someone that information that an exception has occured, and that there is no data left needs to be passed to this point.

I commented out
if (!readRows)
			{
				while (driver.OpenDataRow(0, false)) { }
				readRows = true;
			}
in the CommandResult.cs file and this "solved" the problem.
[21 Jun 2007 22:26] Dan Bress
When I say the reader "does not close" I mean it blocks indefinitely.
[26 Jul 2007 8:04] Tonci Grgin
Not enough information was provided for us to be able to handle this bug. Please re-read the instructions at http://bugs.mysql.com/how-to-report.php

If you can provide more information, feel free to add it to this bug and change the status back to 'Open'.

Thank you for your interest in MySQL.

Explanation: Hi Dan and thanks for your report.
Please attach here:
 - MySQL server version, my.ini/cnf file
 - DDL script suitable for import with mysql cl client
 - Small but complete test case showing this problem every time it is run
I presume your NET FW is 1.x, right?
[15 Aug 2007 17:49] Michael Maggi
mySqlRsCommand.CommandText = SearchQuery
        mySqlDataReader = mySqlRsCommand.ExecuteReader()

        If mySqlDataReader.HasRows Then

          ' just do 100 rows at a time untill we are done
          Do While mySqlDataReader.Read() And nCounter < CInt(MAX_SKU_ITEMS + 1)
           ' do some stuff
          Loop ' while adoDataReader.HasRows

        End If ' mySqlDataReader.HasRows

        If Not mySqlDataReader.IsClosed Then

          Try

            mySqlDataReader.Close()
            MessageBox.Show("Reader Closed Successfully")

          Catch mySqlException

            MessageBox.Show("Reader NOT Closed Successfully" & mySqlException.Message)
            mySqlDataReader.Dispose()
            Exit Sub

          End Try

        End If

It kind of just sits there and never closes the reader, I don't get any timeout's or exceptions.

VS 2005 - MySqlConnector 5.2
WinXP - 2.0G ram, dual core, intel

M.Maggi MvInTech Rome,NY
[26 Aug 2007 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[17 Oct 2007 16:39] hank kniight
Actually there's an easier way to make the code freeze. You just have to do something like

IDataReader reader = ... "select * from ..."; // returns for example 10 rows
reader.Read(); // only reads the 1st raw
reader.Close(); // tries to close the reader while we still have rows to read

=> the call to reader.Close never returns and the thread just freezes!

When debugging I went into the file "MySqlStream.cs" in the method "public int Read(byte[] buffer, int offset, int count)" with the arguments
    buffer = byte[1024] only containing 0s
    offet = 1
    count = 1023

Inside the method I have
    peekByte = -1
    inLength = 16462081
    inPos = 1
    (thus inPos != inLength)
    num2 = 1023

Then "int num3 = this.inStream.Read(buffer, offset, num2);" is called but it never returns (inStream being a System.IO.BufferedStream).

Looking at the code of BufferedStream.Read and giving the values of its parameters, it only calls
    Buffer.InternalBlockCopy(this._buffer, this._readPos, array, offset, num);
    this._readPos += num;
    if (num < count)
    {
        int num2 = this._s.Read(array, offset + num, count - num);
        num += num2;
        this._readPos = 0;
        this._readLen = 0;
    }
    return num;

I don't know the code of Buffer.InternalBlockCopy so I don't know if it may freeze (_buffer = byte[4096] with some values, _readPos = 2043, array = byte[1024] still with 0s, offset = 1, num = 7).

"int num2 = this._s.Read(array, offset + num, count - num);" can freeze because _s is a NetworkStream with ReadTimeout = -1 (array = byte[1024] still whith 0s, offet + num = 1+7, count - num = 1027 - 7).
[18 Oct 2007 6:44] hank kniight
I had this error with connection .Net 5.1.0 but not with 5.1.3
[19 Oct 2007 15:13] Tonci Grgin
I am having troubles setting up legal software environment for testing NET FW 1.1 bugs... VSExpress works only with NET FW 2.0 so I'll have to find someone with VS2003.
[27 Nov 2007 14:57] Tonci Grgin
Hi all. Finally I have NET FW 1 environment...

Testing c/NET 1.0.10.1 with following test case against remote MySQL server 5.0.54BK and ~200000 rows x 16 columns table I can't repeat the problem reported:

	MySqlConnection conn = new MySqlConnection();
	conn.ConnectionString = "DataSource=QCore;Database=solusd;UserID=root;Password=;PORT=3306";
	conn.Open();

	MySqlCommand command = new MySqlCommand();
	command.CommandText = "SELECT * FROM `big_table` FOR UPDATE";
	command.Connection = (MySqlConnection)conn;

	MySqlDataReader dr = command.ExecuteReader();
	Console.WriteLine("Before READ");
	dr.Read();
	Console.WriteLine("After READ");
	Console.WriteLine(dr.GetName(0));
	try 
	{
		while (dr.Read())
		{
			Console.WriteLine(dr.GetInt32(0));
			//dr.Close(); << If called here, reading stops properly (with an error)
		}
	}
	finally 
	{
		dr.Close();
		Console.WriteLine("Ready!");
		Console.Read();
		command.Dispose();
		conn.Close();
	}

and the output is:
...
194309
194310
194311
194312
194313
194314
Ready!

If anyone can post a complete sample producing the error every time it is run I will retest.
[12 Feb 2008 3:59] Keith L
Hi there,

I came across this issue after debugging a consistent server hang on my ASP.Net 2.0 website using the 5.0.6.0 version of the MySql.Data.dll. I am not sure if this is resolved or fixed already, but since it is marked as "not repro", perhaps I can offer my findings and someone at MySql can take a look.

So consistent with what many have been saying, if you perform enough ExecuteReader() calls that result in a failure (i.e. an exception is thrown, either because the query was malformed, or that you are creating a conflict such as inserting duplicate keys), eventually, your server application will DEADLOCK.

This one is quite tricky, but let me start with what I saw on windbg, and then I will point out what I think is the problem code.

I have about 70 threads waiting on DIFFERENT SafeWaitHandles on my server when the deadlock happened. This is most unusual because this type of contention is usually caused by many threads waiting on the same resource (or a circular resource wait). But this time, each thread is waiting on ITS OWN wait object.

All of the worker thread callstacks look like the following:

0:020> !clrstack
OS Thread Id: 0x3a8 (20)
ESP       EIP     
0e4af298 7c8285ec [HelperMethodFrame_1OBJ: 0e4af298] System.Threading.WaitHandle.WaitOneNative(Microsoft.Win32.SafeHandles.SafeWaitHandle, UInt32, Boolean, Boolean)
0e4af344 793b03fe System.Threading.WaitHandle.WaitOne(Int64, Boolean)
0e4af35c 793b0c1b System.Threading.WaitHandle.WaitOne(Int32, Boolean)
0e4af36c 793b044e System.Threading.WaitHandle.WaitOne()
0e4af370 0f68a561 MySql.Data.MySqlClient.MySqlCommand.TimeoutExpired(System.Object)
0e4af384 793af6c6 System.Threading._TimerCallback.TimerCallback_Context(System.Object)
0e4af388 793740ab System.Threading.ExecutionContext.runTryCode(System.Object)
0e4af7a8 79e7c74b [HelperMethodFrame_PROTECTOBJ: 0e4af7a8] System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode, CleanupCode, System.Object)
0e4af810 79373ff7 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
0e4af828 79373ede System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
0e4af840 793af647 System.Threading._TimerCallback.PerformTimerCallback(System.Object)
0e4af9d0 79e7c74b [GCFrame: 0e4af9d0] 
0e4afb1c 79e7c74b [ContextTransitionFrame: 0e4afb1c] 

So it is clear that all threads are stuck on MySql.Data.MySqlClient.MySqlCommand.TimeoutExpired. So looking at that code, it's easy to realize that it is waiting on a synchronization object called "querySent", as the following snippet from command.cs suggests:

private void TimeoutExpired(object commandObject)
{
    MySqlCommand cmd = (commandObject as MySqlCommand);

    // wait for the query to be sent
    querySent.WaitOne();

    if (cmd.canCancel)
    {
        cmd.timedOut = true;
        cmd.Cancel();
    }
}

Turns out this method is a callback function from a Timer object created in ExecuteReader(), as seen in the following code snippet in command.cs:

public new MySqlDataReader ExecuteReader(CommandBehavior behavior)
{
    // A bunch of irrelevant stuff removed ...

    // About line 384:
    try
    {
        MySqlDataReader reader = new MySqlDataReader(this, statement, behavior);

        // start a threading timer on our command timeout 
        timedOut = false;
        Timer t = null;
        querySent.Reset();
#if !DEBUG
        if (connection.driver.Version.isAtLeast(5, 0, 0) &&
            commandTimeout > 0)
        {
            TimerCallback timerDelegate =
                    new TimerCallback(TimeoutExpired);
            t = new Timer(timerDelegate, this, this.CommandTimeout * 1000, 
                    Timeout.Infinite);
        }
#endif

        // execute the statement
        statement.Execute();
        querySent.Set();

        canCancel = true;
        reader.NextResult();
        if (t != null)
            t.Dispose();
        canCancel = false;
        connection.Reader = reader;
        return reader;
    }
    catch (MySqlException ex)
    {
        // if we caught an exception because of a cancel, then just return null
        if (ex.Number == 1317)
        {
            if (timedOut)
                throw new MySqlException(Resources.Timeout);
            return null;
        }
        throw;
    }
    finally
    {
        querySent.Reset();
        canCancel = false;
    }
}

So there are a few lines of very questionable code wrapped around an #if !DEBUG that is the culprit of this problem. Apparently, the developer wanted to enforce the timeout in RETAIL BUILDS only (perhaps the developer was getting deadlocks and didn't want to deal with it in his/her DEV/DEBUG builds). But therein lies a subtle bug.

Consider this codepath in RETAIL:

1. The querySent object is Reset().

2. A Timer object is created and programmed to invoke TimeoutExpired() after this.CommandTimeout seconds (default is 30 seconds).

3. statement.Execute() throws an exception (e.g. invalid SQL statement or index conflict)

4. ExecuteReader() catches the exception locally and bubbles the exception to the caller. But before leaving the scope, the "finally" clause resets querySent.

5. Control returns to the caller, which deals with the exception accordingly.

6. After this.commandTimeout seconds, the Timer invokes TimeoutExpired(), and the thread will wait infinitely on querySent.WaitOne(). This thread is stuck and will never return to the thread pool.

As you can see, if you issue enough ExecuteReader() commands that fail, your thread pool will eventually hit the max thread count and your Application instance will DEADLOCK. This is very dangerous because a lot of code will first try to insert, and if the row exists (insert fails), it will update the row instead. With this bug, the server is guaranteed to run out of threads sooner or later.

There's a bit of spaghetti in this callback code, and I don't have a good suggestion except to think through the timing windows and refactor the code. I am considering removing the conditional code altogether, or moving the code to AFTER the statement.Execute() call, but still I am worried about other undesirable side effects, e.g. hanging DURING the statement.Execute() call, but I think that is a much rarer occurence than the current bug and will not result in a Deadlock.

I hope this explains my findings satisfactorily, should you need any more clarification, please feel free to contact me for further discussion. I am not sure if the same repros for Version 1.0.9.0.

Thanks,
- Keith
[12 Feb 2008 6:02] Tonci Grgin
Hi Keith and thanks for excellent analysis. I will look into this part of code again, but, for now, you sure make sense. I have but one thing to correct:
"... Apparently, the developer wanted to enforce the timeout in RETAIL BUILDS only (perhaps the developer was getting deadlocks and didn't want to deal with it in his/her DEV/DEBUG builds)..."
This was introduced by request to provide the way to cancel long-running operations, no-one's been lazy.
[12 Feb 2008 17:18] Keith L
Hi Tonci, thanks for the note. Sorry about the developer comment there, it's supposed to be humorous, but now that I read it again I suppose it's not. My apologies.  :-(

Anyway, I think it's beneficial to make that code available in both DEBUG and RETAIL builds. That will give the code more exposure for testing and it should work equally well in both build types anyway. Moreover, this will help prevent unexpected (and hard-to-debug) issues for people who do initial development in DEBUG and then switch to RETAIL closer to deployment.

As an aside, I removed the entire "#if !DEBUG" block and rebuilt my own MySql.Data.dll and my site has been working fine since last night, and the usual repro case no longer causes the application to "hang" quickly. I will keep an eye on it and will report further findings as things unfold.

Thanks,
- Keith
[23 Jun 2008 11:27] Ted Ekeroth
Has this been resolved?

I am having the same problem using 5.1.6 of .NET connector. 

Some of Keith Ls suggestions seem to be incorporated in the newer version, but the problem still exists. 

The Connector creates threads that are locked in WaitOne and thus takes up a lot of virtual memory aswell as consuming threads in the thread pool.

I have made a post about this here:
http://forums.mysql.com/read.php?38,214687,214687#REPLY

and here is some of what I wrote:
----------

I noticed recently that a very VERY small application, that only has 1 timer and the timer callback method, after a short while occupies over 500 MB in Virtual Memory ("VM Size" in Task Manager) and I could not figure out why at first.

I am using a profiler that tells me that over 500 threads are running in the application that really should only be using 1 thread. 

.NET reservers 1 MB of VM for each thread that is created which correlates almost perfectly with the amount of memory the small application is using.

THE PROBLEM is that it seems to be a MySql.Data.dll problem. The MySql.Data.dll creates huge amounts of threads and those threads never seem to go away. In the profiler I get the following log for one of those many threads: 

Unnamed#87 
System.Threading.WaitHandle.WaitOne(unknown source) 
System.Threading.WaitHandle.WaitOne(unknown source) 
System.Threading.WaitHandle.WaitOne(unknown source) 
MySql.Data.MySqlClient.MySqlCommand.TimeoutExpired(unknown source) 
System.Threading._TimerCallback.TimerCallback_Context(unknown source) 
System.Threading.ExecutionContext.Run(unknown source) 
System.Threading._TimerCallback.PerformTimerCallback(unknown source) 

This happens for almost every app I have using the newer version, 5.0.6.0 (260kb in size). But when using the older version 1.0.7.30072 (140kb in size) the problem doesnt exist - the TimeoutExpired doesnt occur and the threads are not created. 

-----------------------

I am thinking of just commenting out line 395 to 401 in command.cs and thus removing the timeout and callback to TimeoutExpired to get rid of this problem.

Any comments would be greatly appreciated.

Regards
[23 Jun 2008 23:44] Ted Ekeroth
It seems that I had 5.0.6 "installed" and thus the application ignored the 5.1.6 DLL placed in the application directory and instead used the 5.0.6.

I have now uninstalled the 5.0.6 and hopefully I'm running the app with 5.1.6 this time. I will see if it works better now =)
[1 Dec 2008 14:54] Sergey Morozov
My test is blocked in the transaction.
See code commented "!!! Is blocked here"

using System;
using System.Collections.Generic;
using System.Text;
using log4net.Appender;
using log4net.Config;
using NUnit.Framework;
using NUnit.Framework.SyntaxHelpers;
using System.Threading;
using MySql.Data.MySqlClient;
using log4net;
using System.Data;
using System.Reflection;

namespace MySql.Data.MySqlClient.Tests.Source
{
	[TestFixture]
	public class CorruptThread
	{
		enum CorruptDBThreadState
		{
			Init,
			CheckConnection,
			FirstSelect,
			InTransaction,
			Stopped
		}

		class CorruptDBThread
		{
			private CorruptDBThreadState _state = CorruptDBThreadState.Init;
			private Thread _thread;

			private ILog _logger = LogManager.GetLogger(typeof(CorruptDBThread));

			public CorruptDBThreadState State
			{
				get { return _state; }
			}

			public Thread Thread
			{
				get { return _thread; }
			}

			string GetConnectionString()
			{
				return "server=testsql.analit.net;username=system; password=newpass; database=farm; pooling=true;";
			}

			MySqlConnection GetConnection()
			{
				return new MySqlConnection(GetConnectionString());
			}

			public CorruptDBThread()
			{
				_thread = new Thread(new ThreadStart(ThreadMethod));
				_thread.Name = "PPT" + _thread.ManagedThreadId;
				_thread.Start();
				_logger.Info("Start");
			}

			private void SimpleCheckConnection(MySqlConnection myconn)
			{
				if (myconn.State != ConnectionState.Open)
					myconn.Open();
				try
				{
					myconn.Ping();
					object _now = MySqlHelper.ExecuteScalar(myconn, "select now()");
				}
				finally
				{
					myconn.Close();
				}
			}

			private void ThreadMethod()
			{
				try
				{
					MySqlConnection _connection = GetConnection();
					_logger.Info("GetConnection()");

					_state = CorruptDBThreadState.CheckConnection;
					SimpleCheckConnection(_connection);

					_connection.Open();
					try
					{
						MySqlTransaction _selectTransaction = _connection.BeginTransaction();
						_state = CorruptDBThreadState.FirstSelect;
						try
						{
							DataSet dsColumns = MySqlHelper.ExecuteDataset(_connection, "select * from information_schema.`COLUMNS`");
							Thread.Sleep(500);
							DataSet dsColumnsPrivileges = MySqlHelper.ExecuteDataset(_connection, "select * from information_schema.COLUMN_PRIVILEGES");
						}
						finally
						{
							_selectTransaction.Commit();
						}
						_logger.Info("FirstSelect");
					}
					finally
					{
						_connection.Close();
					}

					Thread.Sleep(1000);

					_connection.Open();
					try
					{
						MySqlTransaction _brokenTransaction = _connection.BeginTransaction();
						_state = CorruptDBThreadState.InTransaction;
						try
						{
							DataSet dsColumns = MySqlHelper.ExecuteDataset(_connection, "select * from information_schema.`COLUMNS`");

							Thread.Sleep(3000);

							_brokenTransaction.Rollback();
						}
						catch (Exception onTransaction)
						{
							_logger.Error("Error in transaction", onTransaction);
							if (_brokenTransaction != null)
								try
								{
									_logger.Info("Begin rollback");
									//!!! Is blocked here
									_brokenTransaction.Rollback();
									_logger.Info("End rollback");
								}
								catch (Exception onRollback)
								{
									_logger.Error("Error on rollback", onRollback);
								}
							throw;
						}
					}
					finally
					{
						_connection.Close();
					}
				}
				catch (Exception exception)
				{
					_logger.Error("Error in thread", exception);
				}
				finally
				{
					_state = CorruptDBThreadState.Stopped;
					_logger.Info("Stop");
				}
			}
		}

		[Test]
		public void CorruptConnectionTest()
		{
			BasicConfigurator.Configure(
				new TraceAppender(
					new log4net.Layout.PatternLayout("%date{ABSOLUTE} [%-5thread] %-5level %logger{1} %ndc - %message%newline")));
			ILog _logger = LogManager.GetLogger(typeof(CorruptThread));

			List<CorruptDBThread> _threads = new List<CorruptDBThread>();

			_logger.Info("Start test");
			_threads.Add(new CorruptDBThread());
			_threads.Add(new CorruptDBThread());

			int _stopCount = 0;
			while (_stopCount < 5)
			{
				Thread.Sleep(100);

				for (int i = _threads.Count - 1; i >= 0; i--)
				{
					if (_threads[i].State == CorruptDBThreadState.Stopped)
					{
						string _deletedThreadName = _threads[i].Thread.Name;
						_threads.RemoveAt(i);
						_stopCount++;
						_logger.InfoFormat("Delete thread {0}", _deletedThreadName);
					}
					else
						if ((_threads[i].State == CorruptDBThreadState.InTransaction) &&
							((_threads[i].Thread.ThreadState & ThreadState.AbortRequested) == 0))
						{
							_threads[i].Thread.Abort();
							_logger.InfoFormat("Call Abort() for thread {0}", _threads[i].Thread.Name);
						}
						else
							if ((_threads[i].State == CorruptDBThreadState.InTransaction) &&
								((_threads[i].Thread.ThreadState & ThreadState.AbortRequested) > 0) &&
								((_threads[i].Thread.ThreadState & ThreadState.WaitSleepJoin) > 0))
							{
								_threads[i].Thread.Interrupt();
								_logger.InfoFormat("Call Interrupt() for thread {0}", _threads[i].Thread.Name);
							}
				}

				while (_threads.Count < 2)
					_threads.Add(new CorruptDBThread());
			}

			_logger.Info("End test");
		}
	}

}
[23 Dec 2010 14:52] Jalal Chaer
Status: 	Can't repeat 
Of course you can't repeat, this happens because the connector is not thread safe ... or your server is not !

I got this error today when I ran a delete query along with another delete query in a multi-threaded way...

and check the above posted sample