Description:
After upgrading to MySQL Connector/NET 8.0.32.1 (though I verified the problem was introduced in 8.0.31), we started having server crashes in production.
After some experimentation and research, I found that connections were keeping transactions open and in a bad state. Because these connections were recycled, they also poisoned the connection pool.
Over time the pool would hit a tipping point where the odds of getting a bad connection were high enough that the application server would start to die/crash.
The bad transactions/connections would also hold locks on specific rows, which made the incident more likely to occur when an insert was retried.
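
For reference, here is a rough C# sketch of how the lingering transactions and the row locks they hold can be observed on the server side while the repro below runs. It only queries the standard information_schema.innodb_trx and performance_schema.data_locks tables; the connection string is a placeholder.

// Diagnostic sketch only (not part of the repro): lists transactions the server
// still considers open and the row locks they hold.
using System;
using MySql.Data.MySqlClient;

public static class PoolDiagnostics
{
    public static void DumpOpenTransactions(string connectionString)
    {
        using (var conn = new MySqlConnection(connectionString))
        {
            conn.Open();

            // Open transactions; stale ones show old trx_started values.
            using (var trxCmd = new MySqlCommand(
                "SELECT trx_id, trx_state, trx_started, trx_rows_locked " +
                "FROM information_schema.innodb_trx;", conn))
            using (var trxReader = trxCmd.ExecuteReader())
            {
                while (trxReader.Read())
                {
                    Console.WriteLine($"trx {trxReader["trx_id"]} ({trxReader["trx_state"]}) " +
                                      $"started {trxReader["trx_started"]}, rows locked: {trxReader["trx_rows_locked"]}");
                }
            }

            // Row/record locks currently held (MySQL 8.0).
            using (var lockCmd = new MySqlCommand(
                "SELECT engine_transaction_id, object_name, lock_type, lock_mode, lock_data " +
                "FROM performance_schema.data_locks;", conn))
            using (var lockReader = lockCmd.ExecuteReader())
            {
                while (lockReader.Read())
                {
                    Console.WriteLine($"lock on {lockReader["object_name"]} " +
                                      $"({lockReader["lock_type"]}/{lockReader["lock_mode"]}): {lockReader["lock_data"]}");
                }
            }
        }
    }
}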
How to repeat:
Note that on Connector/NET 8.0.30 this issue does not happen: the first transaction still fails, but the second insert then succeeds.
Create the following schema.
CREATE TABLE `Test` (
  `id` int NOT NULL AUTO_INCREMENT,
  `value` varchar(50) COLLATE latin1_general_cs NOT NULL,
  `uniqueIndex` varchar(255) COLLATE latin1_general_cs DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uniqueIndex` (`uniqueIndex`),
  KEY `value_index` (`value`)
) ENGINE=InnoDB AUTO_INCREMENT=139 DEFAULT CHARSET=latin1 COLLATE=latin1_general_cs;
Then run the following .NET program.
The program succeeds if the line that prints "The item with ID {item.UniqueIndex} and primary key {item.Id} was successfully written and read!" executes.
It succeeds on Connector/NET 8.0.30 and fails on 8.0.31 and later.
using MySql.Data.EntityFramework;
using MySql.Data.MySqlClient;
using System;
using System.ComponentModel.DataAnnotations.Schema;
using System.Configuration;
using System.Data.Entity;
using System.Data.Entity.SqlServer;
using System.Diagnostics;
using System.Linq;
using System.Linq.Expressions;
using System.Threading;
using System.Threading.Tasks;
namespace Shared
{
    public static class Repro
    {
        private static Random random = new Random();

        public static void DoIt()
        {
            // Configure logging
            //MySql.Data.MySqlClient.MySqlTrace.Switch.Level = SourceLevels.All;
            //MySql.Data.MySqlClient.MySqlTrace.Listeners.Add(new TextWriterTraceListener(Console.Out));

            var uniqueId = random.Next(1000000, 9999999).ToString();

            // First attempt: force a command timeout so the transaction fails and is rolled back.
            WriteRowToDbInTransaction(uniqueId, forceTimeout: true);
            // Second attempt: should succeed, but fails on 8.0.31+ because the pooled connection is in a bad state.
            WriteRowToDbInTransaction(uniqueId, forceTimeout: false);

            Console.WriteLine("done");
            Console.ReadKey();
        }

        private static void WriteRowToDbInTransaction(string uniqueId, bool forceTimeout)
        {
            try
            {
                using (var db = new TestContext())
                using (var transaction = db.Database.BeginTransaction())
                {
                    try
                    {
                        db.Database.CommandTimeout = 1;
                        db.Test.Add(new Test
                        {
                            Value = "hello",
                            UniqueIndex = uniqueId
                        });

                        Console.WriteLine("trying to save changes");
                        db.SaveChanges();
                        Console.WriteLine("trying to save commit");

                        if (forceTimeout)
                        {
                            Console.WriteLine("forcing timeout");
                            db.Database.ExecuteSqlCommand("select sleep(2);");
                        }

                        var item = db.Test.Where(x => x.UniqueIndex == uniqueId).FirstOrDefault();
                        Console.WriteLine($"The item with ID {item.UniqueIndex} and primary key {item.Id} was successfully written and read!");

                        transaction.Commit();
                    }
                    catch (Exception)
                    {
                        // The timeout lands here; on 8.0.31+ the connection goes back to the pool in a bad state.
                        transaction.Rollback();
                    }
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
        }
    }

    [DbConfigurationType(typeof(MySqlEFConfiguration))]
    public class TestContext : DbContext
    {
        public TestContext() : base("Database")
        {
            System.Data.Entity.Database.SetInitializer<TestContext>(null);
        }

        public DbSet<Test> Test { get; set; }
    }

    [Table("Test")]
    public class Test
    {
        public int Id { get; set; }
        public string Value { get; set; }
        public string UniqueIndex { get; set; }
    }
}
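
The repro assumes an entry point that calls Repro.DoIt() and an App.config connection string named "Database" (the name passed to the TestContext constructor). A minimal sketch of those missing pieces follows; the server address, credentials, and database name are placeholders.

// Hypothetical entry point, not part of the program above.
// It assumes an App.config with a <connectionStrings> entry named "Database", e.g.:
//   <connectionStrings>
//     <add name="Database"
//          providerName="MySql.Data.MySqlClient"
//          connectionString="server=localhost;port=3306;database=test;uid=user;pwd=password" />
//   </connectionStrings>
using Shared;

public static class Program
{
    public static void Main()
    {
        Repro.DoIt();
    }
}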
Suggested fix:
I suspect that the change described in https://bugs.mysql.com/bug.php?id=107110 might be related to the problem.
The nature of that work had an impact on transactions and on the way connection pooling works.
Bug https://bugs.mysql.com/bug.php?id=109846 also lists 107110 as a suspected cause.
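
As a side note (not a fix, only a sketch to help confirm the pool-poisoning theory): clearing the pool after a failed transaction, or disabling pooling entirely, avoids reusing the bad connections. Both options already exist in MySql.Data; the connection string below is a placeholder.

// Sketch only: ways to rule pooling in or out while diagnosing.
using MySql.Data.MySqlClient;

public static class PoolingWorkarounds
{
    // Option 1: flush every idle pooled connection so the next open creates a fresh one.
    public static void FlushPools()
    {
        MySqlConnection.ClearAllPools();
    }

    // Option 2: bypass pooling entirely for a diagnostic run.
    public static MySqlConnection OpenUnpooled()
    {
        var conn = new MySqlConnection(
            "server=localhost;port=3306;database=test;uid=user;pwd=password;Pooling=false");
        conn.Open();
        return conn;
    }
}

If the second insert starts succeeding with Pooling=false, that points at the recycled connection state rather than at the server.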