Bug #71832 Connector/NET batch insert not working (UpdateBatchSize)
Submitted: 25 Feb 2014 21:40 Modified: 7 Jan 2015 13:15
Reporter: Marcos Albe Email Updates:
Status: Verified Impact on me:
None 
Category:Connector / NET Severity:S3 (Non-critical)
Version:6.8.3, 6.9.5 OS:Any
Assigned to: CPU Architecture:Any

[25 Feb 2014 21:40] Marcos Albe
Description:
When UpdateBatchSize=200 we see the following (WRONG) behavior: 

For 3000 records the adapter will produce

A) 200 "SELECT @@sql_mode" statements (200 because that is the batch size)

followed by 

B) 200 separate single-row INSERT statements (in a single round-trip command) of the form "INSERT foo (bar1,bar2) VALUES (1,2); INSERT foo (bar1,bar2) VALUES (2,3);"

Correct Behavior after suggested fix:

For 3000 records, general log will have a single "SELECT @@sql_mode" statement followed by 15 Multi-row inserts - each with 200 rows per insert statement.  These are of the form "INSERT foo (bar2, bar2) VALUES (1,2), (2,3) <...>(200, 201);"

How to repeat:
// from a prior function:
// MySqlConnection m_dbSource, m_dbTarget;
// These are two already-opened MySQL databases. Server version does not matter.
 
string m_sSourceSelectSQL = "SELECT * FROM foo"; // FOO can be any table on m_dbSource
string m_sTargetTableName = "foo"; // table already created on m_dbTarget, identical to table foo above-but empty
int nRowBatchSize = 200;
MySqlCommand msCommandSelect = m_dbSource.CreateCommand();
 
msCommandSelect.CommandText = m_sSourceSelectSQL;
MySqlDataReader mSqlReader_Source;

mSqlReader_Source = msCommandSelect.ExecuteReader();

// if we did not get any rows
if (!mSqlReader_Source.HasRows)
{
    mSqlReader_Source.Close();
    return 0;
}

// load the datareader into this data table.
DataTable dtSourceDataTable = new DataTable();
dtSourceDataTable.Load(mSqlReader_Source);

// copy into a new datatable - we will use this to insert later.
DataTable dtCopyDataTable = dtSourceDataTable.Copy();

nSourceCount = dtCopyDataTable.Rows.Count;

// close the source - we're done with it.
mSqlReader_Source.Close();

// Get an adapter for the target
MySqlDataAdapter mSqlDA = new MySqlDataAdapter("SELECT * FROM " + m_sTargetTableName, m_dbTarget);

// Create an Insert command and get it ready for an insert.
MySqlCommandBuilder cb = new MySqlCommandBuilder(mSqlDA);

mSqlDA.InsertCommand = cb.GetInsertCommand();
mSqlDA.UpdateBatchSize = nRowBatchSize; // do 1000 at a time?  Hmm...
mSqlDA.InsertCommand.CommandText = " " + mSqlDA.InsertCommand.CommandText;

// SetAdded will mark each row to be inserted.
foreach (DataRow row in dtCopyDataTable.Rows)
{
    row.SetAdded();
}

int nTargetCount = 0;

// this will do the insert right here.
nTargetCount = mSqlDA.Update(dtCopyDataTable);

// NOTE: you must turn on the general log on the target DB server first.
// Then you must check the target’s general log to confirm the behavior. 
// The data result is the same between the deficient and fixed versions.
// The difference between the deficient version and my fixed version is the performance and the absence of
// redundant SELECT @@sql_mode statements, one per row.

Suggested fix:
Line numbers based on http://dev.mysql.com/get/Downloads/Connector-Net/mysql-connector-net-6.8.3.msi

It's a small bug in the Mysql.Data.MySqlClient.Statement object, specifically the BindParameters function. The file is Statement.cs and the current broken line is (line 134 of Statement.cs):

string text = ResolvedCommandText;

This line should be:

string text = batchedCmd.BatchableCommandText;

Because of the way this function is written, ResolvedCommandText -always- contains the initial INSERT statement - it can never begin with a "(". As a result, batched inserts are never done. The very next lines of code are:

if (text.StartsWith("(", StringComparison.Ordinal))
    packet.WriteStringNoNull(", ");
else
    packet.WriteStringNoNull("; ");

Clearly the author's intent was to use the batchCmd's text. But in fact if you search through the entire codebase for batchableCommandText, it is never actually used for anything without the line I changed above. It is set and, it is copied when cloned - but it is never used.

I changed this line, built a version, and it worked as designed (both in multi-row inserts and plain old statements).

I have also eliminated the SELECT @@SQL_MODE statements by changing these lines in

// line 902 of command.cs:
Mysql.Data.MySqlClient.MysqlCommand.GetCommandTextForBatching 

MySqlCommand cmd = new MySqlCommand("SELECT @@sql_mode", Connection);
string sql_mode= StringUtility.ToUpperInvariant(cmd.ExecuteScalar().ToString());

becomes

if (sql_mode == null) {
  MySqlCommand cmd = new MySqlCommand("SELECT @@sql_mode", Connection);
  sql_mode = StringUtility.ToUpperInvariant(cmd.ExecuteScalar().ToString());
}

And sql_mode was added to the class as a private string member variable (line 69 of this file):

private string sql_mode;

(code and fix by Brian Brown from Broadnet.us)
[7 Jan 2015 13:15] Chiranjeevi Battula
Hello Marcos Albe,

Thank you for the bug report.
Verified this behavior on Visual Studio 2013 (C#.Net) with  MySQL Connector/Net 6.9.5.

Thanks,
Chiranjeevi.