Bug #45512 | Out of memory on prepared statement bulk insert | ||
---|---|---|---|
Submitted: | 16 Jun 2009 2:13 | Modified: | 15 Oct 2009 10:58 |
Reporter: | Chris Khan | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server: Prepared statements | Severity: | S3 (Non-critical) |
Version: | 6.0.9 | OS: | Windows |
Assigned to: | | CPU Architecture: | Any |
Tags: | bulk insert, out of memory, prepared statements |
[16 Jun 2009 2:13]
Chris Khan
[16 Jun 2009 5:12]
Sveta Smirnova
Thank you for the report. Please provide the part of your code showing how you do the bulk insert.
[16 Jun 2009 14:36]
Chris Khan
.NET Connector 6.0, very straightforward code:

    MySqlConnection connection = new MySqlConnection(connectionString);
    connection.Open();
    MySqlTransaction transaction = connection.BeginTransaction();
    command.Connection = connection;
    command.Transaction = transaction;
    command.Prepare();
    using (MySqlDataReader reader = command.ExecuteReader())
    {
        code...
    }
    code...

The SQL is basically:

    SET @var11 = ?var11;
    repeat 11k times...
    INSERT INTO `table` (`col1`, ..., `col11`)
    VALUES (@var11, ..., @var111),
           (@var21, ..., @var211),
           repeat 1k times
[16 Jun 2009 20:20]
Sveta Smirnova
Thank you for the feedback.

> SET @var11 = ?var11;
> repeat 11k times...

This allocates 11000 * size_of_var bytes, and this memory is not freed until you close the connection. You have TEXT columns in your table, so I expect it can be quite big. A memory error is expected in this case, so this is not a MySQL bug. Please follow the suggestion you got at the forum.
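As a rough illustration of the allocation being described, the arithmetic can be sketched as follows. This is only a back-of-the-envelope estimate; the average value size and the per-variable overhead are assumed figures for illustration, not measured MySQL internals.

```python
# Back-of-the-envelope estimate for memory held by the user variables.
# avg_value_bytes and overhead_bytes are illustrative assumptions,
# not measured MySQL internals.
num_vars = 11 * 1000               # one SET @var per column per row
avg_value_bytes = 200              # assumed average TEXT value size
overhead_bytes = 26                # assumed per-variable bookkeeping
total_bytes = num_vars * (avg_value_bytes + overhead_bytes)
print(total_bytes)                 # 2486000 -> on the order of 2.5 MB
```

At these assumed sizes the total lands in the low-megabyte range, which is the same order of magnitude as the "2490020 bytes" figure discussed below.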
[16 Jun 2009 20:46]
Chris Khan
I thought that might be the case, but there are a few reasons I'm pretty certain it's not as simple as a memory issue. First, this worked fine on version 5.0, and very quickly. Second, looking at my system resources, the memory consumed seems to be on the order of 1GB. The maximum size of each row (in the particular case where I get the error) is less than 1KB. At 1K rows, that should be on the order of 1MB. Third, the error that I pasted in my forum post states that the memory needed was "2490020 bytes". That matches the expected size if each variable and placeholder is stored (~2MB), but does not match the system resources used. Finally, I can't imagine why ~2MB would overload the system. I didn't notice any server parameters that clearly pointed to ~2MB being enough to crash the server. Regrettably, I did not test this with varchar columns in place of those text columns for a little added information, but I am fairly confident that it wouldn't have made a difference, unless there was a bug with MySQL. The maximum size of each text column was definitely <300 bytes. I wish I knew more about MySQL to be certain about some of this, but I know enough to know that this is very likely either a bug in MySQL functionality or in its error handling and messages. If you're still not convinced, can you at least point me to the server variable that would not allow ~2MB worth of query variables?
[16 Jun 2009 21:10]
Sveta Smirnova
Thank you for the feedback. If you are not allocating much data, it is worth testing for a real regression. Please provide one set of real values (for 1 row), so we can estimate the real size. Do you run the same set of queries concurrently? Also, you specified you use prepared statements. Do you use 11k prepared statements to execute the following code?

> SET @var11 = ?var11;
> repeat 11k times...

Which version of c/NET do you use?

> Third, the error that I pasted in my forum post states that the memory needed was "2490020 bytes".
> That matches the expected size if each variable and placeholder is stored (~2MB),
> but does not match the system resources used.

This should not match the system resources used. mysqld can fail because it needs even 1 byte, if this 1 byte can not be allocated.
[16 Jun 2009 21:45]
Chris Khan
Here are descriptions of the entire data set, which is 25k rows:

    `column1` any 16 bytes
    `column2` any 16 bytes
    `column3` any 16 bytes
    `column4` any date
    `column5` roughly flat distribution from 30 to 100 characters; 7 entries with ~300 characters, 3 with ~1300 characters, and 3 with ~1800 characters; utf-8, all single-byte characters
    `column6` probably an exponential distribution from 4 to 1500 characters; only 600 (~2%) of the values are over 200 bytes; utf-8, all single-byte characters
    `column7` any 4 bytes
    `column8` 0 bytes in every case

I underestimated my maximum values, but the 1000 entries should have actually been from the lighter side of the data set. I hope this is descriptive enough. The .NET Connector version is 6.0.2.0. I only mentioned the system resources because there was a huge discrepancy. It appeared that ~1.5GB were being used, when I've never seen it go over ~400MB. I admit that I don't usually pay too much attention to these numbers, but this was too much to go unnoticed. My system was crippled during this query, which hasn't happened before.
[16 Jun 2009 22:06]
Sveta Smirnova
Thank you for the feedback. Do you run the same set of queries concurrently? I assume not. Also, you specified you use prepared statements. Do you use 11k prepared statements to execute the following code?
[16 Jun 2009 22:13]
Chris Khan
Sorry, I got caught up with the data set. There are several queries being run synchronously by a single thread, so there should be no concurrency issues, as far as I know. The SQL related to this query is sent in one single query. The 11k SET and the bulk INSERT are sent together.
[2 Jul 2009 16:05]
MySQL Verification Team
Thank you for the feedback. Are you able to provide a complete test case? Thanks in advance.
[2 Jul 2009 17:06]
Chris Khan
What is missing?
[17 Sep 2009 8:57]
Sveta Smirnova
Test case used:

    let $i=2000;
    let $columns=;
    let $fields=;
    let $values=;
    while ($i)
    {
      --eval set @var$i='foobarfoobarfoobar'
      let $columns=`select concat('$columns','col$i,')`;
      let $values=`select concat('$values','@var$i,')`;
      let $fields=`select concat('$fields','col$i char(25),')`;
      --exec echo col$i, >> $MYSQL_TEST_DIR/var/tmp/columns
      --exec echo @var$i, >> $MYSQL_TEST_DIR/var/tmp/values
      --exec echo "col$i char(25)," >> $MYSQL_TEST_DIR/var/tmp/fields
      dec $i;
    }
    let $columns=`select trim(trailing ',' from '$columns')`;
    let $values=`select trim(trailing ',' from '$values')`;
    let $fields=`select trim(trailing ',' from '$fields')`;
    --eval create table t1($fields)
    --eval prepare psmt from "insert into t1 ($columns) values($values)"
    execute psmt;
[17 Sep 2009 9:03]
Sveta Smirnova
Thank you for the feedback. I can not repeat the described behavior using the test case provided. Please try with the current version of mysql-trunk available at https://launchpad.net/mysql and, if the problem still exists, provide a test case similar to the one I did.
[17 Sep 2009 10:27]
Chris Khan
It was all one SQL string, one prepared statement that was executed one time. 1k rows and 11 columns of about 20 bytes per column.

    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 1000; i++)
    {
        for (int j = 0; j < 11; j++)
        {
            sb.Append("SET @var11 = ?var11;");
        }
    }
    for (int i = 0; i < 1000; i++)
    {
        sb.Append("INSERT INTO `table` (`col1`, ..., `col11`)");
        sb.Append("VALUES (@var11, ..., @var111),");
        sb.Append("(@var21, ..., @var211),");
    }
    MySqlCommand comm = new MySqlCommand(sb.ToString(), connection);
    comm.Prepare();
    MySqlDataReader reader = comm.ExecuteReader();

If you tried it all in one string similar to that, then letting it go sounds fine to me.
[13 Oct 2009 11:21]
Tonci Grgin
Hi Chris. Can you please check on my test case in Bug#29010 ([15 Jun 2007 10:01] Tonci Grgin) and attach similar one (self-sufficient, compact ...)?
[13 Oct 2009 20:00]
Chris Khan
Self-sufficiency and compactness seem to be competing qualities, and it appears we are having issues with self-sufficiency, so I am giving you some fairly complete code. If you're unable to use this .NET code to create a test case, I'm not sure that I will be able to help you unless you can identify specific shortcomings.

    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 1000; i++)
    {
        for (int j = 0; j < 11; j++)
        {
            sb.Append(string.Concat("SET @var-", i.ToString(), "-", j.ToString(),
                " = ?var-", i.ToString(), "-", j.ToString(), ";\r\n"));
        }
    }
    sb.Append("INSERT INTO `table` (");
    for (int i = 0; i < 11; i++)
    {
        sb.Append(string.Concat("`col", i.ToString(), "`"));
        if (i < 10) sb.Append(", ");
    }
    sb.Append(")\r\n");
    sb.Append("VALUES ");
    for (int i = 0; i < 1000; i++)
    {
        sb.Append("(");
        for (int j = 0; j < 11; j++)
        {
            sb.Append(string.Concat("@var-", i.ToString(), "-", j.ToString()));
            if (j < 10) sb.Append(", ");
        }
        sb.Append(")");
        if (i < 999) sb.Append(",\r\n");
    }
    MySqlCommand comm = new MySqlCommand(sb.ToString(), connection);
    comm.Prepare();
    MySqlDataReader reader = comm.ExecuteReader();
[14 Oct 2009 11:20]
Tonci Grgin
Chris, don't get me wrong, but I must insist on a working test case for two reasons:

o If I do not see the entire code that produces the error on your box, I might not be able to reproduce the problem correctly.
o If I have to guess what you did (like what you put in the connection string, what your table structure is, etc.), I lose time, which is not fair to other reporters waiting on their problems to be solved.

Since this support is completely free, I think I'm entitled to the above, as much as you're entitled to fair checking on my side, so we can all together make MySQL better. Now, your "test case" lacks basic info like table structure, connection string and so on. It is also full of typos... So I'm pasting a new one which you can modify so that it fails on your box, and then paste the full test back again:

    using (MySqlConnection cn = new MySqlConnection("DataSource=xx;Database=xx;UserID=xx;Password=xx;PORT=xx;logging=True;charset=utf8;Allow User Variables=true;Allow Batch=true"))
    {
        cn.Open();
        try
        {
            MySqlCommand cmdCreateTable = new MySqlCommand("DROP TABLE IF EXISTS bug45512", cn);
            cmdCreateTable.ExecuteNonQuery();
            cmdCreateTable.CommandText = "CREATE TABLE `bug45512`(`column1` CHAR(16) NOT NULL,`column2` CHAR(16) NOT NULL,`column3` CHAR(16) NOT NULL,`column4` CHAR(16),`column5` CHAR(16),`column6` CHAR(16),`column7` CHAR(16),`column8` CHAR(16)) ENGINE=InnoDB DEFAULT CHARSET=utf8;";
            cmdCreateTable.ExecuteNonQuery();
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1000; i++)
            {
                for (int j = 1; j < 9; j++)
                {
                    sb.Append(string.Concat("SET @var-", i.ToString(), "-", j.ToString(), " = ?var-", i.ToString(), "-", j.ToString(), ";\r\n"));
                }
            }
            sb.Append("INSERT INTO `bug45512` (");
            for (int i = 1; i < 9; i++)
            {
                sb.Append(string.Concat("`column", i.ToString(), "`"));
                if (i < 8) sb.Append(", ");
            }
            sb.Append(")\r\n");
            sb.Append("VALUES ");
            for (int i = 0; i < 1000; i++)
            {
                sb.Append("(");
                for (int j = 1; j < 9; j++)
                {
                    sb.Append(string.Concat("@var-", i.ToString(), "-", j.ToString()));
                    if (j < 8) sb.Append(", ");
                }
                sb.Append(")");
                if (i < 999) sb.Append(",\r\n");
            }
            MySqlCommand comm = new MySqlCommand(sb.ToString(), cn);
            comm.Prepare();
            MySqlDataReader reader = comm.ExecuteReader();
        }
        catch (Exception ex)
        {
            Console.Out.WriteLine(DateTime.UtcNow.ToLongTimeString() + " " + "Exception: " + ex.Message);
            throw;
        }
    }
[14 Oct 2009 11:46]
Chris Khan
I'm not running this server or code. It was a bug I encountered and didn't realize it was a bug until I noticed it working on a previous version. I've given an extensive amount of information, and if you don't have tools to quickly generate the appropriate code, or you do and the code didn't trigger a bug, then I guess it's not getting solved.
[15 Oct 2009 9:43]
Tonci Grgin
Ok, a functioning test case would be:

    using (MySqlConnection cn = new MySqlConnection("DataSource=**;Database=**;UserID=**;Password=**;PORT=**;logging=True;charset=utf8;Allow User Variables=true;Allow Batch=true; Ignore prepare=false; Use old syntax=false"))
    {
        cn.Open();
        try
        {
            MySqlCommand cmdCreateTable = new MySqlCommand("DROP TABLE IF EXISTS bug45512", cn);
            cmdCreateTable.ExecuteNonQuery();
            cmdCreateTable.CommandText = "CREATE TABLE `bug45512`(`column1` CHAR(16) NOT NULL,`column2` CHAR(16) NOT NULL,`column3` CHAR(16) NOT NULL,`column4` CHAR(16),`column5` CHAR(16),`column6` CHAR(16),`column7` CHAR(16),`column8` CHAR(16)) ENGINE=InnoDB DEFAULT CHARSET=utf8;";
            cmdCreateTable.ExecuteNonQuery();
            MySqlCommand comm = new MySqlCommand();
            comm.Connection = cn;
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1000; i++)
            {
                for (int j = 1; j < 9; j++)
                {
                    comm.Parameters.AddWithValue("?var_" + i.ToString() + "_" + j.ToString(), i.ToString() + "_" + j.ToString());
                    sb.Append(string.Concat("SET @var_", i.ToString(), "_", j.ToString(), " = ?var_", i.ToString(), "_", j.ToString(), ";\r\n"));
                }
            }
            sb.Append("INSERT INTO `bug45512` (");
            for (int i = 1; i < 9; i++)
            {
                sb.Append(string.Concat("`column", i.ToString(), "`"));
                if (i < 8) sb.Append(", ");
            }
            sb.Append(")\r\n");
            sb.Append("VALUES ");
            for (int i = 0; i < 1000; i++)
            {
                sb.Append("(");
                for (int j = 1; j < 9; j++)
                {
                    sb.Append(string.Concat("@var_", i.ToString(), "_", j.ToString()));
                    //sb.Append(string.Concat("?var_", i.ToString(), "_", j.ToString()));
                    if (j < 8) sb.Append(", ");
                }
                sb.Append(")");
                if (i < 999) sb.Append(",\r\n");
            }
            comm.CommandText = sb.ToString();
            comm.Prepare();
            MySqlDataReader reader = comm.ExecuteReader();
        }
        catch (Exception ex)
        {
            Console.Out.WriteLine(DateTime.UtcNow.ToLongTimeString() + " " + "Exception: " + ex.Message);
            throw;
        }
    }

But this is really stretched (mixing user variables with parameters *in prepared statements*) and I am not sure it should work at all...
[15 Oct 2009 9:46]
Tonci Grgin
Now, a *real* test case taking only PS and parameters into account would be (notice the switch in the remarked lines):

    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 1000; i++)
    {
        for (int j = 1; j < 9; j++)
        {
            comm.Parameters.AddWithValue("?var_" + i.ToString() + "_" + j.ToString(), i.ToString() + "_" + j.ToString());
            //sb.Append(string.Concat("SET @var_", i.ToString(), "_", j.ToString(), " = ?var_", i.ToString(), "_", j.ToString(), ";\r\n"));
        }
    }
    sb.Append("INSERT INTO `bug45512` (");
    for (int i = 1; i < 9; i++)
    {
        sb.Append(string.Concat("`column", i.ToString(), "`"));
        if (i < 8) sb.Append(", ");
    }
    sb.Append(")\r\n");
    sb.Append("VALUES ");
    for (int i = 0; i < 1000; i++)
    {
        sb.Append("(");
        for (int j = 1; j < 9; j++)
        {
            //sb.Append(string.Concat("@var_", i.ToString(), "_", j.ToString()));
            sb.Append(string.Concat("?var_", i.ToString(), "_", j.ToString()));
            if (j < 8) sb.Append(", ");
        }
        sb.Append(")");
        if (i < 999) sb.Append(",\r\n");
    }

And this one works as expected (i.e., no OOM errors). So, "Can't repeat".
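[Editorial note] The pattern in the working test case (binding values as statement parameters instead of emitting thousands of SET @var statements) can be sketched in a self-contained way. This is only an illustration of the multi-row parameterized-insert pattern, using Python's stdlib sqlite3 as a stand-in database so the sketch is runnable; it is not the Connector/NET API.

```python
import sqlite3

# Bind values as statement parameters rather than setting user variables,
# mirroring the parameter-only approach of the working test case above.
# sqlite3 stands in for MySQL here purely to keep the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bug45512 (col1 TEXT, col2 TEXT, col3 TEXT)")

# 1000 rows of small values, each bound as a parameter tuple.
rows = [(f"{i}_1", f"{i}_2", f"{i}_3") for i in range(1000)]
conn.executemany(
    "INSERT INTO bug45512 (col1, col2, col3) VALUES (?, ?, ?)", rows
)
conn.commit()
count = conn.execute("SELECT COUNT(*) FROM bug45512").fetchone()[0]
print(count)  # 1000
```

The key difference from the original report's approach is that no per-value SET statements are accumulated in the SQL text, so the statement size stays bounded regardless of row count.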
[15 Oct 2009 9:48]
Tonci Grgin
Not to mention that "@var-" + i.ToString() + "-" + j.ToString() is not a proper variable name (thus the _).
[15 Oct 2009 10:26]
Chris Khan
It looks like no values are being given to the parameters in this test case. I encountered the bug when the average column width was about 12 bytes across the 8 columns. I don't think batch mode needs to be enabled either. I don't think I had any connection string items, except for maybe "convert zero datetime". I would keep an eye on memory consumption in case it's not quite getting high enough to trigger the error.
[15 Oct 2009 10:35]
Tonci Grgin
Chris, please do. As for "It looks like no values are being given to the parameters in this test case." — that is not correct, see:

    for (int i = 0; i < 1000; i++)
    {
        for (int j = 1; j < 9; j++)
        {
            comm.Parameters.AddWithValue("?var_" + i.ToString() + "_" + j.ToString(), i.ToString() + "_" + j.ToString());
            //sb.Append(string.Concat("SET @var_", i.ToString(), "_", j.ToString(), " = ?var_", i.ToString(), "_", j.ToString(), ";\r\n"));
        }
    }

So 8000 parameters with accompanying values *are* created.
[15 Oct 2009 10:45]
Chris Khan
I scanned it too quickly. That looks like an average width of 4 bytes. This might be a better line for adding values to the parameters, giving an average width of 13 bytes:

    comm.Parameters.AddWithValue("?var_" + i.ToString() + "_" + j.ToString(), i.ToString().PadLeft(6) + "_" + j.ToString().PadLeft(6));
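[Editorial note] A quick check of the arithmetic behind that 13-byte figure (Python is used here just for the length check; the padding mirrors the PadLeft(6) calls above):

```python
# Each value is i padded to 6 chars + "_" + j padded to 6 chars = 13 chars.
# Worst-case indices from the loops above: i up to 999, j up to 8.
value = str(999).rjust(6) + "_" + str(8).rjust(6)
print(len(value))  # 13
```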
[15 Oct 2009 10:49]
Tonci Grgin
Sure, let me see what happens. But, in the end, we *will* hit this or that RAM limitation, even though I have an 8GB box... So the real question is: how much is enough? There are already thousands of lines there.
[15 Oct 2009 10:51]
Tonci Grgin
Proposed change works as expected:

    -- Result --
    Rows affected: 1000
    Inserted Id (if any): 0

Increasing the number of lines to insert.
[15 Oct 2009 10:54]
Tonci Grgin
    -- Result --
    Rows affected: 4000
    Inserted Id (if any): 0

4000 rows with 13-byte values, as proposed, still works as expected... I really think this is enough.
[15 Oct 2009 10:58]
Chris Khan
If the memory consumption for 4000 rows with a length of 13 bytes doesn't exceed 1GB, then it sounds like it's not repeatable.
[15 Oct 2009 11:06]
Tonci Grgin
In my case it has absolutely no visible impact on either the host or the client box...