Bug #2438 Runaway memory consumption
Submitted: 18 Jan 2004 16:42 Modified: 2 Feb 2004 9:34
Reporter: T.J.
Status: Closed
Category: MySQL Server    Severity: S2 (Serious)
Version: 4.1.1    OS: Linux (Mandrake Linux 9.2 / Windows XP)
Assigned to: Victor Vagin    CPU Architecture: Any

[18 Jan 2004 16:42] T.J.
Description:
This problem has surfaced for me on both the Windows XP and Linux versions of 
4.1. I use a Perl script to iterate through about 14 gigabytes of text files 
and enter each line into a different table in a MySQL database depending on 
its content. In 4.0.17 (and earlier) I accomplished this with a SELECT 
COUNT(*) ... WHERE name = 'yada' and so on, to make sure there were no 
duplicate rows. If the SELECT COUNT(*) result was 0, the script would proceed 
to do an INSERT INTO TABLE (column1, column2) VALUES ('value1', 'value2'). 
Each open file fell into one of three categories, and depending on the 
category the insert would have anywhere from 70 to 140 columns. If the count 
value was 1, it would instead do an UPDATE ... SET column = 'whatever' with a 
similar column count to the insert.

Entering all of that info into MySQL <= 4.0.17 worked, for the most part (I 
was having trouble with the count/update sequence being reliable, which is why 
I wanted to switch over to 4.1 with ON DUPLICATE KEY UPDATE). Which brings me 
to the single statement that has been causing the problem. It is just an 
INSERT INTO TABLE (columns) VALUES (values) ON DUPLICATE KEY UPDATE 
name = 'whatever', temp = 'new'.

The script begins iterating through the first files and runs into the first 
fairly large text file, which is 29 megabytes. 4.0.17 runs through this file 
fairly quickly, but not 4.1. In fact, on this file alone about 1.8 gigabytes 
of memory is consumed. After consuming 1.8 GB it finishes the file and moves 
on, until it hits the next large file, and fairly quickly the script dies 
complaining that there is no memory left, reporting a count of how many bytes 
it needed (I have 2 GB of RAM, by the way). And yes, I have the close(FILE) 
calls in place in the Perl script, and nothing I can see within the script 
would cause MySQL to gobble the memory and not return it, especially 
considering it ran through basically the same workload smoothly on 4.0.17. On 
a side note, all of the columns have been set to VARCHAR(100); I just thought 
of that and perhaps it may be useful.
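
For illustration, a minimal sketch of the statement shape just described, 
driven through DBI the way the script does it. The connection details, table, 
and column names (testing, mytable, name, temp) are placeholders rather than 
the real schema, and the real inserts carry 70 to 140 VARCHAR(100) columns 
rather than two:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Placeholder connection details and schema; adjust to taste.
my $dbh = DBI->connect('DBI:mysql:database=testing;host=localhost',
                       'user', 'password', { RaiseError => 1 });

# The single-statement upsert form that triggers the runaway memory use.
my $sth = $dbh->prepare(
    q{INSERT INTO mytable (name, temp) VALUES (?, ?)
      ON DUPLICATE KEY UPDATE name = ?, temp = ?});

while (my $line = <STDIN>) {
    chomp $line;
    my ($name, $temp) = split /\|/, $line;   # assumed pipe-delimited input
    $sth->execute($name, $temp, $name, $temp);
}

$dbh->disconnect;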

How to repeat:
Well, the only time I have seen this error is with this script and with these 
large files. I guess the best way to re-create the problem would be to create 
some very large delimited text files, fill them with garbage for testing 
purposes, and use the script to insert them into MySQL 4.1 with INSERT ... ON 
DUPLICATE KEY UPDATE as the query. The standard DBI was used, so I don't 
believe anything obscure besides MySQL is the cause. But I could be wrong. 
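
A rough sketch of such a generator, assuming an arbitrary pipe-delimited 
layout; the field count, field width, and output file name are guesses chosen 
only to land near the file sizes mentioned above:

#!/usr/bin/perl
# Writes roughly 26 MB of pipe-delimited garbage rows to bigfile.txt.
use strict;
use warnings;

open my $fh, '>', 'bigfile.txt' or die "open: $!";
my @chars = ('a' .. 'z', '0' .. '9');

for my $row (1 .. 200_000) {
    # 10 fields of 12 random characters each, roughly 130 bytes per line.
    my @fields = map { join '', map { $chars[rand @chars] } 1 .. 12 } 1 .. 10;
    print {$fh} join('|', @fields), "\n";
}
close $fh or die "close: $!";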

Suggested fix:
I am probably just going to downgrade back to 4.0.17 for now and go back to 
the select count(*), insert, update montage. But if anyone finds a better fix 
than that I would be interested to hear it :)
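
For reference, the count/insert/update montage boils down to something like 
the following, with placeholder names again; the window between the SELECT and 
the INSERT is presumably what made this approach unreliable in the first 
place:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Placeholder connection and schema, as in the earlier sketch.
my $dbh = DBI->connect('DBI:mysql:database=testing;host=localhost',
                       'user', 'password', { RaiseError => 1 });

# Check-then-write upsert: another client can insert the same row
# between the SELECT and the INSERT, so this is inherently racy.
sub upsert_row {
    my ($dbh, $name, $temp) = @_;
    my ($count) = $dbh->selectrow_array(
        q{SELECT COUNT(*) FROM mytable WHERE name = ?}, undef, $name);
    if ($count == 0) {
        $dbh->do(q{INSERT INTO mytable (name, temp) VALUES (?, ?)},
                 undef, $name, $temp);
    } else {
        $dbh->do(q{UPDATE mytable SET temp = ? WHERE name = ?},
                 undef, $temp, $name);
    }
}

upsert_row($dbh, 'yada', 'new');
$dbh->disconnect;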
[19 Jan 2004 7:08] Sergei Golubchik
What does your script do? Can you attach it to the bug report (using the [Files] tab)?
[20 Jan 2004 0:57] T.J.
One of three necessary scripts. This will create the table. You must already have a database called testing created for it to work.

Attachment: CreateTable.pl (text/x-perl), 3.60 KiB.

[20 Jan 2004 0:58] T.J.
One of three necessary scripts. This will create the big text file for the main script. Creates a text file that is ~26 MB.

Attachment: makebig.pl (text/x-perl), 925 bytes.

[20 Jan 2004 0:59] T.J.
One of three necessary scripts. This is the main script file.

Attachment: test.pl (text/x-perl), 2.62 KiB.

[20 Jan 2004 1:03] T.J.
Due to the nature of the scripts and the sensitivity of the data, I was forced 
to re-create the script for these testing purposes. I tested the scripts on a 
Windows XP machine running 4.1; since the same thing happened as before, I 
assume the effect on Linux will be similar to my original script. Within the 
test script I have also included the SELECT count(*)...INSERT...UPDATE 
sequence, commented out. After confirming the runaway memory consumption with 
INSERT...ON DUPLICATE KEY UPDATE, you may try commenting that sequence out and 
uncommenting the other one to prove that it is in fact the mysql statement out 
of control, and not the script. Oh, and earlier I mentioned that DBI was 
necessary for Perl; apparently DBI doesn't always include DBD::mysql, so make 
sure you have that installed as well.
[20 Jan 2004 1:05] T.J.
Also, in case it wasn't obvious in my previous post, I have included the three  
files necessary to re-create the bug in the files section.
[21 Jan 2004 9:18] Dean Ellis
Verified against 4.1.2/Linux using the supplied scripts. I also ran the queries that the script produces directly through the mysql client, with the same result.

Thank you for the test case!
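
For anyone who wants to take DBI out of the picture the same way, a generator 
along these lines (with illustrative names, not the ones the test script uses) 
can be piped straight into the mysql client:

#!/usr/bin/perl
# Emit many upsert statements; pipe into the client, e.g.
#   perl genqueries.pl | mysql testing
use strict;
use warnings;

for my $i (1 .. 500_000) {
    printf "INSERT INTO mytable (name, temp) VALUES ('row%d', 'x') "
         . "ON DUPLICATE KEY UPDATE temp = 'y';\n", $i;
}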
[1 Feb 2004 10:30] Victor Vagin
Thank you very much for such a detailed bug report.
[2 Feb 2004 9:34] Victor Vagin
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html
[2 Jun 2004 0:10] Steven Horowitz
Is this bug present in 4.0.18, or only in 4.1.1?