Bug #64747 LOAD DATA INFILE causes server crash on large rows
Submitted: 23 Mar 2012 16:43 Modified: 18 May 2012 0:59
Reporter: Omar Mehmood Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:5.1.56 OS:Linux (RHEL 5.6)
Assigned to: CPU Architecture:Any
Tags: LOAD DATA INFILE, server crash

[23 Mar 2012 16:43] Omar Mehmood
Description:
When executing LOAD DATA INFILE on a large row (e.g. 627mb), the MySQL server uses up all the free physical and virtual (swap) memory on the machine.  The linux oom killer correctly decides to kill the offending main mysql process, which causes the server to unexpectedly go down (and restart automatically).  While the MySQL server is usable again after the restart, all the various processes executing against the MySQL server have been killed as a side effect.

How to repeat:
Create a database schema.
Create an equivalent test set of data with a particular line delimiter (e.g. '\n') in the order of 600-700mb.
Execute LOAD DATA INFILE with LINES TERMINATED BY '\r\n' (a different delimiter).
Use top to monitor free physical and virtual memory.
Observe that within a few seconds of execution, all free physical memory is taken up and within a few minutes, the swap space is exhausted, at which point, the oom killer kills the main mysql process.

Obviously, there is a user error here of choosing the incorrect delimiter.  However, this does highlight the issue of the LOAD DATA INFILE using a lot more memory than it should actually need.  The above steps were performed a machine that had 36gb of free physical memory and 49gb of free virtual memory.  Processing a single 600mb row should not expand to using well beyond 36gb of free physical memory.

Suggested fix:
Review the code corresponding to LOAD DATA INFILE for memory usage when processing large rows and make the appropriate changes.  If anything, at a minimum, it should at least attempt to calculate the memory necessary to process the command and abort if it exceeds the memory available on the machine.
[18 Apr 2012 0:59] Sveta Smirnova
Thank you for the report.

In my test mysqld used about 300MB physical and 700MB virtual memory for load of 700 MB file, but I can not repeat described problem:

>  The above steps were performed a machine that had 36gb of free physical
> memory and 49gb of free virtual memory.  Processing a single 600mb row should
> not expand to using well beyond 36gb of free physical memory.

Please provide your configuration file, output of SHOW CREATE TABLE for problem table and output of top before and after LOAD DATA statement.
[18 May 2012 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".