Bug #43963 Falcon 305 record memory exhausted during LOAD DATA
Submitted: 30 Mar 2009 15:41 Modified: 26 May 2010 17:52
Reporter: Philip Stoev Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:6.0-falcon-team OS:Any
Assigned to: Christopher Powers CPU Architecture:Any
Tags: F_MEMORY
Triage: Triaged: D1 (Critical)

[30 Mar 2009 15:41] Philip Stoev
Description:
Several internal Falcon customers have reported being unable to use LOAD DATA to load a database of more than a few million rows. One specific example is contained in the private comments in that bug.

How to repeat:
See private comments.

Suggested fix:
Loading data via LOAD DATA should not be constrained by available memory and arbitrarily large data sets should be loadable using a bounded ammount of memory.
[30 Mar 2009 18:06] Philip Stoev
On a 6Gb machine with 2Gb for page cache and 2Gb of record cache, LOAD DATA made no progress beyond 26 million rows and the machine started trashing badly.

top:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
18257 philips   20   0 7851m 5.4g 3612 S 131.1 96.3 119:22.78 mysqld

free:
[philips@fedora10 gentest]$ free
             total       used       free     shared    buffers     cached
Mem:       5871192    5838124      33068          0       2892       7404
-/+ buffers/cache:    5827828      43364
Swap:      5931000    2971712    2959288

SHOW TABLE STATUS:
           Name: ontime_falcon
         Engine: Falcon
        Version: 10
     Row_format: Fixed
           Rows: 26590000
 Avg_row_length: 0
    Data_length: 10000
Max_data_length: 0
   Index_length: 0
      Data_free: 0
 Auto_increment: 601413 <<<<<<<<< NOTE THIS, WHY SO SMALL?
    Create_time: NULL
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL

SHOW PROCESSLIST:
*************************** 1. row ***************************
     Id: 3
   User: root
   Host: localhost
     db: flightstats
Command: Query
   Time: 3104
  State: NULL
   Info: load data infile 'ontime' into table ontime_falcon

DESCRIBE:

mysql> describe ontime_falcon;
+-------------+----------------------+------+-----+---------+----------------+
| Field       | Type                 | Null | Key | Default | Extra          |
+-------------+----------------------+------+-----+---------+----------------+
| id          | int(10) unsigned     | NO   | MUL | NULL    | auto_increment |
| carrier     | char(2)              | NO   | MUL | NULL    |                |
| origin      | char(3)              | NO   | MUL | NULL    |                |
| destination | char(3)              | NO   | MUL | NULL    |                |
| flight_num  | char(5)              | NO   |     | NULL    |                |
| flight_time | smallint(5) unsigned | NO   |     | NULL    |                |
| tail_num    | char(8)              | NO   | MUL | NULL    |                |
| dep_time    | datetime             | NO   |     | NULL    |                |
| arr_time    | datetime             | NO   |     | NULL    |                |
| dep_delay   | smallint(6)          | NO   |     | NULL    |                |
| arr_delay   | smallint(6)          | NO   |     | NULL    |                |
| taxi_out    | smallint(6)          | NO   |     | NULL    |                |
| taxi_in     | smallint(6)          | NO   |     | NULL    |                |
| distance    | smallint(5) unsigned | NO   |     | NULL    |                |
| cancelled   | enum('Y','N')        | NO   |     | NULL    |                |
| diverted    | enum('Y','N')        | NO   |     | NULL    |                |
+-------------+----------------------+------+-----+---------+----------------+
16 rows in set (0.38 sec)

mysql> select * from ontime_falcon limit 1\G
*************************** 1. row ***************************
         id: 1
    carrier: DL
     origin: ATL
destination: ABE
 flight_num: 1408
flight_time: 0
   tail_num: N903DE
   dep_time: 2000-01-01 19:03:00
   arr_time: 2000-01-01 00:00:00
  dep_delay: 63
  arr_delay: 0
   taxi_out: 13
    taxi_in: 0
   distance: 692
  cancelled: N
   diverted: Y
1 row in set (0.19 sec)
[30 Mar 2009 18:12] Philip Stoev
To reproduce:

#1 download the datadir in myisam format from (1Gb compressed)

http://dl.flightstats.us/files/data/flightstats-20060216.tar.gz

#2 create the merge tables by downloading

http://dl.flightstats.us/files/schema/ontime-merge.sql

and editing TYPE= to be ENGINE=

#3 Dump the database from MyISAM

select * from ontime_all into outfile 'ontime'; 

#4 Create a falcon table like ontime_all and load the data:

load data fromfile 'ontime' into table ontime_falcon;
[30 Apr 2009 17:08] Christopher Powers
Detailed instructions:

1. Download the datadir in MyISAM format from (1.3 GB compressed)

   http://dl.flightstats.us/files/data/flightstats-20060216.tar.gz

2. Extract into MySQL data directory:

   cd ./mysql/var/mysqld.1/data/ (or wherever)
   tar -xvf flightstats-20060216.tar.gz
   Creates ./flightstats directory (and "flightstats" mysql database)

3. Download merge table script: 

   http://dl.flightstats.us/files/schema/ontime-merge.sql

4. Change TYPE=MERGE to ENGINE=MERGE

5. Create merge tables:

   mysql> use flightstats;
   mysql> source ontime-merge.sql

6. Dump the database from MyISAM:

   mysql> select * from ontime_all into outfile 'ontime_all';

7. Create a Falcon table like ontime_all:

   mysql> create table ontime_all_falcon like ontime_all;
   mysql> alter table ontime_all_falcon engine=falcon;

8. Load the data:

   mysql> load data infile 'ontime_all' into table ontime_all_falcon;