Bug #30083 Falcon memory usage grows without bound
Submitted: 26 Jul 2007 18:04 Modified: 29 Aug 2007 11:56
Reporter: Kolbe Kegel
Status: Closed
Category:MySQL Server: Falcon storage engine Severity:S1 (Critical)
Version:6.0.1-alpha OS:Linux (openSUSE 10.0 2.6.18.2-34-default)
Assigned to: Christopher Powers CPU Architecture:Any

[26 Jul 2007 18:04] Kolbe Kegel
Description:
When performing simple INSERT and SELECT operations against a Falcon table, memory usage of mysqld seems to grow without bound, eventually using all memory on the system.

I am using all default settings for Falcon configuration parameters.

How to repeat:
Starting memory footprint:

$ ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5745 132972 21676 mysqld

CREATE TABLE `f1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=Falcon;

$ ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5876 190712 38104 mysqld

insert into f1 values ();
insert into f1 select null from f1;
insert into f1 select null from f1;
insert into f1 select null from f1;
...

$ mysql -e 'insert into f1 select null from f1;'; ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5876 197248 46252 mysqld

$ mysql -e 'insert into f1 select null from f1;'; ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5876 203768 54196 mysqld

$ mysql -e 'insert into f1 select null from f1;'; ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5876 220652 70016 mysqld

$ mysql -e 'insert into f1 select null from f1;'; ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5876 252372 101612 mysqld

-- continue to do this and watch memory usage

$ ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5745 132972 21676 mysqld

-- then restart mysqld,
-- observe initial memory usage, 
-- and do this:

select count(*) from f1;

-- observe growth in memory usage

$ ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5383 190720 38120 mysqld
$ mysql -e 'select count(*) from f1'
+----------+
| count(*) |
+----------+
|  6291456 | 
+----------+
$ ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5383 1157392 1007068 mysqld

--- and then, check out this one:

$ mysql -e 'truncate f1'
$ ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5383 2232200 2021116 mysqld

(Then mysql crashes when I try to shut it down, but I suppose that's probably some unrelated issue.)
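
The ps samples above can be compared numerically. The following is a minimal Python sketch, not part of the original report, that parses the `pid vsz rss comm` lines and computes the RSS growth between samples. The helper names (parse_ps_line, rss_deltas) are hypothetical, and the sketch assumes ps reports vsz and rss in kilobytes, as it does on Linux.

```python
from typing import List, NamedTuple


class PsSample(NamedTuple):
    pid: int
    vsz_kb: int
    rss_kb: int
    comm: str


def parse_ps_line(line: str) -> PsSample:
    """Parse one output line of `ps axo pid,vsz,rss,comm`."""
    pid, vsz, rss, comm = line.split(None, 3)
    return PsSample(int(pid), int(vsz), int(rss), comm.strip())


def rss_deltas(samples: List[PsSample]) -> List[int]:
    """RSS growth in KB between consecutive samples."""
    return [b.rss_kb - a.rss_kb for a, b in zip(samples, samples[1:])]


# Samples copied from the report: one after CREATE TABLE, then one after
# each measured INSERT ... SELECT. The first delta also covers the initial
# inserts that the report elides with "...".
insert_samples = [parse_ps_line(s) for s in (
    " 5876 190712 38104 mysqld",
    " 5876 197248 46252 mysqld",
    " 5876 203768 54196 mysqld",
    " 5876 220652 70016 mysqld",
    " 5876 252372 101612 mysqld",
)]
print(rss_deltas(insert_samples))  # [8148, 7944, 15820, 31596]

# RSS growth caused by the single SELECT COUNT(*) over 6291456 rows.
before = parse_ps_line(" 5383 190720 38120 mysqld")
after = parse_ps_line(" 5383 1157392 1007068 mysqld")
bytes_per_row = (after.rss_kb - before.rss_kb) * 1024 / 6291456
print(round(bytes_per_row, 1))  # 157.7
```

The last two deltas roughly double along with the row count, and the SELECT alone costs on the order of 158 bytes of resident memory per row.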

Suggested fix:
Falcon absolutely cannot let its memory usage grow without bound. Eventually, it caused my system to become unresponsive.

Also, it's critical that a simple SELECT not affect memory usage in this way, due to common sense and obvious DoS side-effects.
[16 Aug 2007 10:55] Hakan Küçükyılmaz
Chris Powers implemented a memory limit for Falcon. Please see whether it helps in your case.

Thanks,

Hakan
[16 Aug 2007 19:45] Christopher Powers
Changesets 1.2666.1.1 and 1.2671

Certain operations can cause the record cache to grow without bound, possibly consuming all available memory. To prevent this, the engine now sets a fixed upper limit on the record cache size. When the record cache memory is exhausted, Falcon will throw an out-of-memory exception and abort the current operation.

The record cache size is controlled by two parameters, falcon_max_record_memory and falcon_min_record_memory.

falcon_max_record_memory defines the maximum record cache size. The default of 0 means that Falcon will assign the record cache size according to:

   max_record_memory = MAX(available physical memory*0.7 - (page cache + serial log), 250MB)

In other words, by default, Falcon will allocate up to 70% of available physical memory for the record cache, capped at 250MB. Values greater than 0 may be used, with a minimum of 5000000.

falcon_min_record_memory defines the minimum amount of record data that will be stored within the record cache. The default of 0 means that Falcon will assign a lower record memory limit as follows:

   min_record_memory = max_record_memory / 4

Values greater than 0 and less than falcon_max_record_memory may be used.

The values for falcon_max_record_memory and falcon_min_record_memory can be viewed via "show global variables".

When the record cache reaches the "scavenge threshold", unused records will be scavenged and removed from memory. The scavenger thread activates every 30 seconds. The scavenge threshold is computed as:

   min record memory + ((max record memory - min record memory) / 5)

For example, if the max record cache is 400MB and the min record cache is 100MB, then the scavenger will begin removing unused records when the data in the record cache exceeds 250MB.

Previously, the scavenge threshold was set to falcon_max_record_memory. The reason for a lower scavenge threshold is that the record cache can rapidly grow far above the max record cache size before the next scavenge operation.

Record memory is checked prior to each insert, update or delete operation. If the record cache memory is exhausted, then Falcon will throw an exception and the operation will be aborted.

It is possible for record memory to temporarily exceed falcon_max_record_memory without error because the record cache is checked only prior to insert, update and delete operations, and because the scavenger process only runs every 30 seconds. However, Falcon will always return an error if all physical memory is exhausted.

Notes:

Apart from the record cache, Falcon uses other dynamic memory that may grow during memory-intensive operations.

Scavenger activity can be viewed by enabling the console and setting the scavenge flag (512) in falcon_log_mask.

The default max record memory is allocated according to the *available* physical memory at the time the engine is initialized, not the *total* physical memory. Therefore, processes active during engine initialization may affect the computed size of the record cache.

The current record cache size can be viewed from the information schema:

   SELECT * FROM information_schema.falcon_record_cache_detail;

The total memory used by Falcon, not including the record cache or memory allocated by the server, is given by:

   SELECT * FROM information_schema.falcon_system_memory_summary;
[16 Aug 2007 20:05] Christopher Powers
Correction:

In the example, the scavenge threshold should be computed as

   min record memory + ((max record memory - min record memory) / 5)

   100 + (400 - 100)/5 = 160MB
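
The same arithmetic as a small Python sketch, added for illustration only. The function names are hypothetical and the units are whatever the caller passes in (MB here); this is not Falcon's code.

```python
def default_min_record_memory(max_record_memory: float) -> float:
    """Default lower limit when falcon_min_record_memory is 0: max / 4."""
    return max_record_memory / 4


def scavenge_threshold(min_record_memory: float, max_record_memory: float) -> float:
    """min + (max - min) / 5, as given in the comment above."""
    if not 0 < min_record_memory < max_record_memory:
        raise ValueError("expected 0 < min_record_memory < max_record_memory")
    return min_record_memory + (max_record_memory - min_record_memory) / 5


# The worked example: max = 400MB, min = 100MB.
print(scavenge_threshold(100, 400))  # 160.0

# With the default minimum (max / 4) the result is the same for max = 400MB.
print(scavenge_threshold(default_min_record_memory(400), 400))  # 160.0
```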
[20 Aug 2007 16:52] Christopher Powers
Another correction. The default maximum record cache size formula is determined with a MIN and not a MAX function:

   max_record_memory = MIN(available physical memory*0.7 - (page cache + serial log), 250MB)
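
A minimal Python sketch of the corrected formula, for illustration. It assumes that "250MB" means 250 * 2**20 bytes; the function name and the page cache and serial log sizes in the examples are hypothetical and are not Falcon defaults.

```python
MB = 1024 * 1024  # assumption: "MB" in the formula means 2**20 bytes


def default_max_record_memory(available_physical: int,
                              page_cache: int,
                              serial_log: int) -> int:
    """MIN(available physical memory * 0.7 - (page cache + serial log), 250MB).

    All arguments and the result are in bytes. The sketch does not handle
    the case where the first term is very small or negative.
    """
    seventy_percent = available_physical * 7 // 10  # integer form of * 0.7
    return min(seventy_percent - (page_cache + serial_log), 250 * MB)


# Plenty of free memory: the 250MB cap applies.
print(default_max_record_memory(2048 * MB, 4 * MB, 10 * MB) // MB)  # 250

# Little free memory: 70% of 300MB is 210MB, minus 14MB leaves 196MB.
print(default_max_record_memory(300 * MB, 4 * MB, 10 * MB) // MB)  # 196
```

With MIN, 250MB acts as a ceiling on the default; with the MAX in the earlier comment it would have been a floor.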
[23 Aug 2007 20:39] Christopher Powers
The Falcon memory management parameters have been changed as follows:

falcon_record_memory_max (formerly falcon_max_record_memory) - A number greater than 5 that represents the fixed upper limit of the size of the record cache. Default is 250M.

falcon_record_scavenge_floor (formerly falcon_min_record_memory) - A number between 10 and 100 that represents the percentage of falcon_record_memory_max that will remain after a scavenge run. Default is 33.

falcon_record_scavenge_threshold - A number between 10 and 100 that represents the percentage of falcon_record_memory_max that will cause the scavenger thread to start removing old generations of records from the record cache. Default is 67.

falcon_initial_allocation - A number larger than 10 that indicates the amount of disk space to preallocate for falcon_user.fts. Default is 1000M.

falcon_data_extent - A number between 1 and 100 that indicates the percent of the current size of falcon_user.fts to use as the size of the next extension to the file. Default is 10.

falcon_disable_fsync - If true, then disable the periodic fsync operation. Default is false.
[23 Aug 2007 20:43] Christopher Powers
Changes above introduced in changeset 1.2697.
[24 Aug 2007 23:35] Christopher Powers
The memory control parameters have been changed again.

These parameters are no longer supported:

falcon_max_record_memory
falcon_min_record_memory

Three parameters now control the record cache and record scavenging:

1. falcon_record_memory_max - A number greater than 5 that represents the fixed upper limit of the size of the record cache. Default is 250M. This can also be set dynamically, for example:

  set global falcon_record_memory_max = 500m;

2. falcon_record_scavenge_threshold - A number between 10 and 100 that represents the percentage of falcon_record_memory_max that will cause the scavenger thread to start removing old generations of records from the record cache. Default is 67. This can also be set dynamically, for example:

  set global falcon_record_scavenge_threshold = 75;

3. falcon_record_scavenge_floor - A number between 10 and 90 that is a percentage of falcon_record_scavenge_threshold representing the amount of data in the record cache that will remain after a scavenge run. Default is 50. This can also be set dynamically, for example:

  set global falcon_record_scavenge_floor = 40;

The minimum size of the record cache, i.e. the amount of data that will remain after scavenging, can be computed as:

  falcon_record_memory_max * falcon_record_scavenge_threshold/100 * falcon_record_scavenge_floor/100
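
As a worked example, here is a short Python sketch of this formula using the defaults listed above (250M, 67, 50). The function name is hypothetical, the documented ranges are assumed to be inclusive, and the result is in the same unit as the first argument.

```python
def record_cache_floor(record_memory_max: float,
                       scavenge_threshold_pct: int = 67,
                       scavenge_floor_pct: int = 50) -> float:
    """Amount of record data expected to remain after a scavenge run."""
    # Ranges taken from the parameter descriptions; inclusive bounds assumed.
    if not 10 <= scavenge_threshold_pct <= 100:
        raise ValueError("falcon_record_scavenge_threshold must be 10..100")
    if not 10 <= scavenge_floor_pct <= 90:
        raise ValueError("falcon_record_scavenge_floor must be 10..90")
    return record_memory_max * scavenge_threshold_pct / 100 * scavenge_floor_pct / 100


# Defaults: scavenging starts at 250M * 67% = 167.5M and leaves half of that.
print(record_cache_floor(250))  # 83.75

# The values from the "set global" examples above: 500m, 75, 40.
print(record_cache_floor(500, 75, 40))  # 150.0
```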

The record scavenger awakens every 30 seconds to remove unused records from the record cache. If at any time record memory appears exhausted during memory allocation, a scavenge cycle will be forced and the allocation retried. The scavenger is now synchronized to avoid parallel scavenging and unnecessary multiple scavenges.
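
The forced-scavenge-and-retry behavior can be illustrated with a toy model. The Python sketch below is not Falcon's implementation: the class and its members are hypothetical, everything above the floor is assumed to be reclaimable, and the 30-second timer is left out. It only shows the pattern of trying an allocation, forcing one scavenge cycle on exhaustion, retrying once, and serializing scavenges with a lock.

```python
import threading


class RecordCacheSketch:
    """Toy bounded cache that scavenges and retries when it runs out."""

    def __init__(self, max_bytes: int, floor_bytes: int):
        self.max_bytes = max_bytes
        self.floor_bytes = floor_bytes
        self.used = 0
        self.scavenge_count = 0
        self._lock = threading.Lock()           # protects the counters
        self._scavenge_lock = threading.Lock()  # one scavenge cycle at a time

    def _try_allocate(self, nbytes: int) -> bool:
        with self._lock:
            if self.used + nbytes > self.max_bytes:
                return False
            self.used += nbytes
            return True

    def scavenge(self) -> None:
        """Shrink usage to the floor unless another thread is already doing so."""
        if not self._scavenge_lock.acquire(blocking=False):
            # A scavenge is in progress: wait for it, do not start a second one.
            with self._scavenge_lock:
                return
        try:
            with self._lock:
                self.used = min(self.used, self.floor_bytes)
                self.scavenge_count += 1
        finally:
            self._scavenge_lock.release()

    def allocate(self, nbytes: int) -> None:
        """Allocate; on exhaustion force one scavenge cycle and retry once."""
        if self._try_allocate(nbytes):
            return
        self.scavenge()
        if not self._try_allocate(nbytes):
            raise MemoryError("record memory exhausted")


cache = RecordCacheSketch(max_bytes=100, floor_bytes=40)
cache.allocate(90)  # fits
cache.allocate(30)  # does not fit: scavenge to 40, then the retry succeeds
print(cache.used, cache.scavenge_count)  # 70 1
```

A request that still does not fit after the forced scavenge raises MemoryError, which mirrors the out-of-memory exception described earlier in this report.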

Two parameters were added to control file system performance:

4. falcon_initial_allocation - A number larger than 10 that indicates the amount of disk space to preallocate for falcon_user.fts. Default is 0.

5. falcon_disable_fsync - If true, then disable the periodic fsync operation. Default is false. This can also be set dynamically as:

  set global falcon_disable_fsync = true;
[29 Aug 2007 11:49] MC Brown
A note has been added to the 6.0.1 changelog. 

The documentation has also been updated to reflect all of the configuration parameter changes.
[29 Aug 2007 11:56] MC Brown
Closing.