Bug #30083 Falcon memory usage grows without bound
Submitted: 26 Jul 2007 20:04
Modified: 29 Aug 2007 13:56
Reporter: Kolbe Kegel
Status: Closed
Category: Server: Falcon
Severity: S1 (Critical)
Version: 6.0.1-alpha
OS: Linux (openSUSE 10.0 2.6.18.2-34-default)
Assigned to: Christopher Powers
Target Version:

[26 Jul 2007 20:04] Kolbe Kegel
Description:
When performing simple INSERT and SELECT operations against a Falcon table, memory usage
of mysqld seems to grow without bound, eventually using all memory on the system.

I am using all default settings for Falcon configuration parameters.

How to repeat:
Starting memory footprint:

$ ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5745 132972 21676 mysqld

CREATE TABLE `f1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=Falcon;

$ ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5876 190712 38104 mysqld

insert into f1 values ();
insert into f1 select null from f1;
insert into f1 select null from f1;
insert into f1 select null from f1;
...

$ mysql -e 'insert into f1 select null from f1;'; ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5876 197248 46252 mysqld

$ mysql -e 'insert into f1 select null from f1;'; ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5876 203768 54196 mysqld

$ mysql -e 'insert into f1 select null from f1;'; ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5876 220652 70016 mysqld

$ mysql -e 'insert into f1 select null from f1;'; ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5876 252372 101612 mysqld

-- continue to do this and watch memory usage
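
To make the growth easier to see, here is a small sketch that computes the RSS increase between the consecutive ps samples quoted above. The list is copied by hand from that output (ps reports RSS in kilobytes); the helper name rss_deltas is only illustrative.

```python
# RSS samples in KB, copied in order from the ps output above.
rss_kb = [38104, 46252, 54196, 70016, 101612]


def rss_deltas(samples):
    """Return the RSS growth between each pair of consecutive samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


deltas = rss_deltas(rss_kb)
print(deltas)
```

The later increments roughly double from one statement to the next, which is what one would expect if memory use tracks the number of rows inserted, since each INSERT ... SELECT doubles the table.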

$ ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5745 132972 21676 mysqld

-- then restart mysqld,
-- observe initial memory usage, 
-- and do this:

select count(*) from f1;

-- observe growth in memory usage

$ ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5383 190720 38120 mysqld
$ mysql -e 'select count(*) from f1'
+----------+
| count(*) |
+----------+
|  6291456 | 
+----------+
$ ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5383 1157392 1007068 mysqld

--- and then, check out this one:

$ mysql -e 'truncate f1'
$ ps axo pid,vsz,rss,comm | grep 'mysqld$'
 5383 2232200 2021116 mysqld

(Then mysql crashes when I try to shut it down, but I suppose that's probably some
unrelated issue.)

Suggested fix:
Falcon absolutely cannot let its memory usage grow without bound. Eventually, it caused my
system to become unresponsive.

Also, it's critical that a simple SELECT not affect memory usage in this way, due to
common sense and obvious DoS side-effects.
[16 Aug 2007 12:55] Hakan Kuecuekyilmaz
Chris Powers implemented a memory limit for Falcon. Please see whether it helps in your
case.

Thanks,

Hakan
[16 Aug 2007 21:45] Christopher Powers
Changesets 1.2666.1.1 and 1.2671

Certain operations can cause the record cache to grow without bound, possibly consuming
all available memory. To prevent this, the engine now sets a fixed upper limit to the
record cache size. When the record cache memory is exhausted, Falcon will throw an
out-of-memory exception and abort the current operation.

The record cache size is controlled by two parameters, falcon_max_record_memory and
falcon_min_record_memory.

falcon_max_record_memory defines the maximum record cache size. The default of 0 means
that Falcon will assign the record cache size according to:

   max_record_memory = MAX(available physical memory*0.7 - (page cache + serial log), 250MB)

In other words, by default, Falcon will allocate up to 70% of available physical memory
for the record cache, capped at 250MB. Values greater than 0 may be used, with a minimum
of 5000000.

falcon_min_record_memory defines the minimum amount of record data that will be stored
within the record cache. The default of 0 means that Falcon will assign a lower record
memory limit as follows:

   min_record_memory = max_record_memory / 4

Values greater than 0 and less than falcon_max_record_memory may be used.

The values for falcon_max_record_memory and falcon_min_record_memory can be viewed via
"show global variables".

When the record cache reaches the "scavenge threshold", unused records will be scavenged
and removed from memory. The scavenger thread activates every 30 seconds. The scavenge
threshold is computed as:

   min record memory + ((max record memory - min record memory) / 5)

For example, if the max record cache is 400MB and the min record cache is 100MB, then the
scavenger will begin removing unused records when the data in the record cache exceeds
250MB.

Previously, the scavenge threshold was set to falcon_max_record_memory. The reason for a
lower scavenge threshold is that the record cache can rapidly grow far above the max
record cache size before the next scavenge operation.

Record memory is checked prior to each insert, update or delete operation. If the record
cache memory is exhausted, then Falcon will throw an exception and the operation will be
aborted.

It is possible for record memory to temporarily exceed falcon_max_record_memory without
error because the record cache is checked only prior to insert, update and delete
operations, and because the scavenger process only runs every 30 seconds. However, Falcon
will always return an error if all physical memory is exhausted.

Notes:

Apart from the record cache, Falcon uses other dynamic memory that may grow during
memory-intensive operations.

Scavenger activity can be viewed by enabling the console and setting the scavenge flag
(512) in falcon_log_mask.

The default max record memory is allocated according to the *available* physical memory at
the time the engine is initialized, not the *total* physical memory. Therefore, processes
active during engine initialization may affect the computed size of the record cache.

The current record cache size can be viewed from the information schema:

   SELECT * FROM information_schema.falcon_record_cache_detail;

The total memory used by Falcon, not including the record cache or memory allocated by the
server, is given by:

   SELECT * FROM information_schema.falcon_system_memory_summary;
[16 Aug 2007 22:05] Christopher Powers
Correction:

In the example, the scavenge threshold should be computed as

   min record memory + ((max record memory - min record memory) / 5)

   100 + (400 - 100)/5 = 160MB
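
As a quick check of the arithmetic, the threshold formula can be written as a few lines of Python. This is only a sketch of the formula as stated in this comment, not engine code; the function name is illustrative and the arguments are in MB.

```python
def scavenge_threshold(min_record_memory, max_record_memory):
    """Threshold = min + one fifth of the gap between min and max."""
    return min_record_memory + (max_record_memory - min_record_memory) / 5


# Example from this report: min = 100MB, max = 400MB.
print(scavenge_threshold(100, 400))
```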
[20 Aug 2007 18:52] Christopher Powers
Another correction. The default maximum record cache size formula is determined with a MIN
and not a MAX function:

max_record_memory = MIN(available physical memory*0.7 - (page cache + serial log), 250MB)
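
A minimal sketch of the corrected formula, for illustration only. The function name and the sample sizes for available memory, page cache and serial log are hypothetical, and treating 1MB as 2**20 bytes is an assumption.

```python
MB = 1024 * 1024  # assumption: 1MB = 2**20 bytes


def default_max_record_memory(available_physical, page_cache, serial_log):
    """Default record cache limit: 70% of available memory minus the
    page cache and serial log, but never more than 250MB."""
    return min(available_physical * 0.7 - (page_cache + serial_log), 250 * MB)


# Hypothetical inputs: plenty of free memory, so the 250MB cap applies.
print(default_max_record_memory(1024 * MB, 4 * MB, 10 * MB) / MB)

# Hypothetical inputs: little free memory, so the limit falls below the cap.
print(default_max_record_memory(200 * MB, 4 * MB, 10 * MB) / MB)
```

With MIN, 250MB acts as a ceiling. With MAX, as originally written, it would have acted as a floor, which contradicts the "capped at 250MB" description.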
[23 Aug 2007 22:39] Christopher Powers
The Falcon memory management parameters have been changed as follows:

falcon_record_memory_max (formerly falcon_max_record_memory) -  A number greater than 5
that represents the fixed upper limit of the size of the record cache. Default is 250M.

falcon_record_scavenge_floor (formerly falcon_min_record_memory) - A number between 10 and
100 that represents the percentage of falcon_record_memory_max that will remain after a
scavenge run. Default is 33.

falcon_record_scavenge_threshold - A number between 10 and 100 that represents the
percentage of falcon_record_memory_max that will cause the scavenger thread to start
removing old generations of records from the record cache. Default is 67.

falcon_initial_allocation - A number larger than 10 that indicates the amount of 
disk space to preallocate for a falcon_user.fts. Default is 1000M.

falcon_data_extent - A number between 1 and 100 that indicates the percent of the current
size of falcon_user.fts to use as the size of the next extension to 
the file. Default is 10.

falcon_disable_fsync - If true, then disable the periodic fsync operation. Default is
false.
[23 Aug 2007 22:43] Christopher Powers
Changes above introduced in changeset 1.2697.
[25 Aug 2007 1:35] Christopher Powers
The memory control parameters have been changed again.

These parameters are no longer supported:

falcon_max_record_memory
falcon_min_record_memory

Three parameters now control the record cache and record scavenging:

1. falcon_record_memory_max - A number greater than 5 that represents the fixed upper
limit of the size of the record cache. Default is 250M. This can also be set dynamically,
for example:

  set global falcon_record_memory_max = 500m;

2. falcon_record_scavenge_threshold - A number between 10 and 100 that represents the
percentage of falcon_record_memory_max that will cause the scavenger thread to start
removing old generations of records from the record cache. Default is 67. This can also be
set dynamically, for example:

  set global falcon_record_scavenge_threshold = 75;

3. falcon_record_scavenge_floor - A number between 10 and 90 that is a percentage of
falcon_record_scavenge_threshold representing the amount of data in the record cache that
will remain after a scavenge run. Default is 50. This can also be set dynamically, for
example:

  set global falcon_record_scavenge_floor = 40;

The minimum size of the record cache, i.e. the amount of data that will remain after
scavenging, can be computed as:

  falcon_record_memory_max * falcon_record_scavenge_threshold/100 * falcon_record_scavenge_floor/100
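
For example, plugging the default values listed in this comment into that formula can be sketched as follows. The function name is illustrative, and the result is in the same unit as the first argument (MB here).

```python
def min_record_cache(record_memory_max, scavenge_threshold_pct, scavenge_floor_pct):
    """Amount of record data expected to remain after a scavenge run."""
    return record_memory_max * scavenge_threshold_pct / 100 * scavenge_floor_pct / 100


# Defaults: falcon_record_memory_max = 250M, threshold = 67, floor = 50.
print(min_record_cache(250, 67, 50))
```

With the defaults, scavenging starts at 67% of 250MB (167.5MB) and leaves half of that amount in the cache.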

The record scavenger awakens every 30 seconds to remove unused records from the record
cache. If at any time record memory appears exhausted during memory allocation, a scavenge
cycle will be forced and the allocation retried. The scavenger is now synchronized to
avoid parallel scavenging and unnecessary multiple scavenges.
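
The forced scavenge and retry can be pictured with a toy model. This is a sketch under loud assumptions, not Falcon's implementation: the class and its members are hypothetical, sizes are plain numbers in MB, and a scavenge is assumed to always succeed in dropping everything above the floor.

```python
class RecordCacheSketch:
    """Toy model only; all names here are hypothetical."""

    def __init__(self, memory_max, threshold_pct=67, floor_pct=50):
        self.memory_max = memory_max
        # Amount assumed to remain after a scavenge run.
        self.floor = memory_max * threshold_pct / 100 * floor_pct / 100
        self.used = 0
        self.scavenges = 0

    def scavenge(self):
        # Assumption: every record above the floor is unused and removable.
        self.used = min(self.used, self.floor)
        self.scavenges += 1

    def allocate(self, size):
        if self.used + size > self.memory_max:
            self.scavenge()  # forced scavenge cycle
            if self.used + size > self.memory_max:
                # Retry failed: give up on this allocation.
                raise MemoryError("record memory exhausted")
        self.used += size


cache = RecordCacheSketch(memory_max=250)
for _ in range(30):
    cache.allocate(10)
print(cache.used, cache.scavenges)
```

In this model the 26th allocation would exceed the limit, so one scavenge is forced and the allocation then succeeds on retry. A request larger than the whole cache still fails after the forced scavenge.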

Two parameters were added to control file system performance:

4. falcon_initial_allocation - A number larger than 10 that indicates the amount of disk
space to preallocate for a falcon_user.fts. Default is 0.

5. falcon_disable_fsync - If true, then disable the periodic fsync operation. Default is
false. This can also be set dynamically as:

  set global falcon_disable_fsync = true;
[29 Aug 2007 13:49] MC Brown
A note has been added to the 6.0.1 changelog. 

The documentation has also been updated to reflect all of the configuration parameter
changes.
[29 Aug 2007 13:56] MC Brown
Closing.