Bug #42505 Falcon: Record backlogging not enabled by the Scavenger
Submitted: 31 Jan 2009 23:22 Modified: 26 May 2010 17:47
Reporter: Christopher Powers Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:6.0-falcon-team OS:Any
Assigned to: Christopher Powers CPU Architecture:Any
Tags: F_BACKLOG, Memory, scavenger

[31 Jan 2009 23:22] Christopher Powers
Description:
The Scavenger fails to enable the backlogging facility when the available record cache memory drops below the record cache floor.

Additionally, the Backlog class uses a tablespace id of -1 which hashes to the -1 slot in SerialLog::tableSpaces[]. From a C++ standpoint, negative array subscripts are not illegal, but the array is not sized to accommodate elements below 0, so a tablespace id of -1 can result in memory corruption.

Finally, once the Backlog is enabled it is never disabled. This, too, should be fixed.

How to repeat:
Set these parameters:

falcon_record_memory_max = 100 MB,
falcon_record_chill_threshold = 2,

Run the testcase from Bug#36442 or any testcase that fills the record cache beyond what the scavenger can handle.

Notes:

1. With the current implementation, backlogging is *enabled* by the Scavenger, but is *triggered* by record chilling.

2. Backlogging is enabled when the Scavenger sets a low memory condition. This occurs when the Scavenger cannot reduce the record cache memory below falcon_record_memory_max * falcon_record_scavenge_threshold/100.

3. When backlogging is enabled, a message will appear in the Falcon record console.

4. Backlogging is triggered by record chilling during a low memory condition..

5. When backlogging is triggered, backlogging messages will appear in the Falcon console.

Currently, record backlogging is very tricky to test. I will post a backlog test script once backlogging is fixed.

Suggested fix:
1. Use the sequence manager to assign a tablespace id to the Backlogger.

2. Use the correct Scavenger memory totals to enable Backlogging.

3. Disable Backlogging when the record cache settles to a nominal level.
[31 Jan 2009 23:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/64759

2990 Christopher Powers	2009-01-31
      Bug #42505, "Falcon: Record backlogging not enabled by the Scavenger"
      
      Fixed several issues with record backlogging.
[2 Feb 2009 6:33] Kevin Lewis
This patch implements the first two suggested fixes forthis bug.  The third suggestion, turn off backlogging, is not implemented.  Chris should decide if this change is complete or if a new bug should be opened to improve backlogging more.

Note that by comparing 'if (recordScavenge.retiredActiveMemory > recordScavengeFloor)', this patch uses a moving target.  This is because retiredActiveMemory is the actual memory after the scavenge cycle.  This actual memory level went down during the scavenge while also going up because of new records allocated.  It does not necessarily reflect how successful the scavenge cycle was.  In fact, this test will almost always succeed after the first scavenge cycle that is over the threshold.  The cycle will only try to move the active memory sown to the floor.  But if any records are allocated at all, chances are, retiredActiveMemory will be higher than the floor.  

I think the following code should be removed from RecordVersion::retire()

-	// Cannot retire this record.  Add up remaining space.
-
-	recordScavenge->recordsRemaining++;
-	recordScavenge->spaceRemaining += getMemUsage();
-
-	for (Record *prior = getPriorVersion(); prior; prior = prior->getPriorVersion())
-		{
-		++recordScavenge->recordsRemaining;
-		recordScavenge->spaceRemaining += prior->getMemUsage();
-		}

Then spaceRemaining should be calcuated based on just how much was actually done during the cycle.  Oprofile results have shown a surprising amount of time spent calling MemMgr::getMemUsage().  The calls above are not needed when it can be calculated as below.

-	recordsRemaining = totalRecords - recordsPruned;
+	recordsRemaining = totalRecords - recordsPruned - recordsRetired;
-	spaceRemaining = totalRecordSpace - spacePruned;
+	spaceRemaining = totalRecordSpace - spacePruned - spaceRetired;

If the above is implemented, the following change from the patch will not be needed;
@@ -1816,7 +1816,7 @@ void Database::scavengeRecords(void)
-	if (recordScavenge.spaceRemaining > recordScavengeFloor)
+	if (recordScavenge.retiredActiveMemory > recordScavengeFloor)
 		setLowMemory();

Then the code will be faster, without making a bunch of calls to getMemUsage(), and it would be more accurate to turn backlogging on.
[3 Feb 2009 2:56] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/64959

2995 Christopher Powers	2009-02-02
      Bug #42505 "Falcon: Record backlogging not enabled by the Scavenger"
      
      Limit updateCardinalities() to once every 20 scavenge cycles. This
      function is too expensive to call every scavenge cycle. 
      
      Made Changes per Kevin's comments in the bug report, with one
      exception: Database::scavengeRecords() still determines low
      memory with retiredActiveMemory > recordScavengeFloor. Using 
      spaceRemaining > recordScavengeFloor does not enable backlogging
      when needed (confirmed against falcon_bug_36294-big.test).
[3 Feb 2009 17:13] Ann Harrison
There are a number of places in the code that treat the backlog
tablespace differently based on its illegal number.  Do you really
need to change that?
[5 Feb 2009 5:37] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65282

2999 Christopher Powers	2009-02-04
      Bug #42505 "Falcon: Record backlogging not enabled by the Scavenger"
      
      Use negative tablespace ids for reserved tablespaces
[5 Feb 2009 6:17] Kevin Lewis
This solution works to keep the negative tablespaceID for the backlog tablespace that 'other code' relies on.  And it has a slot within the hash table.  New patch is also approved.
[13 Feb 2009 7:23] Bugs System
Pushed into 6.0.10-alpha (revid:alik@sun.com-20090211182317-uagkyj01fk30p1f8) (version source revid:christopher.powers@sun.com-20090205053442-3d2okbitpbc5zx1q) (merge vers: 6.0.10-alpha) (pib:6)