Bug #42340 Falcon assertion (oldestVisible->state != recLock) in RecordLeaf::pruneRecords
Submitted: 26 Jan 2009 9:27 Modified: 15 May 2009 15:17
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S1 (Critical)
Version:6.0-falcon-team OS:Any
Assigned to: Kevin Lewis CPU Architecture:Any
Tags: F_RECORD TREE
Triage: Triaged: D1 (Critical)

[26 Jan 2009 9:27] Philip Stoev
Description:
When executing a simple random workload in 4 threads, Falcon quickly asserted as follows:

[Falcon] Error: assertion (oldestVisible->state != recLock) failed at line 143 in file RecordLeaf.cpp

#7  0x0000000000f2d77d in Error::assertionFailed (
    text=0x18b9198 "oldestVisible->state != recLock",
    fileName=0x18b9185 "RecordLeaf.cpp", line=143) at Error.cpp:78
#8  0x0000000000fa1e49 in RecordLeaf::pruneRecords (this=0x7f2297922c60,
    table=0x7f2297c5e338, base=1, recordScavenge=0x7f22954cbb30)
    at RecordLeaf.cpp:143
#9  0x0000000001099c3e in RecordGroup::pruneRecords (this=0x7f2297922910,
    table=0x7f2297c5e338, base=0, recordScavenge=0x7f22954cbb30)
    at RecordGroup.cpp:122
#10 0x0000000000e7682b in Table::pruneRecords (this=0x7f2297c5e338,
    recordScavenge=0x7f22954cbb30) at Table.cpp:1907
#11 0x0000000000eff937 in Database::pruneRecords (this=0x7f2297ad9bc8,
    recordScavenge=0x7f22954cbb30) at Database.cpp:1848
#12 0x0000000000effbed in Database::scavengeRecords (this=0x7f2297ad9bc8)
    at Database.cpp:1798
#13 0x0000000000f0022f in Database::scavenge (this=0x7f2297ad9bc8)
    at Database.cpp:1759
#14 0x0000000000f004cb in Database::scavengerThreadMain (this=0x7f2297ad9bc8)
    at Database.cpp:1923
#15 0x0000000000f005f9 in Database::scavengerThreadMain (
    database=0x7f2297ad9bc8) at Database.cpp:1912
#16 0x0000000000e9f4c7 in Thread::thread (this=0x7f22978e3410)
    at Thread.cpp:167
#17 0x0000000000e9f9d5 in Thread::thread (parameter=0x7f22978e3410)
    at Thread.cpp:146
#18 0x000000315b0073da in start_thread () from /lib64/libpthread.so.0
#19 0x000000315a4e627d in clone () from /lib64/libc.so.6

How to repeat:
$ perl runall.pl \
  --mysqld=--falcon-page-size=16K \
  --mem \
  --rows=1000 \
  --threads=4 \
  --mask=1487 \
  --queries=1000000 \
  --duration=300 \
  --basedir=/build/bzr/6.0-falcon-team \
  --engine=Falcon \
  --grammar=conf/combinations.yy \
  --gendata=conf/combinations.zz \
  --reporter=ErrorLog,Backtrace \
  --mysqld=--loose-falcon-lock-wait-timeout=1 \
  --mysqld=--log-output=none
[30 Jan 2009 5:46] Kevin Lewis
I reproduced this a couple timed today using RQG with --grammar=conf/repeatable_read.yy --gendata=conf/transactions.zz

The base record was committed with transaction=null, state=recDeleted and savepointId=11.
The prior record was the same transactionId and state=recLock but at savepointId=0.
Somehow, the savepointId 11 did not get released, so the lock records did not get scavenged.  I noticed that Transaction::syncSavepoints is only used for systemTransactions.  So I am experimenting with protecting all access to the Transaction::savepoints with this syncObject.
[12 Feb 2009 20:27] Kevin Lewis
After making the change to protect all access to Transaction::savepoints with   Transaction::syncSavepoints, I have not seen this error again.  I will check this in and watch to see if it occurs again.
[12 Feb 2009 20:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/66094

3016 Kevin Lewis	2009-02-12
      Bug#42340 - Error: assertion (oldestVisible->state != recLock) failed at line 143 in file
      RecordLeaf.cpp, RecordLeaf::pruneRecords().  The base record was committed with transaction=null, state=recDeleted and savepointId=11.
      The prior record was the same transactionId and state=recLock but at savepointId=0.
      Somehow, the savepointId 11 did not get released, so the lock records did not get
      scavenged. Some kind of race condition occurred to prevent the releaseSavePoint.
      Previously, Transaction::syncSavepoints was only used for systemTransactions.  This patch now uses that to protect all access to the Transaction::savepoints list.
[13 Feb 2009 13:13] Olav Sandstå
Fix looks very good. 

Only concern is that we are starting to fix more and more RQG bugs without having easy ways for adding test cases to verify and ensure the bug does not get re-introduced.
[14 Feb 2009 17:20] Kevin Lewis
After protecting all access to the savepoints list with Transaction::syncSavepoint, I once again managed to reproduce this assert.  So I'm still looking.
[25 Feb 2009 16:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/67591

3042 Kevin Lewis	2009-02-25
      Bug#42340 - RelordLeaf::pruneRecords() was hitting the assertion 
      (oldestVisible->state != recLock).   This was always a lock record at 
      savepoint 0 when the more recent record was recDeleted.  
      
      It is surprisingly OK for a committed record to have left over lock records
      in the prior record chain.  This happens when an updated or deleted record
      version is at a savepoint higher than the lock record. If a commit happens 
      instead of an explicit releae savepoint,the call to Transaction::releaseSavepoint() 
      is not made, which would have reduced the number of records on that same 
      transaction to 1.
      
      Instead, Transaction::commit() calls Transaction::releaseSavepoints().  This 
      function just clears out the savepoint objects and does not try to scavenge any 
      records from the prior record chain.  So they are left on there for the scavenger
      to clean up later.
      
      The new scavenger did not know this and assumed that a lock record left in the
      chain after a commit is wrong.  It is OK, so the ASSERT is herby deleted.
      
      In addition, RecordScavenger::inventoryRecord() is modfied to be smarter about
      where it indicates that pruning can start.  Since the record version newer than the 
      lock record is a real record, it is usually the one designated as the oldest visible.
      But if the timing is right, the lockRecord may get a temporary useCount by another
      thread.  If inventoryRecord looks at it just then, it would adjust the 'oldestVisible'
      designation to this lock record.  This is now changed to set oldestVisible to NULL.
      Then the next prior record will get that designation if it exists.
      
      Also, at the point where the oldestVisible is assigned, the ASSERT that it is not a
      lockRecord is changed to an 'if' since we now know that lockRecords can occur in the 
      prior record chain.
[2 Mar 2009 14:13] Bugs System
Pushed into 6.0.11-alpha (revid:alik@sun.com-20090302140208-lfdejjbcyezlhhjt) (version source revid:vvaintroub@mysql.com-20090214213452-djoygjmxbmbgp8hj) (merge vers: 6.0.10-alpha) (pib:6)
[2 Apr 2009 17:39] Bugs System
Pushed into 6.0.11-alpha (revid:hky@sun.com-20090402144811-yc5kp8g0rjnhz7vy) (version source revid:vvaintroub@mysql.com-20090225192413-izvigny7n3lr5rkz) (merge vers: 6.0.11-alpha) (pib:6)
[15 May 2009 15:17] MC Brown
Internal/test fix. No changelog entry required.