Bug #46377 falcon_chill_thaw_compare asserts in Record::setEncodedRecord()
Submitted: 24 Jul 2009 18:28 Modified: 26 May 2010 17:46
Reporter: Olav Sandstå Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:6.0.12-alpha OS:Any
Assigned to: Kevin Lewis CPU Architecture:Any

[24 Jul 2009 18:28] Olav Sandstå
Description:
The Random Query Generator test falcon_chill_thaw_compare fails due to Falcon crashing with the following error:

[Falcon] Error: assertion (size == getSize() + totalLength) failed at line 760 in file Record.cpp

This assert is in Record::setEncodedRecord():

	// This could be called during a thaw, in which case the size is already set.

	if (interlocked)
		{
		// Do not adjust the size.  The interlocked calls are only done during a thaw. 
		// When a record is chilled, the size variable is left the same.

		ASSERT(size == getSize() + totalLength);

The call stack from the crash:

#7  0x0856a628 in Error::debugBreak () at Error.cpp:94
#8  0x0856a6a4 in Error::error (
    string=0x89f30d0 "assertion (%s) failed at line %d in file %s\n")
    at Error.cpp:71
#9  0x0856a749 in Error::assertionFailed (
    text=0x89f3778 "size == getSize() + totalLength", 
    fileName=0x89f369c "Record.cpp", line=760) at Error.cpp:78
#10 0x0856f5fb in Record::setEncodedRecord (this=0xa5ba1b68, 
    stream=0xa72bf1e4, interlocked=true) at Record.cpp:760
#11 0x08600897 in RecordVersion::thaw (this=0xa5ba1b68)
    at RecordVersion.cpp:458
#12 0x086004d6 in RecordVersion::hasRecord (this=0xa5ba1b68)
    at RecordVersion.cpp:562
#13 0x085db269 in Index::duplicateKey (this=0xb70de308, key=0xa72c2118, 
    record=0xa5be8f08) at Index.cpp:748
#14 0x085dc83d in Index::update (this=0xb70de308, oldRecord=0xa5be8f08, 
    record=0xae7443a8, transaction=0xb7106058) at Index.cpp:674
#15 0x08591592 in Table::updateIndexes (this=0xb7421c58, 
    transaction=0xb7106058, record=0xae7443a8, oldRecord=0xa5be8f08)
    at Table.cpp:1291
#16 0x08591885 in Table::update (this=0xb7421c58, transaction=0xb7106058, 
    orgRecord=0xa5be8f08, stream=0xb7421800) at Table.cpp:3206
#17 0x08576f1f in StorageDatabase::updateRow (this=0xb70a2158, 
    storageConnection=0xb70d99f8, table=0xb7421c58, oldRecord=0xa5be8f08, 
    stream=0xb7421800) at StorageDatabase.cpp:735
#18 0x0857e1af in StorageTable::updateRow (this=0xb741c320, recordNumber=1005)
    at StorageTable.cpp:135
#19 0x0855ff58 in StorageInterface::update_row (this=0xb5f3bb0, 
    oldData=0xb5f41b0 "û", ' ' <repeats 199 times>..., 
    newData=0xb5f3f98 "û", ' ' <repeats 199 times>...) at ha_falcon.cpp:1241
#20 0x08407dbf in handler::ha_update_row (this=0xb5f3bb0, 
    old_data=0xb5f41b0 "û", ' ' <repeats 199 times>..., 
    new_data=0xb5f3f98 "û", ' ' <repeats 199 times>...) at handler.cc:5538
#21 0x08390464 in mysql_update (thd=0xb5d8e60, table_list=0xb5e30e0, 
    fields=@0xb5da244, values=@0xb5da478, conds=0xb5e38e0, order_num=0, 
    order=0x0, limit=18446744073709551614, handle_duplicates=DUP_ERROR, 
    ignore=false) at sql_update.cc:651
#22 0x082d655d in mysql_execute_command (thd=0xb5d8e60) at sql_parse.cc:3142
#23 0x082dc77f in mysql_parse (thd=0xb5d8e60, 
    inBuf=0xb5e2ec8 "UPDATE `table1000_falcon_int_autoinc` AS X SET `char_255_key` = -6.6881269216219e+18 WHERE X . `char_255_key` > -5.43246705051566e+17", 
    length=133, found_semicolon=0xa72c5e80) at sql_parse.cc:5979
#24 0x082dd901 in dispatch_command (command=COM_QUERY, thd=0xb5d8e60, 
    packet=0xb5dae99 "UPDATE `table1000_falcon_int_autoinc` AS X SET `char_255_key` = -6.6881269216219e+18 WHERE X . `char_255_key` > -5.43246705051566e+17", 
    packet_length=133) at sql_parse.cc:1064

How to repeat:
Run the RQG test falcon_chill_thaw_compare using the latest version from mysql-6.0-falcon-team tree.
[24 Jul 2009 18:29] Olav Sandstå
Setting to verified based on seeing it happen in both of the latest two pushes on both Linux and Windows in Pushbuild 2.
[24 Jul 2009 18:52] Kevin Lewis
I added this assert, so I will try to figure out why it could possibly fail.
[27 Jul 2009 12:01] Olav Sandstå
This crash was first seen (at least in Pushbuild) after the following push:

  http://lists.mysql.com/commits/78794
[19 Aug 2009 16:38] Kevin Lewis
I reproduced this today while debugging and running RQG combinations with record-memory-max=50Mb and record-chill-threshold=10Kb.  The record is being thawed in setEncodedRecord because a call is made to Index::duplicateKey.  So Falcon is comparing an updated record version with all previous versions to see if there is a conflict.  

The current base record is committed on transaction 1713 with savepoint 2 but the prior record is also transaction 1713, but with savepoint 0.  Since the transaction was committed and completed, the virtualOffset into the serial log was set to 0, so in order to thaw, the record must be fetched from the page.  But only the most recent version is in the page.  This is a record version that thawAll did not thaw at complete time.  It is a useless record anyway and should not be thawed, it should have been queuedForDelete by releaseSavepoint.

So I need to either 1) prevent duplicateKey from looking at these, 2) make sure that these extra useless committed records are thawed, or 3) make sure these extra useless records are not on the prior record chain after the transaction is committed.
[20 Aug 2009 14:52] Kevin Lewis
The easiest and most correct solution from above is 2 (Make sure thawAll works for these useless records).  Solution 1 (don't look at these data buffer) cannot be done because it is indeed necessary to have these data buffers around.  That is because indexes are not cleaned up until the record is scavenged. In order to scavenge the indexes, the data buffer must be known.  

Another reason is that indexes only contain one copy of any key value that points to this record number.  So if multiple record versions contain the same key value, that key value is not deleted during scavenging until the last record version with that value is scavenged.  In addition, new record version during an update must check all the previous record versions to see if it has a new key value for that record number.

Transaction::thawAll() runs by the gopher when the transaction completes in order to prevent unthawable records.  But it erroneously did not try to thaw records that were older than the oldest active record.  It needs to do this so that their key values can be scavenged correctly.  In addition, RecordVersion::thaw() does not try to thaw from the serial log if the transaction is no longer writePending.  Transaction::writeComplete is called immediately after thawAll() so normally, this would not matter if thawAll would get them all. But if there were some memory problems, thawAll may give up trying to thaw a a completing record.

So the best approach would be to thaw from the serial log whenever there is a virtualOffset recorded with the record.  SRLUpdateRecords::thaw() will set it to zero if that offset cannot be found, so if it is available, try to use it.  The serial log can contain multiple record versions, each with a different virtualOffset, but the data page can only contain one.
[20 Aug 2009 15:00] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/81183

2767 Kevin Lewis	2009-08-20
      Bug#46377 - Record::setEncodedRecord() asserts if a newly thawed record is not the same length as what was chilled.  This was happening because Transaction::thawAll was not thawing ALL old records a complete-time.  It was skipping records that were older than the oldest active, which could easily happen if the gophers get a little behind.  This patch changes Transaction::thawAll to thaw all older record versions.  In addition, RecordVersion::thaw() is changed to thaw from the serial log any time the virtualOffset still exists.  If that portion of the serial log is no longer available, the virstualOffset is reset to zero in SRLUpdateRecords::thaw()