MySQL Bugs: #43961: Falcon assertion (false) in Table::rollbackRecord with no backlogging

Bug #43961	Falcon assertion (false) in Table::rollbackRecord with no backlogging
Submitted:	30 Mar 2009 14:45	Modified:	26 May 2010 17:47
Reporter:	Philip Stoev	Email Updates:
Status:	Unsupported	Impact on me:	None
Category:	MySQL Server: Falcon storage engine	Severity:	S1 (Critical)
Version:	6.0-falcon-team	OS:	Any
Assigned to:	Kevin Lewis	CPU Architecture:	Any
Tags:	F_RECORD TREE

Description:
The falcon_backlog test in PB2 failed with the stack trace below. This is identical to bug #43504, however currently backlogging is disabled. Therefore the cause must be something else and Kevin requested that this bug is opened.

#9  0x0855e305 in Error::assertionFailed (text=0x89ebd86 "false", 
    fileName=0x89ebc6d "Table.cpp", line=1083) at Error.cpp:78
#10 0x0858007f in Table::rollbackRecord (this=0xb7360080, 
    recordToRollback=0xa5d8e708, transaction=0xb711d290) at Table.cpp:1083
#11 0x085fa7d1 in RecordVersion::rollback (this=0xa5d8e708, 
    transaction=0xb711d290) at RecordVersion.cpp:204
#12 0x08589f27 in Transaction::rollbackSavepoint (this=0xb711d290, 
    savePointId=39) at Transaction.cpp:1346
#13 0x08567884 in StorageDatabase::savepointRollback (this=0xb7020158, 
    connection=0xb70baa50, savePoint=39) at StorageDatabase.cpp:556
#14 0x08564e5d in StorageConnection::savepointRollback (this=0xb70ba9e0, 
    savePoint=39) at StorageConnection.cpp:232
#15 0x08564cc9 in StorageConnection::rollbackVerb (this=0xb70ba9e0)
    at StorageConnection.cpp:373
#16 0x08556287 in StorageInterface::rollback (hton=0xa4592f8, thd=0xabcd818, 
    all=false) at ha_falcon.cpp:1331
#17 0x084091c6 in ha_rollback_trans (thd=0xabcd818, all=false)
    at handler.cc:1298
#18 0x084d55cd in trans_rollback_stmt (thd=0xabcd818) at transaction.cc:263
#19 0x082dbccb in dispatch_command (command=COM_QUERY, thd=0xabcd818, 
    packet=0xabd32d1 "UPDATE `table65536_falcon_int_autoinc` AS X SET `char_255_utf8` = 'qbqbbrbrvtvtksksururmomosxsxadadvhvhceceewewpgpgvdvdewewsysymhmhwvwvqeqenvnvxuxuajajwswsdududedeskskepeprsrsacaciyiypcpcflflititeoeoo"..., 
    packet_length=343) at sql_parse.cc:1431
#20 0x082dc242 in do_command (thd=0xabcd818) at sql_parse.cc:739
#21 0x082c93bb in handle_one_connection (arg=0xabcd818) at sql_connect.cc:1146
#22 0x0089a45b in start_thread () from /lib/libpthread.so.0
#23 0x007f1c4e in clone () from /lib/libc.so.6

How to repeat:
See below.

Log:

http://loki41.norway.sun.com/~bteam/pb2/web_loki41.py?action=archive_download&archive_id=4...

Push, core, source, binary:

http://loki41.norway.sun.com/~bteam/pb2/web_loki41.py?template=push_details&push=306821&bu...

I got this crash with identical call stack when running the falcon_backlog test on the latest source from falcon team.

The last lines from mysqld's log file contains:

727: Record thaw (fetch): table 25, 1 records, 2431 bytes
728: Record thaw (fetch): table 25, 1 records, 2206 bytes
728: Record thaw (fetch): table 25, 1 records, 2496 bytes
728: Record thaw (fetch): table 25, 1 records, 1726 bytes
Table::rollbackRecord failed:
  0x2aaaab0b0248        Id 15676, enc 4, state 6, tid 797, use 2, grp 4, prior 0
x2aaaab29adc8
  0x2aaaab29adc8        Id 15676, enc 4, state 0, tid 68, use 4, grp 4, prior (n
il)
[Falcon] Error: assertion (false) failed at line 1083 in file Table.cpp

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

http://lists.mysql.com/commits/71557

3105 Kevin Lewis 2009-04-07

Bug#43961 & Bug#41661 - Table::insertIntoTree could fail during a Table::insert. This should not happen since Dbb::insertStub should have returned a record number that was free. Nobody else should have been messing with that slot in the RecordLeaf.

The problem was in RecordLeaf::retireRecords. This function called
RecordVersion::retire which then called expungeRecord. This makes the record number free to use again by any other thread. It did this with an exclusive lock on the RecordLeaf, but Table::insertIntoTree often does not get any lock on RecordLeaf when it calls RecordLeaf::store. So a free record number is
reused in Table::insert before the slot has been set to NULL by RecordLeaf::retireRecords. That is why Table::insertIntoTree fails. It expects the slot to be NULL.

Another problem fixed is that the first of the two Table::insert functions did not check the return value of insertIntoTree at all. This would have propagated the error further since there would be nothing in the RecordTree slot after the insert.

Another problem fixed in RecordLeaf::retireRecords is with this statement;
if (record && recordScavenge->canBeRetired(record))
The problem is with incrementing the 'count' integer. If it is zero after stepping through the RecordLeaf, then it is assumed that the RecordLeaf is empty and can be retired. But this 'if' statement gets the wrong result. If the RecordLeaf is full of newer records that cannot yet be retired, then this count will remain zero. The emptySections bitmap will be set for this RecordLeaf. And later, in Table::retireRecords, it will attempt to
retire these sections. But finding records in them, it will just waste a lot of time while blocking all table access.

I changed this code to be accurate in determining if the RecordLeaf is indeed empty before setting the bit in Table::emptySections.

Record::retire and RecordVersion::retire now only retire records. The more thorough tests for whether a record can be retired is collected into
RecordScavenge::canBeRetired.

Also, instead of just setting the slot in RecordLeaf to NULL directly, I now use a COMPARE_EXCHANGE_POINTER() just like every other place where this location is changed. Then if it fails for any reason, we will not retire that record. One legitimate reason that this could fail is when this record is determined that it canBeRetired, but immediately after that, a client thread does a fetchForUpdate, adding a lockRecord.

Here is how this error could have happened with the code before this patch;

* Scavenger thread;  Calls RecordVersion::retire which calls expungeRecord, which releases the record number to be used again.

* Client thread 1;  Calls Table::insert( Transaction *, int, Field **, Value **), which calls insertIntotree.  This fails since the slot is not currently NULL as expected.  But this code did not check the return code so it assumes that insertIntoTree succeeded and goes on with the insert.

* Scavenger thread;  sets RecordLeaf::records[slot] to NULL;

* Client thread 2; calls StorageInterface::rnd_next -> StorageDatabase::nextRow -> Table::fetch.  The record should exist according to recordBitmap, but nothing is in the record tree.  So this record number is read from the page.  This is probably an old version of that record number, a recently deleted record that the gopher threads have not yet removed from the page.  (This misstep is not even necessary, but it could happen, which can lead to other problems) 

* Client Thread 1; Something else went wrong with the insert.  Maybe a duplicate conflict.  An exception is thrown and caught in Table::insert.  Another call to insertIntoTree is made to put NULL in the record tree slot, but the wrong record is there so it fails.  But Table::insert does not check the return value here either! (This may lead to the following rollback, or even something else could have caused it.  If this insert does not rollback, other issues could arise.)

* Client Thread 1; Rollback;  calls StorageInterface::rollback -> StorageConnection::rollbackVerb -> StorageConnection::savepointRollback -> StorageDatabase::savepointRollback -> Transaction::rollbackSavepoint -> RecordVersion::rollback().   The transaction has a pointer to the inserted record that really did not get into the record tree.  Table::insertIntoTree fails for this record because it is not there. No way to recover so it asserts as in this bug.

This still happens, however the code has been changed

        if (!insertIntoTree(priorRecord, recordToRollback, recordToRollback->recordNumber))
                {
                if (priorRecord == NULL && priorState == recDeleted)
                        return;

                // The store of this record into the record leaf failed. No way to recover.
                // While the base record is uncommitted, only that trensaction can change it.

                recordToRollback->printRecord("Table::rollbackRecord failed");
                bool rollbackRecord_DamagedRecordTree = false;
                ASSERT(rollbackRecord_DamagedRecordTree);
                }

and therefore the assertion is different.

bool rollbackRecord_DamagedRecordTree = false;
   ASSERT(rollbackRecord_DamagedRecordTree);

seems a little bit constructed. I think 

FATAL("damaged record tree");

could be used, does not require an extra bool variable and allows any text, not only C++

I was able to reproduce this in the debugger and am looking at it. 
At the top of Table::rollbackRecord) there are these lines;
	int priorState = recordToRollback->state;
	recordToRollback->state = recRollback;
The curious thing is that priorState == recRollback.  This means that recordToRollback has already been rolled back once before.  So this record must somehow have been put on the transaction list twice.

On 27 Apr, 2009, 14:12, Hakan Kuecuekyilmaz reported duplicate bug 44491.
Description:
Running DBT2 on caneland with 20 warehouses and 192 hit
    Signal 6 with Table::rollbackRecord-insertIntoTree failed, priorState =

The priorState did not print out due to an error in this new FATAL string.

With the last fix that includes printing of the priorState, this crash produced the following output:

Table::rollbackRecord failed:
  0xa5d94628	Id 947, enc 4, state 6, tid 66, use 1, grp 3, prior (nil)
[Falcon] Error: Table::rollbackRecord-insertIntoTree failed, priorState =12

The constant 12 is representing the following record state:

static const int recQueuedForDelete = 12;		// end of chain for garbage collection

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

http://lists.mysql.com/commits/75562

2717 Kevin Lewis 2009-06-03
Bug#43961 & Bug#44223 Both of these bugs are caused by a record that gets queued for delete while still chained up to the transaction. Then during Transaction::rolback (44223) a record to rollback is queued for delete a second time.
Or during Transaction::rollbackSavepoint (43961), the code assumes that this record on the transaction and savepoint is still the base record since it has Record::superceded=false, and tries to call insertIntoTree to rerplace it with its prior. But this record is not the base record either.

This can happen in Table::update(Transaction * transaction, Record *orgRecord, Stream *stream);
If this update replaces a lock record with an updated record and an exception occurred, it may or may not take the record off the transaction and the tree, but it would always queue the lock record for delete. This code has gone through some iterations lately. But in the most recent code, I was able to reporduce 44223.

This patch cleans it up even more by distinguishing whether a record is actually inserted into the tree and added to the transaction. It undoes those actions only if they were done. Then if the update started with a lock record, that recordversion will remain on the tree with superceded=false and it will not be queued for delete. It correctly uses the 'wasLock' variable instead of the 'updated' or 'wasUpdated' flag.