Bug #41357 falcon.falcon_bug_34351_C fails with assertion IS_CONSISTENT_READ
Submitted: 10 Dec 2008 14:26 Modified: 15 May 2009 16:07
Reporter: Olav Sandstå Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:6.0.9-alpha OS:Any
Assigned to: Olav Sandstå CPU Architecture:Any
Tags: F_TRANSACTION
Triage: Triaged: D1 (Critical)

[10 Dec 2008 14:26] Olav Sandstå
Description:
When running falcon_bug_34351_C it sometimes fails with the following assertion:

[Falcon] Error: assertion (IS_CONSISTENT_READ(transaction->isolationLevel)) failed at line 3532 in file Table.cpp

The call stack when this occurs is:

 [13] Error::assertionFailed(text = 0x157bb94 "IS_CONSISTENT_READ(transaction->isolationLevel)", fileName = 0x157bbc4 "Table.cpp", line = 3532), line 78 in "Error.cpp"
=>[14] Table::fetchForUpdate(this = 0x1cd56a8, transaction = 0x480a438, source = 0x3f244d8, usingIndex = true), line 3532 in "Table.cpp"
  [15] StorageDatabase::nextIndexed(this = 0x1d12128, storageTable = 0x4782698, recordBitmap = 0x1d14880, recordNumber = 15389, lockForUpdate = true), line 428 in "StorageDatabase.cpp"
  [16] StorageTable::nextIndexed(this = 0x4782698, recordNumber = 0, lockForUpdate = true), line 171 in "StorageTable.cpp"
  [17] StorageInterface::index_next(this = 0x3ce85c8, buf = 0x3ce8888 "ÿ"), line 1687 in "ha_falcon.cpp"
  [18] StorageInterface::read_range_first(this = 0x3ce85c8, start_key = 0x3ce8698, end_key = 0x3ce86b8, eq_range_arg = true, sorted = false), line 1655 in "ha_falcon.cpp"
  [19] handler::multi_range_read_next(this = 0x3ce85c8, range_info = 0xfffffd7ffd7226e0), line 4258 in "handler.cc"
  [20] QUICK_RANGE_SELECT::get_next(this = 0x1d1eb80), line 8596 in "opt_range.cc"
  [21] rr_quick(info = 0xfffffd7ffd722d48), line 322 in "records.cc"
  [22] mysql_delete(thd = 0x3c970e0, table_list = 0x3cd95a8, conds = 0x3cd9d48, order = 0x3cd8c40, limit = 18446744073709551615U, options = 0, reset_auto_increment = false), line 284 in "sql_delete.cc"
  [23] mysql_execute_command(thd = 0x3c970e0), line 3237 in "sql_parse.cc"
  [24] sp_instr_stmt::exec_core(this = 0x3cd9eb8, thd = 0x3c970e0, nextp = 0xfffffd7ffd725668), line 2910 in "sp_head.cc"
  [25] sp_lex_keeper::reset_lex_and_exec_core(this = 0x3cd9ef8, thd = 0x3c970e0, nextp = 0xfffffd7ffd725668, open_tables = false, instr = 0x3cd9eb8), line 2734 in "sp_head.cc"
  [26] sp_instr_stmt::execute(this = 0x3cd9eb8, thd = 0x3c970e0, nextp = 0xfffffd7ffd725668), line 2847 in "sp_head.cc"
  [27] sp_head::execute(this = 0x3ccc598, thd = 0x3c970e0), line 1241 in "sp_head.cc"
  [28] sp_head::execute_procedure(this = 0x3ccc598, thd = 0x3c970e0, args = 0x3c99540), line 1972 in "sp_head.cc"
  [29] mysql_execute_command(thd = 0x3c970e0), line 4234 in "sql_parse.cc"
  [30] mysql_parse(thd = 0x3c970e0, inBuf = 0x3cafb68 "CALL p1()", length = 9U, found_semicolon = 0xfffffd7ffd728be8), line 5737 in "sql_parse.cc"
  [31] dispatch_command(command = COM_QUERY, thd = 0x3c970e0, packet = 0x3ca0131 "CALL p1()", packet_length = 9U), line 1006 in "sql_parse.cc"
  [32] do_command(thd = 0x3c970e0), line 689 in "sql_parse.cc"
  [33] handle_one_connection(arg = 0x3c970e0), line 1156 in "sql_connect.cc"
  [34] _thr_setup(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff10fd7b 
  [35] _lwp_start(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff10ffb0 

How to repeat:
Run falcon_bug_34351_C repeatedly. I was able to reproduced it after having run it 4059 times.

Suggested fix:
Falcon should not crash
[11 Dec 2008 10:38] Olav Sandstå
Some information from the core file:

(dbx) print *transaction               
*transaction = {
    connection          = 0x1d4ed78
    database            = 0x1b13958
    transactionId       = 30847U
    oldestActive        = 30846U
    blockedBy           = 0
    curSavePointId      = 0
    next                = 0x4809238
    prior               = (nil)
    waitingFor          = (nil)
    savePoints          = (nil)
    freeSavePoints      = 0x480a528
    ....
    deferredIndexes     = (nil)
    thread              = 0x1d4d358
    blockingRecord      = (nil)
    backloggedRecords   = (nil)
    startTime           = 4
    deferredIndexCount  = 0
    statesAllocated     = 2
    isolationLevel      = 4
    xidLength           = 0
    mySqlThreadId       = 1
    xid                 = (nil)
    states              = 0x5b9eae8
    commitTriggers      = false
    systemTransaction   = false
    hasUpdates          = false
    writePending        = true
    pendingPageWrites   = false
    hasLocks            = false
    .....
        totalRecordData     = 0
    totalRecords        = 0
    chilledRecords      = 0
    chilledBytes        = 0
    thawedRecords       = 0
    thawedBytes         = 0
    debugThawedRecords  = 0
    debugThawedBytes    = 0
    committedRecords    = 0
    deletedRecords      = 0
    chillPoint          = 0x480a878
    scanIndexCount      = 1
    numberStates        = 1
    state               = 0
    dependencies        = 0
    useCount            = 1
    inList              = 1
    firstRecord         = (nil)
    lastRecord          = (nil)
}

This shows that the transaction->isolationLevel = 4 (TRANSACTION_WRITE_COMMITTED )

The record version used is:

(dbx) print -r *(RecordVersion*)record 
*(class RecordVersion *) record = {
    RecordVersion::Record::data         = {
        RecordVersion::Record::record = 0x3f24578 "^F"
    }
    RecordVersion::Record::useCount     = 2
    RecordVersion::Record::format       = 0x1d4f648
    RecordVersion::Record::recordNumber = 15388
    RecordVersion::Record::size         = 158
    RecordVersion::Record::generation   = 1ULL
    RecordVersion::Record::highWater    = 1
    RecordVersion::Record::encoding     = '\004'
    RecordVersion::Record::state        = '\0'
    RecordVersion::Record::active       = '\001'
    RecordVersion::priorVersion  = (nil)
    RecordVersion::virtualOffset = 16496258ULL
    RecordVersion::transaction   = (nil)
    RecordVersion::nextInTrans   = (nil)
    RecordVersion::prevInTrans   = (nil)
    RecordVersion::transactionId = 30845U
    RecordVersion::savePointId   = 0
    RecordVersion::superceded    = false
}

In this RecordVersion object the transaction pointer in NULL, which should
indicate that its transaction is committed at this point. 

With the transction argument given to getRelativeState(...) being null and the isolation level being TRANSACTION_WRITE_COMMITTED, getRelativeState( ) both with the old and the new dependency manager should return CommittedVisible (which it according to both the assert and the core file does not).
[15 Jan 2009 20:08] Olav Sandstå
Verified as described by running falcon_bug_34351_C repeatedly.
[20 Jan 2009 20:11] Olav Sandstå
This assert happens due to the call to Transaction::getRelativeState() in Table::fetchForUpdate() returns CommittedInvisible for a recordVersion when the current transaction is running in isolation level TRANSACTION_WRITE_COMMITTED. This return value should only happen if the isolation level is TRANSACTION_CONSISTENT_READ.

After having reproduced this assert several times with instrumentation in the source it seems like this problems is caused by that Transaction object that the recordVersion object has a pointer to and which is used for computing the relative state between the two transaction changes state during the execution of Transaction::getRelativeState().

The CommittedInvisible state seems to be reach by the following code path:

1. Transaction::getRelativeState() is called and the following code is run:

	if (transaction->state == Committed)
		{
		// Return CommittedVisible if the other trans has a lower TransId and 
		// it was committed when we started.
		
		if (visible (transaction, transId, FOR_WRITING))
			return CommittedVisible;

		return CommittedInvisible;
		}

  when this code is run the transaction->state == Committed (and the if test is true). This results in Transaction::visible() being called.

2. In the code for Transaction::visible() the following code is run:

   	// If the other transaction is not yet committed, the trans is not visible.

	if (transaction->state != Committed)
		return false;

    When the same test now is evaluated "transaction->state != Committed" ths test too succeed.

    So based on traces in the code and studying the core file it seems like the transaction->state variable changes value from being Committed to not being Committed during the execution of these two methods.

    This leads to Transaction::visible() returning "false" which again leads to Transaction::getRelativeState() returning CommittedInvisible (and the assert occurs).

So it seems like there is a transaction state change during the execution of Transaction::getRelativeState() and Transaction::visible() that leads to returning a wrong value.

What makes this even more strange is that at least in one of the core files() the state of the transaction object is "back" to Committed inside the last if-test above (so the if test evaluates state to be != Committed but after having a assert inside the if block to trap this, the state in the core file is Committed). If this observation is correct the state of the transaction changes from Committed to !Committed and back to Committed. Seems there is need for reproducing this with even more debug tracing.
[13 Feb 2009 7:24] Bugs System
Pushed into 6.0.10-alpha (revid:alik@sun.com-20090211182317-uagkyj01fk30p1f8) (version source revid:vvaintroub@mysql.com-20081210141141-rj6ts0a4hft4pfmo) (merge vers: 6.0.9-alpha) (pib:6)
[5 Mar 2009 9:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/68323

3050 Olav Sandstaa	2009-03-05
      To reduce the probability of transaction inconsistencies due to RecordVersion's
      transaction pointer pointing to deleted Transaction objects (see for instance bug#41357)
      we temporarily disable deleting of old Transaction objects from the Transaction::commit().
      
      After this all Transaction object will only be purged by the scavenger.
[23 Mar 2009 14:58] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/70075

3069 Olav Sandstaa	2009-03-23
      Fix for Bug#41357 falcon.falcon_bug_34351_C fails with assertion IS_CONSISTENT_READ
      and potentially other related bugs.
      
      This assert was hit due to Transaction::getRelativeState() returning a
      wrong value. The cause was that the RecordVersion object's transaction
      pointer was refering to a deleted Transaction object.
      
      This patch fixes this problem by:
      
      1. Introduces a new class called TransactionState. The following
         member variables in the Transaction class has been moved to this
         new class:
       
          transactionId;       // used also as startEvent by dep.mgr.
      	commitId;            // used as commitEvent by dep.mgr.
      	state;
      	syncIsActive;
      	waitingFor;          // Used for deadlock detection
      
         Compared to the Transaction object which can be purged when its commit
         id is older than the transaction id of the oldest active
         transaction, the new TransactionState object will be available as
         long as there exits record versions refering to it. Reference
         counting is used for ensuring that transaction state object are not
         deleted while there are other object needing to acces itt.
      
      2. Rewrites Transaction::getRelativeState(), Transaction::visible(),
         Transaction::needToLock() and Transaction::waitForTransaction()
         only use the transaction state object instead of accessing
         potentially deleted transaction objects.
     @ storage/falcon/Connection.cpp
        Access Transaction's state using member function instead of direct access to member variable
     @ storage/falcon/Database.cpp
        Access Transaction's state using member function instead of direct access to member variable
     @ storage/falcon/Makefile.am
        Added new files: TransactionState.h and .cpp
     @ storage/falcon/Record.cpp
        Added method for retrieving transaction state.
     @ storage/falcon/Record.h
        Added method for retrieving transaction state.
     @ storage/falcon/RecordVersion.cpp
        Added transaction state object.
     @ storage/falcon/RecordVersion.h
        Added transaction state object.
     @ storage/falcon/Table.cpp
        Use transaction state instead of transaction pointer when calling Transaction::getReletiveState
     @ storage/falcon/Transaction.cpp
        Moved transaction state from Transaction class to TransactionState class and converted main methods to use
        the TransactionState object instead of Transaction object when handling relationship to other transactions.
     @ storage/falcon/Transaction.h
        Moved transaction state from Transaction class to TransactionState class.
     @ storage/falcon/TransactionManager.cpp
        Introduce TransactionState in separate object.
     @ storage/falcon/TransactionState.cpp
        The transaction state for a Transaction. The members have been moved from the Transaction class.
     @ storage/falcon/TransactionState.h
        The transaction state for a Transaction. The members have been moved from the Transaction class.
[23 Mar 2009 18:00] Kevin Lewis
Olav,  I think you should go ahead and push this so that others can start working with it.  You have already identified some improvements, and Jim needs to integrate it with his changes.
[24 Mar 2009 13:46] Olav Sandstå
The patch is pushed into the mysql-6.0-falcon-team tree. I am setting this back
to in progress given that there is more follow-up work to be done.
[24 Mar 2009 21:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/70276

3073 Olav Sandstaa	2009-03-24
      Follow-up to patch for Bug#41357 falcon.falcon_bug_34351_C fails with assertion IS_CONSISTENT_READ
      
      In the initial fix there was a change to Transaction::needToLock(). This 
      change assumed that the record pointer always referred to a RecordVersion
      object. This assumption is wrong. It is possible that the object is a 
      Record object.
      
      This resulted in the ASSERT(transState != NULL) being hit when running
      falcon_backlog test. 
      
      This patch changes the code in Transaction::needToLock() so that it
      works correctly both when the record parameter referes a RecordVersion
      object and a Record object.
     @ storage/falcon/Transaction.cpp
        Fix Transaction::needToLock() so that it handles the record argument points to a Record object
        as well as RecordVersion objects.
[27 Mar 2009 9:38] Olav Sandstå
I have verified that the introduction of the new Transaction State object has solved this issue.

Without these fixes I am able to reproduce the crash within about 200 runs of the falcon_bug_34351_C test. After applying the changes the same test has now run more than 5000 times without hitting this crash.
[28 Mar 2009 10:41] Olav Sandstå
As suggested by Hakan, I have verified that the falcon_bug_34351_C-big test also runs without hitting this crash. I have repeated the test 2200 times without seeing any issues.
[2 Apr 2009 17:39] Bugs System
Pushed into 6.0.11-alpha (revid:hky@sun.com-20090402144811-yc5kp8g0rjnhz7vy) (version source revid:jstarkey@nimbusdb.com-20090324150111-5yeayds2nppar7ce) (merge vers: 6.0.11-alpha) (pib:6)
[15 May 2009 16:07] MC Brown
Internal/test fix. No changelog entry required.
[18 Jun 2009 22:24] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76621

2737 Kevin Lewis	2009-06-18
      This no longer fails.  Fixed by Bug#41357.  Moving the test from suite/falcon_team to suite/falcon