Bug #41357 | falcon.falcon_bug_34351_C fails with assertion IS_CONSISTENT_READ | ||
---|---|---|---|
Submitted: | 10 Dec 2008 14:26 | Modified: | 15 May 2009 16:07 |
Reporter: | Olav Sandstå | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S2 (Serious) |
Version: | 6.0.9-alpha | OS: | Any |
Assigned to: | Olav Sandstå | CPU Architecture: | Any |
Tags: | F_TRANSACTION |
[10 Dec 2008 14:26]
Olav Sandstå
[11 Dec 2008 10:38]
Olav Sandstå
Some information from the core file: (dbx) print *transaction *transaction = { connection = 0x1d4ed78 database = 0x1b13958 transactionId = 30847U oldestActive = 30846U blockedBy = 0 curSavePointId = 0 next = 0x4809238 prior = (nil) waitingFor = (nil) savePoints = (nil) freeSavePoints = 0x480a528 .... deferredIndexes = (nil) thread = 0x1d4d358 blockingRecord = (nil) backloggedRecords = (nil) startTime = 4 deferredIndexCount = 0 statesAllocated = 2 isolationLevel = 4 xidLength = 0 mySqlThreadId = 1 xid = (nil) states = 0x5b9eae8 commitTriggers = false systemTransaction = false hasUpdates = false writePending = true pendingPageWrites = false hasLocks = false ..... totalRecordData = 0 totalRecords = 0 chilledRecords = 0 chilledBytes = 0 thawedRecords = 0 thawedBytes = 0 debugThawedRecords = 0 debugThawedBytes = 0 committedRecords = 0 deletedRecords = 0 chillPoint = 0x480a878 scanIndexCount = 1 numberStates = 1 state = 0 dependencies = 0 useCount = 1 inList = 1 firstRecord = (nil) lastRecord = (nil) } This shows that the transaction->isolationLevel = 4 (TRANSACTION_WRITE_COMMITTED ) The record version used is: (dbx) print -r *(RecordVersion*)record *(class RecordVersion *) record = { RecordVersion::Record::data = { RecordVersion::Record::record = 0x3f24578 "^F" } RecordVersion::Record::useCount = 2 RecordVersion::Record::format = 0x1d4f648 RecordVersion::Record::recordNumber = 15388 RecordVersion::Record::size = 158 RecordVersion::Record::generation = 1ULL RecordVersion::Record::highWater = 1 RecordVersion::Record::encoding = '\004' RecordVersion::Record::state = '\0' RecordVersion::Record::active = '\001' RecordVersion::priorVersion = (nil) RecordVersion::virtualOffset = 16496258ULL RecordVersion::transaction = (nil) RecordVersion::nextInTrans = (nil) RecordVersion::prevInTrans = (nil) RecordVersion::transactionId = 30845U RecordVersion::savePointId = 0 RecordVersion::superceded = false } In this RecordVersion object the transaction pointer in NULL, which should indicate that its transaction is committed at this point. With the transction argument given to getRelativeState(...) being null and the isolation level being TRANSACTION_WRITE_COMMITTED, getRelativeState( ) both with the old and the new dependency manager should return CommittedVisible (which it according to both the assert and the core file does not).
[15 Jan 2009 20:08]
Olav Sandstå
Verified as described by running falcon_bug_34351_C repeatedly.
[20 Jan 2009 20:11]
Olav Sandstå
This assert happens due to the call to Transaction::getRelativeState() in Table::fetchForUpdate() returns CommittedInvisible for a recordVersion when the current transaction is running in isolation level TRANSACTION_WRITE_COMMITTED. This return value should only happen if the isolation level is TRANSACTION_CONSISTENT_READ. After having reproduced this assert several times with instrumentation in the source it seems like this problems is caused by that Transaction object that the recordVersion object has a pointer to and which is used for computing the relative state between the two transaction changes state during the execution of Transaction::getRelativeState(). The CommittedInvisible state seems to be reach by the following code path: 1. Transaction::getRelativeState() is called and the following code is run: if (transaction->state == Committed) { // Return CommittedVisible if the other trans has a lower TransId and // it was committed when we started. if (visible (transaction, transId, FOR_WRITING)) return CommittedVisible; return CommittedInvisible; } when this code is run the transaction->state == Committed (and the if test is true). This results in Transaction::visible() being called. 2. In the code for Transaction::visible() the following code is run: // If the other transaction is not yet committed, the trans is not visible. if (transaction->state != Committed) return false; When the same test now is evaluated "transaction->state != Committed" ths test too succeed. So based on traces in the code and studying the core file it seems like the transaction->state variable changes value from being Committed to not being Committed during the execution of these two methods. This leads to Transaction::visible() returning "false" which again leads to Transaction::getRelativeState() returning CommittedInvisible (and the assert occurs). So it seems like there is a transaction state change during the execution of Transaction::getRelativeState() and Transaction::visible() that leads to returning a wrong value. What makes this even more strange is that at least in one of the core files() the state of the transaction object is "back" to Committed inside the last if-test above (so the if test evaluates state to be != Committed but after having a assert inside the if block to trap this, the state in the core file is Committed). If this observation is correct the state of the transaction changes from Committed to !Committed and back to Committed. Seems there is need for reproducing this with even more debug tracing.
[13 Feb 2009 7:24]
Bugs System
Pushed into 6.0.10-alpha (revid:alik@sun.com-20090211182317-uagkyj01fk30p1f8) (version source revid:vvaintroub@mysql.com-20081210141141-rj6ts0a4hft4pfmo) (merge vers: 6.0.9-alpha) (pib:6)
[5 Mar 2009 9:21]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/68323 3050 Olav Sandstaa 2009-03-05 To reduce the probability of transaction inconsistencies due to RecordVersion's transaction pointer pointing to deleted Transaction objects (see for instance bug#41357) we temporarily disable deleting of old Transaction objects from the Transaction::commit(). After this all Transaction object will only be purged by the scavenger.
[23 Mar 2009 14:58]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/70075 3069 Olav Sandstaa 2009-03-23 Fix for Bug#41357 falcon.falcon_bug_34351_C fails with assertion IS_CONSISTENT_READ and potentially other related bugs. This assert was hit due to Transaction::getRelativeState() returning a wrong value. The cause was that the RecordVersion object's transaction pointer was refering to a deleted Transaction object. This patch fixes this problem by: 1. Introduces a new class called TransactionState. The following member variables in the Transaction class has been moved to this new class: transactionId; // used also as startEvent by dep.mgr. commitId; // used as commitEvent by dep.mgr. state; syncIsActive; waitingFor; // Used for deadlock detection Compared to the Transaction object which can be purged when its commit id is older than the transaction id of the oldest active transaction, the new TransactionState object will be available as long as there exits record versions refering to it. Reference counting is used for ensuring that transaction state object are not deleted while there are other object needing to acces itt. 2. Rewrites Transaction::getRelativeState(), Transaction::visible(), Transaction::needToLock() and Transaction::waitForTransaction() only use the transaction state object instead of accessing potentially deleted transaction objects. @ storage/falcon/Connection.cpp Access Transaction's state using member function instead of direct access to member variable @ storage/falcon/Database.cpp Access Transaction's state using member function instead of direct access to member variable @ storage/falcon/Makefile.am Added new files: TransactionState.h and .cpp @ storage/falcon/Record.cpp Added method for retrieving transaction state. @ storage/falcon/Record.h Added method for retrieving transaction state. @ storage/falcon/RecordVersion.cpp Added transaction state object. @ storage/falcon/RecordVersion.h Added transaction state object. @ storage/falcon/Table.cpp Use transaction state instead of transaction pointer when calling Transaction::getReletiveState @ storage/falcon/Transaction.cpp Moved transaction state from Transaction class to TransactionState class and converted main methods to use the TransactionState object instead of Transaction object when handling relationship to other transactions. @ storage/falcon/Transaction.h Moved transaction state from Transaction class to TransactionState class. @ storage/falcon/TransactionManager.cpp Introduce TransactionState in separate object. @ storage/falcon/TransactionState.cpp The transaction state for a Transaction. The members have been moved from the Transaction class. @ storage/falcon/TransactionState.h The transaction state for a Transaction. The members have been moved from the Transaction class.
[23 Mar 2009 18:00]
Kevin Lewis
Olav, I think you should go ahead and push this so that others can start working with it. You have already identified some improvements, and Jim needs to integrate it with his changes.
[24 Mar 2009 13:46]
Olav Sandstå
The patch is pushed into the mysql-6.0-falcon-team tree. I am setting this back to in progress given that there is more follow-up work to be done.
[24 Mar 2009 21:13]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/70276 3073 Olav Sandstaa 2009-03-24 Follow-up to patch for Bug#41357 falcon.falcon_bug_34351_C fails with assertion IS_CONSISTENT_READ In the initial fix there was a change to Transaction::needToLock(). This change assumed that the record pointer always referred to a RecordVersion object. This assumption is wrong. It is possible that the object is a Record object. This resulted in the ASSERT(transState != NULL) being hit when running falcon_backlog test. This patch changes the code in Transaction::needToLock() so that it works correctly both when the record parameter referes a RecordVersion object and a Record object. @ storage/falcon/Transaction.cpp Fix Transaction::needToLock() so that it handles the record argument points to a Record object as well as RecordVersion objects.
[27 Mar 2009 9:38]
Olav Sandstå
I have verified that the introduction of the new Transaction State object has solved this issue. Without these fixes I am able to reproduce the crash within about 200 runs of the falcon_bug_34351_C test. After applying the changes the same test has now run more than 5000 times without hitting this crash.
[28 Mar 2009 10:41]
Olav Sandstå
As suggested by Hakan, I have verified that the falcon_bug_34351_C-big test also runs without hitting this crash. I have repeated the test 2200 times without seeing any issues.
[2 Apr 2009 17:39]
Bugs System
Pushed into 6.0.11-alpha (revid:hky@sun.com-20090402144811-yc5kp8g0rjnhz7vy) (version source revid:jstarkey@nimbusdb.com-20090324150111-5yeayds2nppar7ce) (merge vers: 6.0.11-alpha) (pib:6)
[15 May 2009 16:07]
MC Brown
Internal/test fix. No changelog entry required.
[18 Jun 2009 22:24]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/76621 2737 Kevin Lewis 2009-06-18 This no longer fails. Fixed by Bug#41357. Moving the test from suite/falcon_team to suite/falcon