Bug #44946 | Falcon asserts on vector having wrong size in Record::getEncodedValue() | ||
---|---|---|---|
Submitted: | 19 May 2009 6:31 | Modified: | 26 May 2010 17:47 |
Reporter: | Olav Sandstå | Email Updates: | |
Status: | Unsupported | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S2 (Serious) |
Version: | 6.0.12-alpha | OS: | Linux |
Assigned to: | Kevin Lewis | CPU Architecture: | Any |
Tags: | F_RECORD TREE |
[19 May 2009 6:31]
Olav Sandstå
[19 May 2009 6:39]
Olav Sandstå
See also Bug#42131. The call stack is the same for both these bugs but it is believed to be a different issue since bug#42131 has been fixed and the code where the new crash occurs is different from earlier.
[19 May 2009 6:43]
Olav Sandstå
Kevin sent the following excellent description by email: ======================================================== The assert says that either Record::size is too small or the vector array is messed up, causing vector[index] to be too big. The previous fix was focused on making sure the record itself was thawed before this assert. But the current code guarantees that! It now thaws the record buffer into data.record, but sends the pointer to that data.record buffer back to Record::getRawValue() which sends that pointer into Record::getEncodedValue(). It does not matter if the transaction immediately chilled the record again. The vector in the assert is not using data.record. And the CycleManager keeps the buffer around. So the record buffer must be good. Maybe the vector of offsets is being filled in by multiple reading threads at the same time. After a thaw, which is serialized by syncThaw, it seems like both threads could call Record::getEncodedValue() at the same time. If two threads executed the following loop at the same time, some vector offsets may get skipped. void Record::getEncodedValue(char* recordData, int fieldId, Value *value) { . . . while (highWater < index) { const UCHAR *q = EncodedDataStream::skip(p); vector[++highWater] = (USHORT) (q - (UCHAR*) recordData); p = q; }
[20 May 2009 3:59]
Kevin Lewis
I implemented a change that allows multiple threads to update the field offset vector simultaneously, each using a copy of the highWater value. Then they use CAS to update the highWater index count if it is higher than what is currently there. That caused RecordVersion::highWater to be expanded from a short to an INTERLOCK_TYPE. The reason this can work is that all threads updating the vector are using the same recordData, and so the vector offsets are all the same. So multiple threads MAY be trying to write the same 2-byte offsets to the same cache lines. The question is whether this happens enough that cache line bouncing is more of a performance hit that the cost of serializing this code through a critical section or a syncObject protecting the vector. We still do not want to use a unique SyncObject per record, but it is possible to use an exclusive lock on one of the syncThaw objects for the table.
[20 May 2009 4:23]
Kevin Lewis
After implementing the previously mentioned code change for using CAS to update Record::highWater, I hit a crash at the same assert as this bug. I was using RQG combinations with a very low chill threshold. The problem that I found was that the chilled record is not thawable. Since it cannot be thawed, any garbage collection will fail. Here is the record stack. Each is a version of record number 26. POSITION TRANSID COMMITID SAVEPOINT SUPERCEEDED VIRTUAL_OFFSET THAWED base 964 967 0 false 17922085 deleted 1 952 962 0 false 16876114 yes 2 921 924 14 false 0 yes 3 921 924 0 true 12582986 no 4 901 918 0 false 0 yes 5 832 855 0 false 0 yes 6 800 807 2 false 6088067 yes The call stack indicated that record version 3,4,5 & 6 are being pruned by the scavenger. It has determined that 2 is the oldest visible version and correctly started pruning record 3. Notice that 2 and 3 are both on the same transaction. The scavenger disconnects the prior link between 2 and 3. Index::garbageCollect is called with the 'leaving' record as 3 and the 'staying' record is the base record. The problem is that record version 3 CANNOT be thawed. The transaction is fully completed. The Transaction object associated with 921 is no longer around, only the TransactionState. So Transaction::thaw() cannot be called. RecordVersion::thaw() normally reads the active record from the page in this case. But that would be a mistake if it succeeded, since record version 3 would now have the current record attached to it. Index::garbageCollect() would leave orphan key nodes in the index. But in this case, the base record is a deleted record so nothing can be thawed by going to the page. Since this leaving record is not thawed, garbageCollect fails in Index::makeKey-> Record::getValue-> Record::getRawValue-> Record::getEncodedValue() at the same assert as this bug. It is trying to access a recordData buffer == (-1). There are two ways to fix this problem. Either we (a) make sure there is only one version of each record number associated with every committed transaction, or (b) make sure every old record version is thawed before the transaction is completed. a) Transaction::commit calls Transaction::releaseSavepoints which does not call Transaction::releaseSavepoint for each savepoint. Instead of collapsing the record version chain down to one record version per transaction, it leaves them all linked up at the savepoint they were created at, and just gets rid of the savepoint objects. This can be changed to just call Transaction::releaseSavepoint() for each lower savepoint. b) In transaction::fullyCommitted(), before calling writeComplete(), I could add a new routine that thaws each record in the transaction, if needed. This is done by the gopher thread so it is probably a better location than Transaction::commit(). I think (b) is the best choice since it keeps statement latency low by not thawing some stuff at commit time. Letting the gopher threads do the extra required thawing is the best choice. This problem was caused by an unthawable record. We have had this discussion before, but I believe that Falcon should prevent any case of an unthawable record. So I will aslo add new ASSERTs into RecordVersion::hasRecord() and RecordVersion::getRecordData() so that they no longer fail to thaw anything that needs to be thawed.
[20 May 2009 4:47]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/74535 2708 Kevin Lewis 2009-05-19 Bug#44946 - ASSERT(vector[index] < size); is hit. -------------------------------- Record.cpp & Record.h Protect Record::highWater from being incremented in lock-step with another thread also running Record::getEncodedValue(). Do this by using a local copy of highWater and updating it by CAS only if it is higher than what is there. We assume that multiple threads can update the vector at the same time since they are all writing the same values to the vector. If multiple threads thawing the same record rarely ever happens, there will not be too much performance cost due to cache line bounces. And this patch allows it safely. In order to use CAS, Record::highWater must change from a short to an INTERLOCK_TYPE. Change the assert to a FATAL message if (vector[index] > size - getSize()). The size includes the bytes used by the Record or RecordVersion object. In Record::setEncodedRecord(), 'size' only needs to be adjusted for non-interlocked calls. The interlocked calls are done during a thaw. When a record is chilled, the size variable is left the same. -------------------------------- RecordVersion.cpp & SRLUpdateRecords.cpp RecordVersion::thaw() should also check the virtualOffset before calling Transaction::thaw(). And if the Transaction cannot thaw, the virtualOffset should be immediately set to zero. Add ASSERTs to RecordVersion::hasRecord() and RecordVersion::getRecordData() to make sure that there are no more 'unthawable' records. -------------------------------- Transaction.cpp & Transaction.h In order to prevent 'unthawable' records, make sure all records attached to a transaction are thawed by the gopher thread when the transaction is fully complete. Do this by adding a ne function; Transaction::thawAll()
[20 May 2009 17:08]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/74629 2709 Kevin Lewis 2009-05-20 Bug#44946 Code review changes; It is not necessary to us CAS to update the Record::highWater. Explained in comments. Added other comments and compiler warning fixes.