Bug #44721 | falcon_chill_thaw_compare crashes in EncodedDataStream::decode | ||
---|---|---|---|
Submitted: | 7 May 2009 14:08 | Modified: | 26 May 2010 17:47 |
Reporter: | Olav Sandstå | Email Updates: | |
Status: | Unsupported | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S2 (Serious) |
Version: | 6.0.12-alpha | OS: | Any |
Assigned to: | Kevin Lewis | CPU Architecture: | Any |
Tags: | F_CHILL THAW |
[7 May 2009 14:08]
Olav Sandstå
[7 May 2009 14:11]
Olav Sandstå
Note that this crash is very similar to the crash in bug#43146. The only difference in the call stack is that in this case the access to illegal data occurs in EncodedDataStream::decode() instead of in EncodedDataStream::skip().
[7 May 2009 15:18]
Kevin Lewis
I am in the process of making changes to the chill/thaw code so that data.record buffers are atomically identified as being thawed. In addition, becuase a newly thawed record can be chilled again immediately, the functions involved in thawing return the buffer that was thawed up the call stack to where it is needed. This seems to be the conflict going on here. A data buffer is not what it is expected to be.
[8 May 2009 19:44]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/73693 2702 Kevin Lewis 2009-05-08 Bug#43344 & Bug#44721 This series of changes closes many gaps in which a data buffer can get thawed and then re-chilled almost immediatly which can cause the use of a bad data buffer pointer. For 44344, it makes the act of chilling and thawing records an atomic operation which uses CAS. Each call to thaw returns a valid pointer to the buffer that was thawed so that it can be used during the current cycle as long as the caller has a CycleLock. Record.h, Record.cpp, RecordVersion.h & RecordVersion.cpp; The most significant change is that when a data buffer is chilled, it recieves the value (-1) and this happens atomically. Record::state == recChilled is no longer used. The record.data pointer is now always read once, checked, and returned up the call stack to where it is used. hasRecord() is changed from an inline function to a virtual function for both Record and RecordVersion. hasRecord() and isChilled() are added so that the record can check data.record itself. setEncodedRecord() and thaw() return the pointer to the buffer that they just created and put into data.record so that the caller does not have to rely on that buffer being there later. getAllocatedSize() is added so that the number of bytes in a buffer holding the record can be consistently known. It is used in several places. chillData() is similar to and replaces deleteData() during SRLUpdateRecords::chill isDeleted() is also added so that the common (state == recDeleted) check can be done by the record itself. This is not fully implemented yet because we need to get away from the over-use of these state flags. They are somewhat unreliable on multiple CPUs unless they are set with CAS. getRecordData() is also changed from an inline to a virtual function. findRecordData() is like getRecordData but without the thaw. It is used in three places along the call stack for thawing a record to check if another thread has already thawed that record. This is not likely now because of the use of syncThaw. But syncThaw may no longer be necessary. SRLUpdateRecords.h & SRLUpdateRecords.cpp; thaw() returns the data buffer thawed in addition to the number of bytes reallocated. chill() is not sent the dataLength anymore since it calculates that itself. SerialLog.h, SerialLog.cpp, transaction.h, transaction.cpp; Since the number of bytes chilled and thawed stored in the SerialLog and Transaction were being maintained but not used, I deleted those variables and code used to maintain them. The number of allocated bytes associated with a Transaction is stored in totalRecorddata and is still maintained. transaction.cpp Even though backlogging is not currently active, the choice whether to backlog any particular record needed to be enhanced. Along with replacing hasRecord(false) with isChilled(), it should avoid backlogging record chains that do not start at the base record. That is checked with isSuperceded(). Also, backlogging should occur for deleted records as well as newly chilled records.
[12 May 2009 14:36]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/73828 2702 Kevin Lewis 2009-05-12 Bug#43344 & Bug#44721 Second patch (comprehensive) with review changes This series of changes closes many gaps in which a data buffer can get thawed and then re-chilled almost immediatly which can cause the use of a bad data buffer pointer. For 44344, it makes the act of chilling and thawing records an atomic operation which uses CAS. Each call to thaw returns a valid pointer to the buffer that was thawed so that it can be used during the current cycle as long as the caller has a CycleLock. Record.h, Record.cpp, RecordVersion.h & RecordVersion.cpp; The most significant change is that when a data buffer is chilled, it recieves the value (-1) and this happens atomically. Record::state == recChilled is no longer used. The record.data pointer is now always read once, checked, and returned up the call stack to where it is used. hasRecord() is changed from an inline function to a virtual function for both Record and RecordVersion. hasRecord() and isChilled() are added so that the record can check data.record itself. setEncodedRecord() and thaw() return the pointer to the buffer that they just created and put into data.record so that the caller does not have to rely on that buffer being there later. getEncodedValue() uses the recordData pointer of the caller instead of getting it again from data.record. getAllocatedSize() is added so that the number of bytes in a buffer holding the record can be consistently known. It is used in several places. chillData() is similar to and replaces deleteData() during SRLUpdateRecords::chill isDeleted() is also added so that the common (state == recDeleted) check can be done by the record itself. This is not fully implemented yet because we need to get away from the over-use of these state flags. They are somewhat unreliable on multiple CPUs unless they are set with CAS. getRecordData() is also changed from an inline to a virtual function. findRecordData() is like getRecordData but without the thaw. It is used in three places along the call stack for thawing a record to check if another thread has already thawed that record. This is not likely now because of the use of syncThaw. But syncThaw may no longer be necessary. isAllocated() is an inline function used about 6 times to determine if a pointer is real or not. SRLUpdateRecords.h & SRLUpdateRecords.cpp; thaw() returns the data buffer thawed in addition to the number of bytes reallocated. chill() is not sent the dataLength anymore since it calculates that itself. SerialLog.h, SerialLog.cpp, transaction.h, transaction.cpp; Since the number of bytes chilled and thawed stored in the SerialLog and Transaction were being maintained but not used, I deleted those variables and code used to maintain them. The number of allocated bytes associated with a Transaction is stored in totalRecorddata and is still maintained. transaction.cpp Even though backlogging is not currently active, the choice whether to backlog any particular record needed to be enhanced. Along with replacing hasRecord(false) with isChilled(), it should avoid backlogging record chains that do not start at the base record. That is checked with isSuperceded(). Also, backlogging should occur for deleted records as well as newly chilled records.
[12 May 2009 22:59]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/73882 2704 Kevin Lewis 2009-05-12 Code Cleanup and compile warnings Fix the previous patch for Bug#43344 & Bug#44721 that made a call to getAllocatedSize() from inside a thaw. This caused a recursive loop and a stack overflow.