Bug #41665 | Falcon crash in Transaction::waitForTransaction | ||
---|---|---|---|
Submitted: | 21 Dec 2008 17:58 | Modified: | 15 May 2009 16:17 |
Reporter: | Philip Stoev | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S1 (Critical) |
Version: | 6.0-falcon-team | OS: | Any |
Assigned to: | Olav Sandstå | CPU Architecture: | Any |
Tags: | F_TRANSACTION |
[21 Dec 2008 17:58]
Philip Stoev
[21 Dec 2008 19:41]
Philip Stoev
Grammar file for bug 41665
Attachment: bug41665.yy (application/octet-stream, text), 1.44 KiB.
[21 Dec 2008 19:47]
Philip Stoev
To reproduce within 5 min:

$ perl runall.pl \
  --basedir=/build/bzr/6.0-falcon-team \
  --grammar=bug41665.yy \
  --gendata=conf/transactions.zz \
  --engine=Falcon \
  --queries=100000 \
  --mysqld=--falcon-lock-wait-timeout=1 \
  --mysqld=--log-output=file \
  --mysqld=--falcon-record-memory-max=1G \
  --mysqld=--falcon-page-cache-size=1G \
  --mysqld=--skip-safemalloc \
  --mem \
  --duration=360 \
  --mysqld=--max-connections=2048 \
  --threads=50

The queries involved are simple UPDATEs that update records from a 10-row table over and over again.
[28 Jan 2009 5:28]
Kevin Lewis
Olav, I was running the following RQG script today, trying to reproduce 42340, and I hit this crash (on Windows, dual core).

perl runall.pl \
  --mysqld=--falcon-page-size=16K \
  --rows=1000 \
  --threads=4 \
  --mask=1487 \
  --queries=1000000 \
  --duration=300 \
  --basedir=C:\Work\bzr\Chg-09\mysql-6.0-falcon-team \
  --engine=Falcon \
  --grammar=conf/combinations.yy \
  --gendata=conf/combinations.zz \
  --reporter=ErrorLog,Backtrace \
  --mysqld=--log-output=none

Partial call stack:

Transaction::waitForTransaction(Transaction * transaction=0x0420cc40, unsigned int transId=0, bool * deadlock=0x0739b3a3) Line 983 + 0xe bytes C++
Transaction::getRelativeState(Transaction * transaction=0x0420cc40, unsigned int transId=21002, unsigned int flags=1) Line 853 C++
Transaction::getRelativeState(Record * record=0x069b4e40, unsigned int flags=1) Line 808 + 0x39 bytes C++
Table::checkUniqueRecordVersion(int recordNumber=2915, Index * index=0x04177eb0, Transaction * transaction=0x04211238, RecordVersion * record=0x069b4f90, Sync * syncUnique=0x0739df5c) Line 2625 + 0xe bytes C++
Table::checkUniqueIndex(Index * index=0x04177eb0, Transaction * transaction=0x04211238, RecordVersion * record=0x069b4f90, Sync * sync=0x0739df5c) Line 2510 + 0x1f bytes C++
Table::insertIndexes(Transaction * transaction=0x04211238, RecordVersion * record=0x069b4f90) Line 1268 + 0x18 bytes C++
Table::insert(Transaction * transaction=0x04211238, Stream * stream=0x040e1800) Line 3056 C++
StorageDatabase::insert(Connection * connection=0x0419d0f8, Table * table=0x040bf878, Stream * stream=0x040e1800) Line 267 C++
StorageTable::insert() Line 109 + 0x28 bytes C++

So when this call was made, the record had a transaction attached.

storage\falcon\Transaction.cpp(808): State state = getRelativeState(record->getTransaction()

But by the time it got to the crash site, the record->transaction was null and the transaction memory location was freed. That makes this pretty much a duplicate of Bug#41357.
A transaction can get purged just after its address has been read from the RecordVersion object. The email chain 'Problems with record visibility and how it is computed' on falcon@lists.mysql.com addresses this problem.
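To make the window described above concrete, here is a minimal sketch of one general way to close it: the reader takes an owning reference under the same lock the purger uses, so the object cannot be freed between reading the pointer and using it. This is not Falcon's code. `Txn`, `RecordVersionSketch`, `pinTransaction` and `detachTransaction` are hypothetical names invented for the illustration, and `std::shared_ptr` stands in for whatever reference counting the engine actually uses.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a transaction object; not Falcon's Transaction class.
struct Txn {
    explicit Txn(unsigned i) : id(i) {}
    unsigned id;
};

// Hypothetical stand-in for a record version that points at its transaction.
class RecordVersionSketch {
public:
    explicit RecordVersionSketch(std::shared_ptr<Txn> t) : txn_(std::move(t)) {}

    // Reader side: copy the owning pointer while holding the lock.
    // The copy keeps the Txn alive even if it is detached right afterwards.
    std::shared_ptr<Txn> pinTransaction() const {
        std::lock_guard<std::mutex> guard(lock_);
        return txn_;
    }

    // Purge side: drop the record's reference under the same lock.
    void detachTransaction() {
        std::lock_guard<std::mutex> guard(lock_);
        txn_.reset();
    }

private:
    mutable std::mutex lock_;
    std::shared_ptr<Txn> txn_;
};
```

With this shape, a caller that pinned the transaction before a purge can still read its fields afterwards, while a caller that arrives after the purge gets a null pointer and has to handle that case explicitly.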
[27 Mar 2009 18:13]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/70735

3087 Olav Sandstaa 2009-03-27

Fix for Bug #41665 Falcon crash in Transaction::waitForTransaction

This crash occurred in the deadlock detector code. The current thread A is traversing the waitingFor list while another thread C has just aborted (or possibly committed) and is releasing its reference count on the transaction state object. A third thread, B, that is just in front of C in the waitingFor list is woken up and also releases its reference count on C. If A is inspecting object C when this happens, the object is deleted, which can result in thread A crashing.

The fix to Transaction::waitForTransaction(): Before releasing the reference count on the transaction state object we acquire an exclusive lock on the active transaction list. This prevents the transaction state object from being released and deleted while another thread is traversing the waitingFor list.

Also added a missing call to transState->release() in the catch block.

@ storage/falcon/Transaction.cpp
Fix to Transaction::waitForTransaction(): Before releasing the reference count on the transaction state object we acquire an exclusive lock on the active transaction list. The reason is that, during deadlock detection, another thread might have used this transaction state object's waitingFor pointer to navigate to the next transaction state object, and this thread might be the only one that holds a reference count on that transaction state object. This prevents the transaction state object from being released and deleted while another thread is traversing the waitingFor list.
[28 Mar 2009 10:44]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/70762

3084 Olav Sandstaa 2009-03-28

Fix for Bug #41665 Falcon crash in Transaction::waitForTransaction

This crash occurred in the deadlock detector code. The current thread A is traversing the waitingFor list while another thread C has just aborted (or possibly committed) and is releasing its reference count on the transaction state object. A third thread, B, that is just in front of C in the waitingFor list is woken up and also releases its reference count on C. If A is inspecting object C when this happens, the object is deleted, which can result in thread A crashing.

The fix to Transaction::waitForTransaction(): Before releasing the reference count on the transaction state object we acquire an exclusive lock on the active transaction list. This prevents the transaction state object from being released and deleted while another thread is traversing the waitingFor list.

Also added a missing call to transState->release() in the catch block.

@ storage/falcon/Transaction.cpp
Fix to Transaction::waitForTransaction(): Before releasing the reference count on the transaction state object we acquire an exclusive lock on the active transaction list. The reason is that, during deadlock detection, another thread might have used this transaction state object's waitingFor pointer to navigate to the next transaction state object, and this thread might be the only one that holds a reference count on that transaction state object. This prevents the transaction state object from being released and deleted while another thread is traversing the waitingFor list.
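The locking rule in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the committed patch: `TransState`, `activeListLock`, `wouldDeadlock`, `releaseAfterWait` and the `liveStates` counter are all hypothetical names, and the sketch assumes the deadlock detector walks the waitingFor chain while holding the active transaction list lock in shared mode.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-ins; none of these are Falcon's real declarations.
static std::shared_mutex activeListLock;  // plays the role of the active transaction list lock
static std::atomic<int> liveStates{0};    // demonstration-only count of live objects

struct TransState {
    explicit TransState(unsigned i) : id(i) { ++liveStates; }
    ~TransState() { --liveStates; }

    void addRef() { ++useCount; }
    void release() { if (--useCount == 0) delete this; }

    unsigned id;
    TransState* waitingFor = nullptr;     // next link in the wait-for chain
    std::atomic<int> useCount{1};         // the creator holds the first reference
};

// Deadlock detector: follows waitingFor links under a shared lock, so no link
// can be released and deleted by releaseAfterWait() during the walk.
// Assumes any cycle in the chain passes through `self`.
bool wouldDeadlock(const TransState* self, const TransState* target) {
    std::shared_lock<std::shared_mutex> guard(activeListLock);
    for (const TransState* t = target; t != nullptr; t = t->waitingFor)
        if (t == self)
            return true;
    return false;
}

// Waiter side: clear the link and drop the reference while holding the
// exclusive lock, which excludes every concurrent chain walk.
void releaseAfterWait(TransState* waiter) {
    std::unique_lock<std::shared_mutex> guard(activeListLock);
    TransState* target = waiter->waitingFor;
    waiter->waitingFor = nullptr;
    if (target != nullptr)
        target->release();
}
```

A waiter would call `addRef()` on its target before setting `waitingFor`, and `releaseAfterWait()` once it wakes up. Because the final `release()` only happens under the exclusive lock, a detector thread inside `wouldDeadlock()` never dereferences a freed object.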
[2 Apr 2009 17:38]
Bugs System
Pushed into 6.0.11-alpha (revid:hky@sun.com-20090402144811-yc5kp8g0rjnhz7vy) (version source revid:olav@sun.com-20090328084403-ojb0712tyxlemko4) (merge vers: 6.0.11-alpha) (pib:6)
[15 May 2009 16:17]
MC Brown
Internal/test fix. No changelog entry required.