Bug #28165 | Falcon: hang after select for update | ||
---|---|---|---|
Submitted: | 30 Apr 2007 15:35 | Modified: | 3 May 2007 10:51 |
Reporter: | Peter Gulutzan | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S2 (Serious) |
Version: | 6.0.0-alpha-debug | OS: | Linux (SUSE 10 64-bit) |
Assigned to: | Kevin Lewis | CPU Architecture: | Any |
[30 Apr 2007 15:35]
Peter Gulutzan
[30 Apr 2007 20:12]
Hakan Küçükyılmaz
Verified as described. Added test case falcon_bug_28165.test which hangs with 100% CPU usage. Best regards, Hakan
[2 May 2007 3:40]
Kevin Lewis
I debugged this problem today and I may have a solution. This is a very sticky problem.

The hang happens because there is a recordVersion belonging to a previously rolled-back transaction. The recordVersion is a lock record. The transaction no longer exists, but the recordVersion still has a pointer to it. In Table::fetchForUpdate there is a for(;;) loop with a switch (state) in it that does a default: Log::debug("Table::fetchForUpdate: unexpected state %d\n", state); if the state is not recognized. This causes an infinite loop. The default case should throw an exception to prevent the hang.

But the real problem is the leftover lock record after the rollback. After stepping through this over and over, I finally realized that Transaction::rollback should roll back records from newest to oldest, just like Transaction::rollbackSavepoint() does. That function pulls the recordVersions off of Transaction::records and stacks them in the opposite order, from newest to oldest. In this manner, Table::insert() can back up the most recent version of the record one at a time.

The reason this is showing up now is the introduction of lock records. Before this, by the time a Transaction::rollback occurred, there was only one version of a record on the transaction. Now, there can be a lock record before a pending record version.
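To illustrate the first point (the default case that only logs and therefore spins forever), here is a minimal sketch of a retry loop whose default branch throws instead. This is not Falcon's code: `LockState`, `fetchForUpdateSketch`, and the `maxSpins` limit are hypothetical names invented for the example, and the real Table::fetchForUpdate handles more states than shown.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the transaction states the loop switches on.
enum class LockState { Committed, Active, RolledBack, Unknown = 99 };

// Sketch of a for(;;) + switch retry loop. Returns true when the record can
// be locked, false when the owning transaction stays active for too long,
// and throws when the state is not recognized (instead of logging and
// looping again, which is what produced the hang).
bool fetchForUpdateSketch(LockState state, int maxSpins = 3) {
    for (int spin = 0;; ++spin) {
        switch (state) {
        case LockState::Committed:
        case LockState::RolledBack:
            return true;  // nothing is holding the record any more
        case LockState::Active:
            if (spin >= maxSpins)
                return false;  // assumed give-up policy for this sketch
            continue;          // retry: continues the enclosing for loop
        default:
            throw std::runtime_error(
                "fetchForUpdate: unexpected state " +
                std::to_string(static_cast<int>(state)));
        }
    }
}
```

With the throwing default, a stale state surfaces as an error to the caller rather than as 100% CPU usage; it does not fix the leftover lock record itself.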
[2 May 2007 6:18]
Kevin Lewis
Pushed the change that re-sorts the pending records so that they are rolled back from newest to oldest. The test case now completes without hanging and gets the expected results, except that the error message does not identify the correct key value. That problem is reported as Bug#28158.
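The effect of the ordering can be shown with a small model. This is only a sketch under simplifying assumptions, not the pushed change: `VersionSketch`, `TableSketch`, `rollbackSketch`, and `leftoverAfterRollback` are hypothetical names, and the model assumes a version can be backed out only while it is the newest version of its record.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical model of one pending version of one record.
struct VersionSketch {
    int recordId;
    int sequence;  // higher means newer
};

// Hypothetical table: each record has a chain of versions, newest at back().
struct TableSketch {
    std::map<int, std::vector<int>> chains;

    void add(const VersionSketch& v) { chains[v.recordId].push_back(v.sequence); }

    // Assumed rule: only the newest version of a record can be backed out.
    bool backOut(const VersionSketch& v) {
        auto it = chains.find(v.recordId);
        if (it == chains.end() || it->second.empty() ||
            it->second.back() != v.sequence)
            return false;
        it->second.pop_back();
        return true;
    }
};

// Rolls back the pending versions (given oldest to newest) and returns how
// many were left behind in the table.
std::size_t rollbackSketch(TableSketch& table,
                           std::vector<VersionSketch> pending,
                           bool newestFirst) {
    if (newestFirst)
        std::reverse(pending.begin(), pending.end());
    std::size_t leftover = 0;
    for (const auto& v : pending)
        if (!table.backOut(v))
            ++leftover;
    return leftover;
}

// One record with a lock record (sequence 1) followed by a pending
// update (sequence 2), rolled back in the chosen order.
std::size_t leftoverAfterRollback(bool newestFirst) {
    TableSketch table;
    std::vector<VersionSketch> pending = {{1, 1}, {1, 2}};
    for (const auto& v : pending)
        table.add(v);
    return rollbackSketch(table, pending, newestFirst);
}
```

In this model, `leftoverAfterRollback(false)` leaves the lock record behind because it is not the newest version when its turn comes, while `leftoverAfterRollback(true)` removes both versions.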
[2 May 2007 7:07]
Hakan Küçükyılmaz
Test case falcon_bug_28165 now passes, except for the wrong error message, which is Bug#28158. Best regards, Hakan
[3 May 2007 10:51]
MC Brown
A note has been added to the 6.0.0 changelog.