Bug #34174 Infinite loop checking rolled back record in select for update
Submitted: 30 Jan 2008 19:43 Modified: 15 May 2009 17:06
Reporter: Ann Harrison
Status: Closed
Category: MySQL Server: Falcon storage engine Severity: S3 (Non-critical)
Version: 6.0-falcon-team OS: Any
Assigned to: Kevin Lewis
Tags: F_ISOLATION
Triage: Triaged: D1 (Critical) / R2 (Low) / E2 (Low)

[30 Jan 2008 19:43] Ann Harrison
Description:
Under some circumstances, a rolled back record appears not
to be removed.  The code in Table::fetchForUpdate that checks
records goes into a loop around line 3387.

The script below runs T1's first actions alone, then
in another connection T2's actions.  T2 stalls on the
insert of (0,3) waiting for T1 to complete.  While it
is stalled, run T1's next set of actions.  When T1
rolls back, the T2 thread goes into the infinite loop.
Set a breakpoint on the break statement that follows these
case labels in the switch on state

			case WasActive:
			case RolledBack:
				break;

to avoid losing the machine.

How to repeat:
T1:

set @@autocommit=0;
create database db62;
use db62;
drop table if exists x1;
create table x1 (x1 int primary key, x2 int) engine=falcon;
set transaction isolation level serializable;
start transaction;
insert into x1 values (0,0);

T2:

set @@autocommit=1;
use db62;
insert into x1 values (1,1);
insert into x1 values (0,3);
update x1 set x1 = 0, x2 = 5;
insert into x1 values (0,6);

T1:

update x1 set x1 = 1, x2 = 4;
rollback;
[30 Jan 2008 20:49] Ann Harrison
You can (and should) remove the
  set transaction isolation level serializable
statement from the script.  It's an artifact of
an older problem.
[31 Jan 2008 1:02] Godofredo Miguel Solorzano
Thank you for the bug report.
[31 Jan 2008 6:03] Kevin Lewis
Jim submitted the following patch.  I reviewed and tested it.

ChangeSet@1.2790, 2008-01-30 14:51:13-05:00, jas@rowvwade. +1 -0
  Clear RecordVersion::superceded bit when backing out
  a failed update.

  storage/falcon/Table.cpp@1.38, 2008-01-30 14:51:05-05:00, jas@rowvwade. +6 -0
    Clear RecordVersion::superceded bit when backing out
    a failed update.

diff -Nrup a/storage/falcon/Table.cpp b/storage/falcon/Table.cpp
--- a/storage/falcon/Table.cpp  2008-01-28 15:01:56 -06:00
+++ b/storage/falcon/Table.cpp  2008-01-30 13:51:05 -06:00
@@ -1189,6 +1189,9 @@ void Table::update(Transaction * transac

                if (record)
                        {
+                       if (record->priorVersion)
+                               record->priorVersion->setSuperceded(false);
+
                        if (record->state == recLock)
                                record->deleteData();

@@ -3034,6 +3037,9 @@ void Table::update(Transaction * transac

                if (record)
                        {
+                       if (record->priorVersion)
+                               record->priorVersion->setSuperceded(false);
+
                        if (record->state == recLock)
                                record->deleteData();
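
A toy model may help show why a stale flag could keep a reader spinning. This is only a sketch and not Falcon's code: priorVersion and setSuperceded() are the names used in the patch, while backOutUpdate, fetchVisible, runScenario, and the retry cap are hypothetical and exist only for illustration. It assumes that a reader re-fetches whenever the head version is still flagged as superseded.

```cpp
#include <cassert>

// Toy stand-in for a record version. Only priorVersion and setSuperceded()
// come from the patch; the rest is hypothetical.
struct RecordVersion {
    RecordVersion* priorVersion = nullptr;
    bool superceded = false;
    void setSuperceded(bool flag) { superceded = flag; }
};

// Hypothetical model of backing out a failed update: the new version is
// dropped and the prior version becomes the head again. With applyFix set,
// the prior version's flag is cleared, as the patch does.
RecordVersion* backOutUpdate(RecordVersion* record, bool applyFix) {
    assert(record != nullptr);
    RecordVersion* prior = record->priorVersion;
    if (applyFix && prior)
        prior->setSuperceded(false);
    return prior;
}

// Hypothetical reader loop: a head that is still flagged as superseded is
// treated as stale and fetched again. Returns the number of fetches needed,
// or -1 if maxFetches is exhausted (the cap stands in for "loops forever").
int fetchVisible(const RecordVersion* head, int maxFetches) {
    for (int fetches = 1; fetches <= maxFetches; ++fetches) {
        if (!head->superceded)
            return fetches;
        // Nothing in this model ever clears the flag, so retrying is futile.
    }
    return -1;
}

// Builds a two-version chain, backs out the newer version, then reads.
int runScenario(bool applyFix) {
    RecordVersion committed;
    RecordVersion update;
    update.priorVersion = &committed;
    committed.setSuperceded(true);  // the update supersedes the old version
    RecordVersion* head = backOutUpdate(&update, applyFix);
    return fetchVisible(head, 1000);
}
```

In this model, runScenario(true) finishes on the first fetch, while runScenario(false) exhausts the retry cap, which is the bounded analogue of the loop described above.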
[1 Feb 2008 16:01] Hakan Küçükyılmaz
Test case for the fix is missing!
[25 Feb 2008 19:39] Kevin Lewis
Patch is in mysql-6.0-release version 6.0.4
[3 Mar 2008 2:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/43290

ChangeSet@1.2585, 2008-03-02 20:17:44-06:00, klewis@klewis-mysql. +3 -0
  Disable falcon_bug_34351_A & falcon_bug_34351_A for bug 34990
  Add testcase for Bug#34174
[12 Mar 2008 23:02] Bugs System
Pushed into 6.0.4-alpha
[2 May 2008 1:32] Paul Dubois
Noted in 6.0.4 changelog.

For Falcon, under some circumstances, a rolled back record could 
appear not to be removed.
[16 Dec 2008 20:44] Hakan Küçükyılmaz
Still fails from time to time with:

falcon_team.falcon_bug_34174   [ pass ]             17
falcon_team.falcon_bug_34174   [ pass ]             17
falcon_team.falcon_bug_34174   [ fail ]

mysqltest: At line 44: query 'UPDATE t1 SET f1 = 1, f2 = 4' failed with wrong errno 1205: 'Lock wait timeout exceeded; try restarting transaction', instead of 1213...
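
For readers comparing the two error numbers in that message: in MySQL's error list, 1205 is ER_LOCK_WAIT_TIMEOUT and 1213 is ER_LOCK_DEADLOCK. A minimal sketch of how a client might tell them apart follows; the LockFailure enum and classifyLockError function are hypothetical names, not part of any MySQL API.

```cpp
#include <cassert>

// Hypothetical classification of the two lock-related failures seen here.
enum class LockFailure { WaitTimeout, Deadlock, Other };

// Maps a MySQL server error number to a lock-failure category.
// 1205 = ER_LOCK_WAIT_TIMEOUT, 1213 = ER_LOCK_DEADLOCK.
constexpr LockFailure classifyLockError(unsigned int code) {
    switch (code) {
        case 1205: return LockFailure::WaitTimeout;
        case 1213: return LockFailure::Deadlock;
        default:   return LockFailure::Other;
    }
}

// Compile-time check that the mapping matches the numbers quoted above.
static_assert(classifyLockError(1205) == LockFailure::WaitTimeout,
              "1205 should classify as a lock wait timeout");
```

The test expected the deadlock category and got the wait-timeout one instead.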
[19 Mar 2009 12:38] Kevin Lewis
Putting this Short Description back to its original cause and setting to 'Documenting'. The original infinite loop was fixed for this bug before it was reopened for the lock wait timeout. Bug#41521 was opened and fixed for that problem, so this bug should be closed.

According to pushbuild xref, the test case for this bug was failing with a timeout quite often until a sleep was added to the test in mid-January. Since then, the test has failed only a few times. I suggest increasing the sleep time.
[15 May 2009 17:06] MC Brown
A note has been added to the 6.0.11 changelog: 

With Falcon tables running concurrent transactions, some transactions may not be rolled back correctly, leading to an infinite loop.