Bug #43504 Falcon DBT2 crash in Table::rollbackRecord()
Submitted: 9 Mar 2009 12:39 Modified: 15 May 2009 16:13
Reporter: Christopher Powers Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S1 (Critical)
Version:6.0 OS:Any
Assigned to: Christopher Powers CPU Architecture:Any
Tags: F_BACKLOG
Triage: Triaged: D1 (Critical)

[9 Mar 2009 12:39] Christopher Powers
Description:
DBT2 test crash found by Kelly Long. This is with current Falcon tree as of a few days ago. It occurs frequently and thus DBT2 cannot complete.

GDB session output:

[Switching to Thread 0x7ffec6ebb950 (LWP 18932)]
Breakpoint 1, Table::rollbackRecord (this=0x7ffff6d54d48,
    recordToRollback=0xba37d278, transaction=0x7ffec062fba0) at Table.cpp:1075
1075    recordToRollback->printRecord("Table::rollbackRecord
failed");
Missing separate debuginfos, use: debuginfo-install glibc-2.9-3.x86_64
libgcc-4.3.2-7.x86_64 libstdc++-4.3.2-7.x86_64
(gdb) where
#0  Table::rollbackRecord (this=0x7ffff6d54d48, recordToRollback=0xba37d278,
    transaction=0x7ffec062fba0) at Table.cpp:1075
#1  0x00000000008f344e in Transaction::rollbackSavepoint (
    this=0x7ffec062fba0, savePointId=4) at Transaction.cpp:1274
#2  0x00000000008d21bd in StorageDatabase::savepointRollback (
    this=<value optimized out>, connection=<value optimized out>, savePoint=4)
    at StorageDatabase.cpp:507
#3  0x00000000008d05b6 in StorageConnection::rollbackVerb (this=0x9acb20f8)
    at StorageConnection.cpp:373
#4  0x00000000008cd223 in StorageInterface::rollback (
    hton=<value optimized out>, thd=0x7ffec82232a0, all=false)
    at ha_falcon.cpp:1326
#5  0x00000000007a8fde in ha_rollback_trans (thd=0x7ffec82232a0, all=false)
    at handler.cc:1297
#6  0x000000000084533d in trans_rollback_stmt (thd=0x7ffff6d54d78)
    at transaction.cc:263
#7  0x00000000006b9908 in dispatch_command (command=<value optimized out>,
    thd=0x7ffec82232a0, packet=0x0, packet_length=2118888456)
    at sql_parse.cc:1383
#8  0x00000000006acd90 in handle_one_connection (arg=<value optimized out>)
    at sql_connect.cc:1146
#9  0x000000344c6073da in start_thread () from /lib64/libpthread.so.0
#10 0x000000344bae62bd in clone () from /lib64/libc.so.6
(gdb) p priorRecord
$1 = (class Record *) 0x7ffecf2395d0
(gdb) p *priorRecord
$2 = {_vptr.Record = 0xd3f6f0, data = {record = 0xc719f8b0 "\022"},
  useCount = 3, format = 0x7ffff6a14f60, recordNumber = 482100, size = 94,
  generation = 50, highWater = 7, encoding = 4 '\004', state = 0 '\0'}
(gdb) p priorState
$3 = 4
(gdb) p *(RecordVersion *)priorRecord
$4 = {<Record> = {_vptr.Record = 0xd3f6f0, data = {
      record = 0xc719f8b0 "\022"}, useCount = 3, format = 0x7ffff6a14f60,
    recordNumber = 482100, size = 94, generation = 50, highWater = 7,
    encoding = 4 '\004', state = 0 '\0'}, priorVersion = 0x0,
  virtualOffset = 0, transaction = 0x0, nextInTrans = 0x0, prevInTrans = 0x0,
  transactionId = 0, savePointId = 0, superceded = false}
(gdb) p * recordToRollBack
No symbol "recordToRollBack" in current context.
(gdb) p recordToRollback
$5 = (class RecordVersion *) 0xba37d278
(gdb) p *recordToRollback
$6 = {<Record> = {_vptr.Record = 0xd3fdf0, data = {record = 0x0},
    useCount = 1, format = 0x7ffff6a14f60, recordNumber = 482100, size = 112,
    generation = 50, highWater = 2, encoding = 0 '\0', state = 6 '\006'},
  priorVersion = 0x7ffecf2395d0, virtualOffset = 0,
  transaction = 0x7ffec062fba0, nextInTrans = 0x0, prevInTrans = 0x0,
  transactionId = 11258, savePointId = 4, superceded = false}
(gdb) help running
Running the program.

(gdb) l
1070    if (priorRecord == NULL && priorState == recDeleted)
1071    return;
1072
1073    // The store of this record into the record leaf failed.
No way to recover.
1074
1075    recordToRollback->printRecord("Table::rollbackRecord
failed");
1076    //insert(priorRecord, recordToRollback,
recordToRollback->recordNumber);
1077    ASSERT(false);
1078    }
1079
(gdb) where
#0  Table::rollbackRecord (this=0x7ffff6d54d48, recordToRollback=0xba37d278,
    transaction=0x7ffec062fba0) at Table.cpp:1075
#1  0x00000000008f344e in Transaction::rollbackSavepoint (
    this=0x7ffec062fba0, savePointId=4) at Transaction.cpp:1274
#2  0x00000000008d21bd in StorageDatabase::savepointRollback (
    this=<value optimized out>, connection=<value optimized out>, savePoint=4)
    at StorageDatabase.cpp:507
#3  0x00000000008d05b6 in StorageConnection::rollbackVerb (this=0x9acb20f8)
    at StorageConnection.cpp:373
#4  0x00000000008cd223 in StorageInterface::rollback (
    hton=<value optimized out>, thd=0x7ffec82232a0, all=false)
    at ha_falcon.cpp:1326
#5  0x00000000007a8fde in ha_rollback_trans (thd=0x7ffec82232a0, all=false)
    at handler.cc:1297
#6  0x000000000084533d in trans_rollback_stmt (thd=0x7ffff6d54d78)
    at transaction.cc:263
#7  0x00000000006b9908 in dispatch_command (command=<value optimized out>,
    thd=0x7ffec82232a0, packet=0x0, packet_length=2118888456)
    at sql_parse.cc:1383
#8  0x00000000006acd90 in handle_one_connection (arg=<value optimized out>)
    at sql_connect.cc:1146
#9  0x000000344c6073da in start_thread () from /lib64/libpthread.so.0
#10 0x000000344bae62bd in clone () from /lib64/libc.so.6
(gdb) l
1080    if (!priorRecord && recordToRollback->recordNumber >= 0)
1081    deleteRecord(recordToRollback, transaction);
1082
1083    garbageCollect(recordToRollback, priorRecord, transaction, true);
1084
1085    if (backloggedRecords)
1086    deleteRecordBacklog(recordToRollback->recordNumber);
1087
1088    if (priorRecord)
1089    priorRecord->release(REC_HISTORY);
(gdb) frame 1
#1  0x00000000008f344e in Transaction::rollbackSavepoint (
    this=0x7ffec062fba0, savePointId=4) at Transaction.cpp:1274
1274    rec->rollback(this);
(gdb) l
1269    while (stack)
1270    {
1271    RecordVersion *rec = stack;
1272    stack = rec->nextInTrans;
1273    rec->nextInTrans = NULL;
1274    rec->rollback(this);
1275    SET_RECORD_ACTIVE(rec, false);
1276    rec->transaction = NULL;
1277    rec->release(REC_HISTORY);
1278    }
(gdb) frame 0 l
No symbol "l" in current context.
(gdb) frame 0
#0  Table::rollbackRecord (this=0x7ffff6d54d48, recordToRollback=0xba37d278,
    transaction=0x7ffec062fba0) at Table.cpp:1075
1075    recordToRollback->printRecord("Table::rollbackRecord
failed");
(gdb) l
1070    if (priorRecord == NULL && priorState == recDeleted)
1071    return;
1072
1073    // The store of this record into the record leaf failed.
No way to recover.
1074
1075    recordToRollback->printRecord("Table::rollbackRecord
failed");
1076    //insert(priorRecord, recordToRollback,
recordToRollback->recordNumber);
1077    ASSERT(false);
1078    }
1079
(gdb) help jump

[...]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffec7082950 (LWP 18918)]
0x00000000008e94bb in Table::rollbackRecord (this=0x957caf,
    recordToRollback=0x7ffe00000001, transaction=0x95a783) at Table.cpp:1068
1068    if (!insertIntoTree(priorRecord, recordToRollback,
recordToRollback->recordNumber))
(gdb) c

How to repeat:
Run DBT2. Error will occur within seconds.

Suggested fix:
The problem appears to be caused by backlogging a locked record.

The backlogger calls Table::insertIntoTree() to replace the base record of the backlogged record with NULL in the record tree. It is possible that the base record was replaced by a lock record during this process.

Locked record chains should never be removed from the record tree, and should thus never be backlogged.

When backlogging was disabled, the problem did not occur.
[9 Mar 2009 12:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/68632

3057 Christopher Powers	2009-03-09
      Bug #43504 "Falcon DBT2 crash in Table::rollbackRecord()"
      
      It is possible that the backlogger may remove a locked record
      chain from the record tree. Disable backlogging until
      this is corrected.
      
      Also made a minor, somewhat unrelated fix in Table::allocRecord()
      and Table::allocRecordVersion().
[19 Mar 2009 23:52] Christopher Powers
Decoupling record metadata from the record cache (e.g. Bug#32838) obviates backlogging in most--but not all--cases. Backlogging will remain disabled for now, possibly even for beta.

Backlogging issue are tracked in Bug#39940, so this bug can be closed.
[30 Mar 2009 13:15] Kevin Lewis
It was shown on Kelly's DBT2 run that backlogging was causing the condition in this bug.  Now, backlogging has been turned off, the transactionState is separate from Transactions, and there is a CycleManager protecting doomed Record objects from disappearing while threads have pointers to it.  Please open a new bug with the current call stack and new steps to reproduce.  Refer back to this bug if the stacks are similar.
[2 Apr 2009 17:38] Bugs System
Pushed into 6.0.11-alpha (revid:hky@sun.com-20090402144811-yc5kp8g0rjnhz7vy) (version source revid:christopher.powers@sun.com-20090309121214-svzrpkx27gd6l9mu) (merge vers: 6.0.11-alpha) (pib:6)
[15 May 2009 16:13] MC Brown
Internal/test fix. No changelog entry required.