Bug #42828 Falcon crash in Table::checkUniqueRecordVersion - transaction = 0x0
Submitted: 13 Feb 2009 12:40 Modified: 15 May 2009 15:10
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S1 (Critical)
Version:6.0-falcon-team OS:Any
Assigned to: Kevin Lewis CPU Architecture:Any
Tags: F_RECORD TREE

[13 Feb 2009 12:40] Philip Stoev
Description:
When executing the falcon_recovery RQG test, Falcon crashed as follows:

#4  <signal handler called>
#5  0x0856af1f in interlockedIncrement (ptr=0x2e0) at Interlock.h:287
#6  0x0857d432 in Transaction::addRef (this=0x0) at Transaction.cpp:1033
#7  0x08574e8a in Table::checkUniqueRecordVersion (this=0xb727ed40, 
    recordNumber=4347, index=0xb6f6ebb8, transaction=0x94a2c840, 
    record=0x95036f88, syncUnique=0x977819f8) at Table.cpp:2695
#8  0x08575023 in Table::checkUniqueIndex (this=0xb727ed40, index=0xb6f6ebb8, 
    transaction=0x94a2c840, record=0x95036f88, sync=0x977819f8)
    at Table.cpp:2512
#9  0x085751b0 in Table::insertIndexes (this=0xb727ed40, 
    transaction=0x94a2c840, record=0x95036f88) at Table.cpp:1271
#10 0x085753b1 in Table::insert (this=0xb727ed40, transaction=0x94a2c840, 
    stream=0xa5a9d62c) at Table.cpp:3059
#11 0x0855c157 in StorageDatabase::insert (this=0xb6f1c158, 
    connection=0xb6f46388, table=0xb727ed40, stream=0xa5a9d62c)
    at StorageDatabase.cpp:266
#12 0x085622f0 in StorageTable::insert (this=0xa5a98100)
    at StorageTable.cpp:109
#13 0x0854efdf in StorageInterface::write_row (this=0xb6ecaa8, 
    buff=0xb6ece90 "ÿ", ' ' <repeats 199 times>...) at ha_falcon.cpp:1164
#14 0x083fc9c7 in handler::ha_write_row (this=0xb6ecaa8, 
    buf=0xb6ece90 "ÿ", ' ' <repeats 199 times>...) at handler.cc:5493
#15 0x0836caae in write_record (thd=0xb5de330, table=0xb6ee268, info=0xb5bf878)
    at sql_insert.cc:1610
#16 0x0836ce7b in select_insert::send_data (this=0xb5bf858, values=@0xb82d958)
    at sql_insert.cc:3164
#17 0x08335b6d in end_send (join=0xb828be0, join_tab=0xb5c0168, 
    end_of_records=false) at sql_select.cc:17211
#18 0x0833ee28 in evaluate_join_record (join=0xb828be0, join_tab=0xb5bffb8, 
    error=0) at sql_select.cc:16499
#19 0x0833f208 in sub_select (join=0xb828be0, join_tab=0xb5bffb8, 
    end_of_records=false) at sql_select.cc:16266
#20 0x0834b6a9 in do_select (join=0xb828be0, fields=0xb82d958, table=0x0, 
    procedure=0x0) at sql_select.cc:15795
#21 0x0836606a in JOIN::exec (this=0xb828be0) at sql_select.cc:2882
#22 0x08360c2f in mysql_select (thd=0xb5de330, rref_pointer_array=0xb5df768, 
    tables=0xb5bf3b0, wild_num=0, fields=@0xb5df6f8, conds=0x0, og_num=0, 
    order=0x0, group=0x0, having=0x0, proc_param=0x0, 
    select_options=3490466304, result=0xb5bf858, unit=0xb5df3cc, 
    select_lex=0xb5df664) at sql_select.cc:3063
#23 0x0836637a in handle_select (thd=0xb5de330, lex=0xb5df370, 
    result=0xb5bf858, setup_tables_done_option=1073741824) at sql_select.cc:315
#24 0x082cf3d1 in mysql_execute_command (thd=0xb5de330) at sql_parse.cc:3179
#25 0x082d4a01 in mysql_parse (thd=0xb5de330, 
    inBuf=0xb5beb50 "INSERT INTO `table100_falcon_int_autoinc` ( `int` ) SELECT `int` FROM `table100_falcon_int_autoinc` LIMIT 60", length=108, 
    found_semicolon=0x97782e80) at sql_parse.cc:5751
#26 0x082d599d in dispatch_command (command=COM_QUERY, thd=0xb5de330, 
    packet=0xb5b2c91 "INSERT INTO `table100_falcon_int_autoinc` ( `int` ) SELECT `int` FROM `table100_falcon_int_autoinc` LIMIT 60", packet_length=108)
    at sql_parse.cc:1009
#27 0x082d6c68 in do_command (thd=0xb5de330) at sql_parse.cc:691
#28 0x082c3e3f in handle_one_connection (arg=0xb5de330) at sql_connect.cc:1146
#29 0x0089a45b in start_thread () from /lib/libpthread.so.0
#30 0x007f1c4e in clone () from /lib/libc.so.6

How to repeat:
This crash appears to be rare.

push:

vvaintroub@mysql... 
2009-02-12 00:05:11 

tree:

6.0-falcon-team
[14 Feb 2009 17:02] Kevin Lewis
The code that fails is this;

if (state == Active)
	{
	// This pending record is not a duplicate but an older version is.
	// Only wait on this record if the duplicate is visible or pending
	// at a savepoint.

	if (!activeTransaction)
		{
		activeTransaction = dup->getTransaction();
		activeTransaction->addRef();  <== crashed here
		}

The background for this code is that we had found a duplicate key value in scanIndex so we know that at least one of the record versions contain a duplicate.  We search the base record and each prior record to fins any and all possible duplicate records and see if they are visible or not.  If they are pending, we should wait for that other transaction to complete.  Here, we found a record version that is active but not a duplicate.  We must need to note the activeTransaction before looking further.  If we later find a duplicate that would be visible if this transaction completed, then we must wait on that active transaction.

What happened here is that sometime between the instant that getRelativeState() was called to find that the dup->transaction was still pending and the current instant, the transaction was detached from that recordVersion.  The transaction was probably rolled back since that happens a lot quicker than a commit, writeComplete, and scavenge.

It looks like this code can just safely keep looking for dups if the transaction is returned null here.
[14 Feb 2009 17:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/66348

3018 Kevin Lewis	2009-02-14
      Bug#42828 - The background for this code is that we had found a duplicate key value in scanIndex so we know that at least one of the record versions contain a duplicate.  We search the base record and each prior record to find any and all possible duplicate records and see if
      they are visible or not.  If they are pending, we should wait for that other transaction to complete.  Here, we found a record version that is active but not a duplicate.  We must need to note the activeTransaction before looking further.  If we later find a duplicate that would be visible if this transaction completed, then we must wait on that active transaction.
      
      What happened here is that sometime between the instant that getRelativeState() was called to find that the dup->transaction was still pending and the current instant, the transaction was detached from that recordVersion.  The transaction was probably rolled back since that happens a lot quicker than a commit, writeComplete, and scavenge.
      
      This code can just safely keep looking for dups if the transaction is returned null here.
[2 Mar 2009 14:12] Bugs System
Pushed into 6.0.11-alpha (revid:alik@sun.com-20090302140208-lfdejjbcyezlhhjt) (version source revid:vvaintroub@mysql.com-20090214213452-djoygjmxbmbgp8hj) (merge vers: 6.0.10-alpha) (pib:6)
[15 May 2009 15:10] MC Brown
Internal/test fix. No changelog entry required.