Bug #43145 Falcon crash in Record::getEncoding
Submitted: 24 Feb 2009 13:18 Modified: 15 May 2009 13:27
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S1 (Critical)
Version:6.0-falcon-team OS:Any
Assigned to: Kevin Lewis CPU Architecture:Any
Tags: F_CHILL THAW

[24 Feb 2009 13:18] Philip Stoev
Description:
When executing a RQG workload, Falcon crashed as follows. The crash was observed only once out of hundreds of test runs.

#2  0x00000000006bd5bb in handle_segfault (sig=11) at mysqld.cc:2689
#3  <signal handler called>
#4  0x0000000000a1a344 in Record::getEncoding (this=0x2aaac0c4a5b8, index=1)
    at Record.cpp:781
#5  0x0000000000988282 in StorageTable::getEncoding (this=0x2aaaaadcf378,
    fieldIndex=1) at StorageTable.cpp:385
#6  0x000000000096d66a in StorageInterface::encodeRecord (this=0x46a6848,
    buf=0x46a7008 "О©ҐО©ҐО©Ґ0", ' ' <repeats 196 times>..., updateFlag=true)
    at ha_falcon.cpp:2976
#7  0x0000000000973022 in StorageInterface::update_row (this=0x46a6848,
    oldData=0x46a7578 "О©ҐО©ҐО©Ґ", ' ' <repeats 197 times>...,
    newData=0x46a7008 "О©ҐО©ҐО©Ґ0", ' ' <repeats 196 times>...) at ha_falcon.cpp:1216
#8  0x0000000000814968 in handler::ha_update_row (this=0x46a6848,
    old_data=0x46a7578 "О©ҐО©ҐО©Ґ", ' ' <repeats 197 times>...,
    new_data=0x46a7008 "О©ҐО©ҐО©Ґ0", ' ' <repeats 196 times>...) at handler.cc:5517
#9  0x00000000008232cd in ha_partition::update_row (this=0x3e79578,
    old_data=0x46a7578 "О©ҐО©ҐО©Ґ", ' ' <repeats 197 times>...,
    new_data=0x46a7008 "О©ҐО©ҐО©Ґ0", ' ' <repeats 196 times>...)
    at ha_partition.cc:3093
#10 0x0000000000814968 in handler::ha_update_row (this=0x3e79578,
    old_data=0x46a7578 "О©ҐО©ҐО©Ґ", ' ' <repeats 197 times>...,
    new_data=0x46a7008 "О©ҐО©ҐО©Ґ0", ' ' <repeats 196 times>...) at handler.cc:5517
#11 0x0000000000794076 in mysql_update (thd=0x4382110, table_list=0x4141458,
    fields=@0x43840e0, values=@0x43844e8, conds=0x4141e48, order_num=0,
    order=0x0, limit=18446744073709551592, handle_duplicates=DUP_ERROR,
    ignore=false) at sql_update.cc:651
#12 0x00000000006d11e3 in mysql_execute_command (thd=0x4382110)
    at sql_parse.cc:3014
#13 0x00000000006d6af9 in mysql_parse (thd=0x4382110,
    inBuf=0x4141198 "UPDATE `table10000_falcon_key_pk_parts_2_int_autoinc` SET `char_64_latin1_not_null_key` = 0 WHERE `int_unsigned_not_null_key` >= 'as'",
    length=133, found_semicolon=0x4dae4f20) at sql_parse.cc:5752
#14 0x00000000006d7c8b in dispatch_command (command=COM_QUERY, thd=0x4382110,
    packet=0x3f219a1 "UPDATE `table10000_falcon_key_pk_parts_2_int_autoinc` SET `char_64_latin1_not_null_key` = 0 WHERE `int_unsigned_not_null_key` >= 'as'  $
#15 0x00000000006d911e in do_command (thd=0x4382110) at sql_parse.cc:691
#16 0x00000000006c65da in handle_one_connection (arg=0x4382110)
    at sql_connect.cc:1146
#17 0x0000003587a062f7 in start_thread () from /lib64/libpthread.so.0
#18 0x0000003586ed1b6d in clone () from /lib64/libc.so.6

(gdb) list
776                                             vector[++highWater] = (USHORT) (q - (UCHAR*) data.record);
777                                             p = q;
778                                             }
779                                     }
780
781                             return (UCHAR*) data.record + vector[index]; <<<<<< HERE
782                             }
783
784                     default:
785                             NOT_YET_IMPLEMENTED;

(gdb) print data
$1 = {record = 0x0}
(gdb) print vector
$2 = (USHORT *) 0x0

How to repeat:
perl runall.pl   --rows=10000 --threads=32 --mysqld=--falcon-record-chill-threshold=1    --basedir=/export/home/pb2/test/sb_1-342525-1235379243.16/mysql-6.0.10-alpha-linux-x86_64-test   --mask=1755   --queries=10000000   --duration=900   --engine=Falcon   --grammar=conf/combinations.yy   --gendata=conf/combinations.zz   --reporters=Deadlock,ErrorLog,Backtrace,Recovery   --mysqld=--loose-falcon-lock-wait-timeout=1   --mysqld=--loose-innodb-lock-wait-timeout=1   --mysqld=--log-output=none   --mysqld=--skip-safemalloc
[6 Apr 2009 11:03] Philip Stoev
To reproduce:

$ perl runall.pl \
  --engine=Falcon \
  --reporters=ErrorLog,Backtrace \
  --mysqld=--loose-falcon-lock-wait-timeout=1 \
  --mysqld=--loose-innodb-lock-wait-timeout=1 \
  --mysqld=--log-output=none \
  --mysqld=--skip-safemalloc \
  --mysqld=--transaction-isolation=REPEATABLE-READ \
  --mysqld=--falcon-page-size=4K \
  --mem \
  --rows=1000 \
  --threads=4 \
  --mysqld=--falcon-record-chill-threshold=1 \
  --mask=41567 \
  --queries=100000000 \
  --duration=900 \
  --basedir=/build/bzr/6.0-falcon-team \
  --gendata=conf/combinations.zz \
  --grammar=conf/combinations.yy

This is a chill/thaw bug.
[8 Apr 2009 17:46] Kevin Lewis
This bug shows that a record pointed to by StorageTable::record suddenly has its data.record pointer set to 0x00.  So this is a Chill/Thaw bug.
[9 Apr 2009 15:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71801

3115 Kevin Lewis	2009-04-09
      Bug#44161 & Bug#43145 - These two bugs are two different things that can happen when a data buffer suddenly disapears after being chilled to the serial log while another thread is following that pointer.  To prevent this, data.record pointers within Record objects are now sent to purgatory for the CycleManager to release instead of being freed immediately.  
      
      Two new purgatory lists are added to the CycleManager since the data buffer can come in two forms; an array of Value objects and a char string.  Each object in purgatory is now called a zombie.  Record::deleteData is expanded to delete immediately when called from the Record destructor, or queueForDelete when called from anywhere else, such as SRLUpdateRecords::chill or a failed Table::update.
[15 Apr 2009 16:59] Bugs System
Pushed into 6.0.11-alpha (revid:hky@sun.com-20090415164923-9arx29ye5pzxd4zf) (version source revid:vvaintroub@mysql.com-20090409192125-djnm4marpv4osrcw) (merge vers: 6.0.11-alpha) (pib:6)
[15 May 2009 13:27] MC Brown
Internal/test fix only. No changelog entry required.