Bug #39694 Crash in StorageTable::setRecord during falcon_chill_thaw test
Submitted: 27 Sep 2008 14:53 Modified: 13 Dec 2008 9:03
Reporter: John Embretsen Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S1 (Critical)
Version:6.0-falcon-team (6.0.8a) OS:Any
Assigned to: Christopher Powers CPU Architecture:Any
Tags: F_CHILL THAW, pb2, pushbuild, test failure
Triage: Triaged: D1 (Critical)

[27 Sep 2008 14:53] John Embretsen
Description:
From falcon_chill_thaw test output ("Pushbuild 2", Random Query Generator) for revid john.embretsen@sun.com-20080927075218-hwawm8ctbebpgyrc (mysql-6.0-falcon-team branch):

# 15:21:18 Query: UPDATE C AS X SET `int_key` = 45 WHERE X . `int_nokey` < 157 failed: 1205 Lock wait timeout exceeded; try restarting transaction
# 15:21:19 Query: UPDATE A AS X SET `int_key` = 57 WHERE X . `pk` > 82 failed: 2013 Lost connection to MySQL server during query

From error log:
(...)
Table::fetchForUpdate: Update Conflict: TransId=2126, RecordNumber=0, Table TEST.C
Table::fetchForUpdate: Update Conflict: TransId=2122, RecordNumber=0, Table TEST.E
15: Record thaw (fetch): table 28, 75 records, 4098 bytes
15: Record thaw (fetch): table 28, 75 records, 4126 bytes
15: Record thaw (fetch): table 28, 76 records, 4126 bytes
15: Record thaw (fetch): table 28, 76 records, 4156 bytes
15: Record thaw (fetch): table 28, 73 records, 4112 bytes
15: Record thaw (fetch): table 28, 73 records, 4102 bytes
080927 16:21:19 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=11
max_threads=151
thread_count=11
connection_count=11
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 60039 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0xa5fa028
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0xa7e7dfa8 thread_stack 0x30c00
/export/home/pb2/test/sb-41711-1222521649.71/mysql-6.0.8-alpha-linux-i686-test/libexec/mysqld(my_print_stacktrace+0x26) [0x87e9197]
/export/home/pb2/test/sb-41711-1222521649.71/mysql-6.0.8-alpha-linux-i686-test/libexec/mysqld(handle_segfault+0x2cd) [0x82ae5a5]
[0x985420]
/export/home/pb2/test/sb-41711-1222521649.71/mysql-6.0.8-alpha-linux-i686-test/libexec/mysqld(StorageTable::setRecord(Record*, bool)+0x8b) [0x8539a59]
/export/home/pb2/test/sb-41711-1222521649.71/mysql-6.0.8-alpha-linux-i686-test/libexec/mysqld(StorageDatabase::nextRow(StorageTable*, int, bool)+0x135) [0x8534007]
/export/home/pb2/test/sb-41711-1222521649.71/mysql-6.0.8-alpha-linux-i686-test/libexec/mysqld(StorageTable::next(int, bool)+0x3a) [0x853a024]
/export/home/pb2/test/sb-41711-1222521649.71/mysql-6.0.8-alpha-linux-i686-test/libexec/mysqld(StorageInterface::rnd_next(unsigned char*)+0x92) [0x8525398]
/export/home/pb2/test/sb-41711-1222521649.71/mysql-6.0.8-alpha-linux-i686-test/libexec/mysqld(rr_sequential(READ_RECORD*)+0x71) [0x83d3207]
/export/home/pb2/test/sb-41711-1222521649.71/mysql-6.0.8-alpha-linux-i686-test/libexec/mysqld(mysql_update(THD*, TABLE_LIST*, List<Item>&, List<Item>&, Item*, unsigned int, st_order*, unsigned long long, enum_duplicates, bool)+0x18d7) [0x836a19f]
/export/home/pb2/test/sb-41711-1222521649.71/mysql-6.0.8-alpha-linux-i686-test/libexec/mysqld(mysql_execute_command(THD*)+0x2ba9) [0x82c1713]
/export/home/pb2/test/sb-41711-1222521649.71/mysql-6.0.8-alpha-linux-i686-test/libexec/mysqld(mysql_parse(THD*, char const*, unsigned int, char const**)+0x22f) [0x82c7565]
/export/home/pb2/test/sb-41711-1222521649.71/mysql-6.0.8-alpha-linux-i686-test/libexec/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x87c) [0x82c8520]
/export/home/pb2/test/sb-41711-1222521649.71/mysql-6.0.8-alpha-linux-i686-test/libexec/mysqld(do_command(THD*)+0x23b) [0x82c97eb]
/export/home/pb2/test/sb-41711-1222521649.71/mysql-6.0.8-alpha-linux-i686-test/libexec/mysqld(handle_one_connection+0x11d) [0x82b71ad]
/lib/libpthread.so.0 [0x89a45b]
/lib/libc.so.6(clone+0x5e) [0x7f1c4e]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0xa6c04bd8 is an invalid pointer
thd->thread_id=14
thd->killed=NOT_KILLED

How to repeat:
Branch or checkout the mysql-test-extra-6.0 bazaar tree and run:

$ cd mysql-test/gentest
$ ./runall.pl \
 --basedir=/path/to/mysql-6.0-falcon-team \
 --reporters=Deadlock,ErrorLog,Backtrace,Recovery,WinPackage \
 --mysqld=--loose-falcon-lock-wait-timeout=1 \
 --engine=Falcon \
 --grammar=conf/falcon_chill_thaw.yy \ 
 --queries=100000 \
 --mysqld=--falcon-index-chill-threshold=4K \ 
 --mysqld=--falcon-record-chill-threshold=4K \ 
 --duration=1200
[27 Sep 2008 15:01] John Embretsen
Attached full test output (log) from failing test case.

Attachment: falcon_online_alter_log_bug39694.txt (text/plain), 66.25 KiB.

[27 Sep 2008 15:17] John Embretsen
Test name in previous two comments (as well as log file name) should be "falcon_chill_thaw", not "falcon_online_alter". Sorry for the mistake.
[27 Sep 2008 15:40] John Embretsen
gdb stack trace indicates that the crash happened in Record::getEncodedRecord (line 743).

Program terminated with signal 11, Segmentation fault.
#0  0x00985402 in __kernel_vsyscall ()
#0  0x00985402 in __kernel_vsyscall ()
#1  0x0089f067 in pthread_kill () from /lib/libpthread.so.0
#2  0x087e9256 in my_write_core (sig=11) at stacktrace.c:307
#3  0x082ae7d2 in handle_segfault (sig=11) at mysqld.cc:2660
#4  <signal handler called>
#5  0x085bf79c in Record::getEncodedRecord (this=0xa6274dc0) at Record.cpp:743
#6  0x08539a59 in StorageTable::setRecord (this=0xa662ca78, 
    newRecord=0xa6274dc0, locked=true) at StorageTable.cpp:348
#7  0x08534007 in StorageDatabase::nextRow (this=0xb7067130, 
    storageTable=0xa662ca78, recordNumber=1027, lockForUpdate=true)
    at StorageDatabase.cpp:306
#8  0x0853a024 in StorageTable::next (this=0xa662ca78, recordNumber=1027, 
    lockForUpdate=true) at StorageTable.cpp:158
#9  0x08525398 in StorageInterface::rnd_next (this=0xa76ebc0, buf=0xa76ed88 "")
    at ha_falcon.cpp:613
#10 0x083d3207 in rr_sequential (info=0xa7e7cf7c) at records.cc:390
#11 0x0836a19f in mysql_update (thd=0xa5fa028, table_list=0xa6c04f00, 
    fields=@0xa5fb5e0, values=@0xa5fb808, conds=0xa6c055a8, order_num=0, 
    order=0x0, limit=18446744073709551344, handle_duplicates=DUP_ERROR, 
    ignore=false) at sql_update.cc:573
#12 0x082c1713 in mysql_execute_command (thd=0xa5fa028) at sql_parse.cc:2865
#13 0x082c7565 in mysql_parse (thd=0xa5fa028, 
    inBuf=0xa6c04bd8 "UPDATE E AS X SET `int_key` = 130 WHERE X . `int_nokey` < 207", length=61, found_semicolon=0xa7e7de80) at sql_parse.cc:5578
#14 0x082c8520 in dispatch_command (command=COM_QUERY, thd=0xa5fa028, 
    packet=0xa30b371 "UPDATE E AS X SET `int_key` = 130 WHERE X . `int_nokey` < 207", packet_length=61) at sql_parse.cc:1001
#15 0x082c97eb in do_command (thd=0xa5fa028) at sql_parse.cc:688
#16 0x082b71ad in handle_one_connection (arg=0xa5fa028) at sql_connect.cc:1153
#17 0x0089a45b in start_thread () from /lib/libpthread.so.0
#18 0x007f1c4e in clone () from /lib/libc.so.6
Using host libthread_db library "/lib/libthread_db.so.1".

This also occurs on Solaris 10 32-bit SPARC platform in Pushbuild 2.
[24 Oct 2008 5:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56961

2880 Christopher Powers	2008-10-24
      Bug#36631,"Assertion in SerialLogControl::nextRecord"
      Bug#38541,"Falcon RecordVersion::thaw assertion """"bytesRestored > 0 || data.record == NULL"""
      Bug#38567,"Falcon crash in  Record::getEncodedRecord at Record.cpp:743"
      Bug#38569,"Falcon assertion in SRLUpdateIndex::thaw : type == srlUpdateIndex"
      Bug#39694,"Crash in StorageTable::setRecord during falcon_chill_thaw test"
      Bug#39695,"Crash in SRLUpdateRecords::thaw during falcon_chill_thaw test"
      Bug#39696,"Assertion in Table.cpp (dup->state == recDeleted) fails during falcon_chill_thaw"
      
      Fix 1 of 3 for chill/thaw bugs.
      
      Modified Record::hasRecord() to thaw if necessary, removed redundant record thaws, added exception handling
[24 Oct 2008 5:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56962

2881 Christopher Powers	2008-10-24
      Bug#36631,"Assertion in SerialLogControl::nextRecord"
      Bug#38541,"Falcon RecordVersion::thaw assertion """"bytesRestored > 0 || data.record == NULL"""
      Bug#38567,"Falcon crash in  Record::getEncodedRecord at Record.cpp:743"
      Bug#38569,"Falcon assertion in SRLUpdateIndex::thaw : type == srlUpdateIndex"
      Bug#39694,"Crash in StorageTable::setRecord during falcon_chill_thaw test"
      Bug#39695,"Crash in SRLUpdateRecords::thaw during falcon_chill_thaw test"
      Bug#39696,"Assertion in Table.cpp (dup->state == recDeleted) fails during falcon_chill_thaw"
      
      Fix 2 of 3 for chill/thaw bugs.
      
      Synchronized record thaw operations using a pool of sync objects in the associated table.
      Record chill operations do not need to be synchronized because chills are performed in one place, and only by the active Transaction.
[26 Oct 2008 15:55] Kevin Lewis
Two patches approved to problems with concurent thaws.
1) enhance hasRecord to thaw when needed.
2) serialize thaws with a hash table of SyncOjects.
[28 Oct 2008 8:10] Bugs System
Pushed into 6.0.8-alpha  (revid:cpowers@mysql.com-20081024053538-6y1ma8lydl93oxza) (version source revid:alik@mysql.com-20081028074021-o49ijs3x4pxj1gbm) (pib:5)
[13 Dec 2008 9:03] MC Brown
No documentation needed (internal change only)