Bug #41035 Running iuds6.tst is crashing Falcon
Submitted: 25 Nov 2008 22:17 Modified: 15 May 2009 13:24
Reporter: Hakan Küçükyılmaz Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S1 (Critical)
Version:mysql-6.0-falcon OS:Linux
Assigned to: Kevin Lewis CPU Architecture:Any
Tags: F_RECORD TREE

[25 Nov 2008 22:17] Hakan Küçükyılmaz
Description:
Running iuds6.tst is crashing Falcon. This time on caneland.

How to repeat:
Use latest mysql-6.0-falcon bzr tree and run iuds6.tst and wait for ~3 hours.

(my_print_stacktrace+0x35)[0xa99915]
(handle_segfault+0x336)[0x6512f6]
/lib64/libpthread.so.0[0x2abe49bc1c00]
(Sync::lock(LockType)+0x42)[0x8544d2]
(Table::garbageCollect(Record*, Record*, Transaction*, bool)+0xbb)[0x85ab0b]
(Table::rollbackRecord(RecordVersion*, Transaction*)+0xa6)[0x8601b6]
(Transaction::rollback()+0x14f)[0x86949f]
(Connection::rollback()+0x1a)[0x88078a]
(StorageConnection::rollback()+0x51)[0x848b31]
(StorageInterface::rollback(handlerton*, THD*, bool)+0x3f)[0x843ecf]
(ha_rollback_trans(THD*, bool)+0xb9)[0x74d499]
(trans_rollback(THD*)+0x5b)[0x7e936b]
(mysql_execute_command(THD*)+0x23c9)[0x662b69]
(mysql_parse(THD*, char const*, unsigned int, char const**)+0x214)[0x666884]
(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x913)[0x6671a3]
(do_command(THD*)+0xc6)[0x667c86]
(handle_one_connection+0xf4)[0x659e64]
/lib64/libpthread.so.0[0x2abe49bba143]
/lib64/libc.so.6(__clone+0x6d)[0x2abe4a1ef8cd]
081126  0:11:51 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=56
max_threads=151
thread_count=53
connection_count=50
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 60613 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x2aaab00caf60
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x49f2e188 thread_stack 0x40000
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x24e7db8 = ROLLBACK
thd->thread_id=61
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file
[25 Nov 2008 22:26] Hakan Küçükyılmaz
Output of "thread apply all bt full"

Attachment: gdb.txt.gz (application/x-gzip, text), 43.67 KiB.

[16 Dec 2008 15:45] Sveta Smirnova
Thank you for the report.

I can not repeat described behavior with mysql-6.0-falcon-team tree built today on RedHat 64-bit machine. Please indicate which OS do you run?
[16 Dec 2008 15:49] Kevin Lewis
I think I can verify this based on the call stack.  

The call stack crashed in a call to Sync::lock() on a bad syncPrior pointer.  I think I can see how this could happen.

Table::garbageCollect is called during a Transaction::rollback() for a newly inserted record.  Table::garbageCollect will try to lock one of the syncPrior objects in the syncPriorVersions array.  It chooses this in Table::getSyncPrior() based on the recordNumber.  But there is no provision for a recordNumber of -1.

Newly inserted record numbers get assigned a record number of -1 when they are created in allocRecordVersion(), (assigned in the constructor).  The recordNumber is not added to the record object until Dbb::insertStub() is called.   

If something goes wrong with the insert after the record was created but before insertStub is called, the record will have a recordNumber of -1.  Table::getSyncPrior() will return a bad pointer if recordNumber is -1.

I'm still not certain how this record could have survived the insert to be garbage collected during the transaction::rollback with a recordNumber==-1, but I am pretty sure that fixing Table::getSyncPrior() to watch for negative recordNumbers will fix this bug.
[16 Dec 2008 16:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/61782

2936 Kevin Lewis	2008-12-16
      Bug#41035 - It is possible for record objects to contain a 
      recordNumber == -1.  Table::getSyncPrior() is not prepared 
      for this and will return a bad pointer.
[16 Dec 2008 20:36] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/61819

2936 Kevin Lewis	2008-12-16
      (no message)
[15 May 2009 13:24] MC Brown
Internal/test fix. No changelog entry required.