Bug #27277 | Falcon: hang after many insertions of the same values | ||
---|---|---|---|
Submitted: | 19 Mar 2007 23:28 | Modified: | 15 Jun 2007 8:30 |
Reporter: | Peter Gulutzan | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S1 (Critical) |
Version: | 5.2.4-falcon-alpha-debug | OS: | Linux (SUSE 10.0 / 64-bit) |
Assigned to: | Christopher Powers | CPU Architecture: | Any |
[19 Mar 2007 23:28]
Peter Gulutzan
[19 Mar 2007 23:36]
MySQL Verification Team
Thank you for the bug report. Verified as described on FC 6.0 32-bit.
[14 Apr 2007 2:19]
Hakan Küçükyılmaz
Now it crashes: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 1082665312 (LWP 5893)] Transaction::thaw (this=0x0, record=0x2aaaafa79498) at Transaction.cpp:392 392 SerialLogControl control(database->dbb->serialLog); (gdb) bt #0 Transaction::thaw (this=0x0, record=0x2aaaafa79498) at Transaction.cpp:392 #1 0x00000000007e716b in RecordVersion::thaw (this=0x2aaaafa79498) at RecordVersion.cpp:258 #2 0x00000000007e740b in RecordVersion::fetchVersion (this=0x0, trans=0x2aaaafa79498) at RecordVersion.cpp:95 #3 0x0000000000790683 in StorageDatabase::nextRow (this=<value optimized out>, storageTable=0x2aaaaafa8b88, recordNumber=25705) at StorageDatabase.cpp:253 #4 0x00000000007935e9 in StorageTable::next (this=0x2aaaaafa8b88, recordNumber=0, lockForUpdate=false) at StorageTable.cpp:122 #5 0x000000000078daeb in NfsStorageTable::rnd_next (this=0xecb6b8, buf=0xecb8e8 "�\200") at ha_falcon.cpp:467 #6 0x00000000006dfbf9 in rr_sequential (info=0x4087fc90) at records.cc:363 #7 0x00000000006929ec in mysql_update (thd=0xe4d028, table_list=0xeac498, fields=@0xeb20a8, values=@0xeb2458, conds=0x0, order_num=0, order=0x0, limit=<value optimized out>, handle_duplicates=DUP_ERROR, ignore=false) at sql_update.cc:466 #8 0x0000000000617780 in mysql_execute_command (thd=0xe4d028) at sql_parse.cc:2536 #9 0x000000000075c253 in sp_instr_stmt::exec_core (this=0xeacd48, thd=0x2aaaafa79498, nextp=0x0) at sp_head.cc:2609 #10 0x000000000075c44e in sp_lex_keeper::reset_lex_and_exec_core (this=0xeacd88, thd=0xe4d028, nextp=0x40880d04, open_tables=false, instr=0xeacd48) at sp_head.cc:2455 #11 0x0000000000762499 in sp_instr_stmt::execute (this=0xeacd48, thd=0xe4d028, nextp=0x40880d04) at sp_head.cc:2560 #12 0x000000000075fda0 in sp_head::execute (this=0xe872c8, thd=0xe4d028) at sp_head.cc:1077 #13 0x00000000007606ef in sp_head::execute_procedure (this=0xe872c8, thd=0xe4d028, args=0xe4ee68) at sp_head.cc:1726 #14 0x0000000000619eb2 in mysql_execute_command (thd=0xe4d028) at sql_parse.cc:3807 #15 0x000000000061c02f in mysql_parse (thd=0xe4d028, inBuf=0xe7bef8 "call p20()", length=<value optimized out>) at sql_parse.cc:5274 #16 0x000000000061cc0b in dispatch_command (command=COM_QUERY, thd=0xe4d028, packet=<value optimized out>, packet_length=11) at sql_parse.cc:896 #17 0x000000000061dafc in do_command (thd=0xe4d028) at sql_parse.cc:662 #18 0x000000000060f77e in handle_one_connection (arg=<value optimized out>) at sql_connect.cc:1089 #19 0x00002b9ed679df1a in start_thread () from /lib/libpthread.so.0 #20 0x00002b9ed6e52602 in clone () from /lib/libc.so.6 #21 0x0000000000000000 in ?? () Regards, Hakan
[14 Apr 2007 2:33]
Hakan Küçükyılmaz
I hit the crash when running in isolation READ-COMMITTED
[16 Apr 2007 15:03]
Kevin Lewis
Chris, assigning this to you. This crash is in RecordVersion::thaw because the transaction pointer is null. This would be the transaction for the chilled RecordVersion, not the transaction that is doing the StorageDatabase::nextRow. I think that our solution for thawing from the data page after writeComplete only works when the isolation level is RepeatableRead. This isolation level is ReadCommitted. This does not work in RecordVersion::fetchVersion() if (trans->isolationLevel == TRANSACTION_READ_COMMITTED) { if (!transaction || transaction->state == Committed || transaction == trans) { if (state == recChilled) thaw(); If the transaction is null and the record is chilled, we can't thaw it through the transaction. We need to figure out if this scenario is OK, and if so, thaw the record from the current data page without using the transaction of the RecordVersion, which is long gone.
[20 Apr 2007 0:46]
Christopher Powers
The fatal exception is a secondary issue caused by RecordVersion::thaw() attempting to restore record data from the Serial Log without ensuring that the associated transaction is valid. This was corrected by the fix for bug #27951 "Falcon crashes with insert into..select from". The primary issue is unrelated to the chill/thaw mechanism. The Scavenger thread holds an exclusive table lock (issued by Table::retireRecords) that blocks other threads to the point of stalling, such as a Table::fetch() emanating from a SELECT COUNT(*). In this case, the Scavenger spends significant time in Index::garbageCollect(). Perhaps the exclusive lock on the table can be held intermittently, thus allowing access by other threads. One idea might be for retireRecords() to hold the exclusive lock for each record group or for each bin of the hash table rather than for all records.
[8 Jun 2007 1:47]
Christopher Powers
No longer able to reproduce the failure--probably fixed by recent changes to record scavenging. Will try to identify applicable code changes.
[8 Jun 2007 8:02]
Hakan Küçükyılmaz
Can't reproduce with latest Falcon using isolation level of READ-COMMITTED or REPEATABLE-READ. +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (1 min 25.62 sec) +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (1 min 25.89 sec) +------+ | drop | +------+ | drop | +------+ 1 row in set (1 min 25.89 sec) Query OK, 0 rows affected (1 min 26.41 sec) Best regards, Hakan
[8 Jun 2007 14:53]
Peter Gulutzan
I confirm that I can no longer reproduce the behaviour.
[15 Jun 2007 8:30]
MC Brown
A note has been added to the 6.0.1 changelog.