Bug #28174 | Falcon concurrent BLOB updates leads to crash | ||
---|---|---|---|
Submitted: | 30 Apr 2007 21:21 | Modified: | 26 Jul 2007 9:22 |
Reporter: | Hakan Küçükyılmaz | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S1 (Critical) |
Version: | OS: | Any | |
Assigned to: | Jim Starkey | CPU Architecture: | Any |
[30 Apr 2007 21:21]
Hakan Küçükyılmaz
[30 Apr 2007 21:24]
Hakan Küçükyılmaz
C program for concurrent blob update with Falcon
Attachment: falcon_concurrent_blob_updates.tar.gz (application/x-gzip, text), 3.62 KiB.
[1 May 2007 0:19]
MySQL Verification Team
Hakan, With latest bk source and fresh tables: [miguel@light 5.1f]$ ./falcon_concurrent_blob_updates -r600 -s4 Running initializations ... Pre-generating 16777216 bytes of random data. About to spawn 2 threads 1 2 Completed spawning new database worker threads. Runtime is 600. Seed value is 4. Isolation level is READ-COMMITTED. Testcase is now running, so watch for error output. Waiting for worker threads to finish ... [miguel@light 5.1f]$ With a lot of the below errors messages: 070430 21:03:42 [ERROR] /home/miguel/dbs/5.1f/libexec/mysqld: Record has changed since last read in table 't1' 070430 21:03:42 [ERROR] /home/miguel/dbs/5.1f/libexec/mysqld: Sort aborted 070430 21:03:53 [ERROR] /home/miguel/dbs/5.1f/libexec/mysqld: Record has changed since last read in table 't1' 070430 21:03:53 [ERROR] /home/miguel/dbs/5.1f/libexec/mysqld: Sort aborted 070430 21:03:55 [ERROR] /home/miguel/dbs/5.1f/libexec/mysqld: Record has changed since last read in table 't1' <CUT> 070430 21:13:04 [ERROR] /home/miguel/dbs/5.1f/libexec/mysqld: Sort aborted 070430 21:13:07 [ERROR] /home/miguel/dbs/5.1f/libexec/mysqld: Record has changed since last read in table 't1' 070430 21:13:07 [ERROR] /home/miguel/dbs/5.1f/libexec/mysqld: Sort aborted 070430 21:13:27 [ERROR] /home/miguel/dbs/5.1f/libexec/mysqld: Record has changed since last read in table 't1' 070430 21:13:27 [ERROR] /home/miguel/dbs/5.1f/libexec/mysqld: Sort aborted [Thread -1441453168 (LWP 1062) exited] [Thread -1309070448 (LWP 1065) exited] Please ret-test with complete new tables.
[1 May 2007 3:56]
Hakan Küçükyılmaz
Miguel, I figuread out how it failed. I was going to start the test in a loop. It fails in it's second iteration: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 1082665312 (LWP 11817)] 0x00000000009321f0 in Cache::findBuffer (this=0x2aaaab00cd50, dbb=0x2aaaab04ab00, pageNumber=11810, lockType=Exclusive, sync=0x4087fa40) at Cache.cpp:334 334 for (candidate = bdb; candidate->higher; candidate = candidate->higher->higher) (gdb) p candidate->higher $1 = (class PagePrecedence *) 0x0 To reproduce: SEED=1 while ( [ $? -eq 0 ] ) do ./falcon_concurrent_blob_updates -r660 -s$SEED SEED=$(($SEED + 3)) done
[1 May 2007 12:33]
MySQL Verification Team
Thank you for the feedback.
[16 Jun 2007 12:20]
Hakan Küçükyılmaz
Crashes or hits assertion. Start with fresh installation (no Falcon databases at all). Run one after other without restarting the server ./falcon_concurrent_blob_updates -r660 -s1 ./falcon_concurrent_blob_updates -r660 -s1 ./falcon_concurrent_blob_updates -r660 -s1 I got a crash on my first run. Backtrace is Program received signal SIGILL, Illegal instruction. [Switching to LWP 484] 0x00002b3db3b43fed in ?? () (gdb) bt #0 0x00002b3db3b43fed in ?? () #1 0x00000000007d2622 in Error::debugBreak () at Error.cpp:92 #2 0x00000000007d271d in Error::error (string=<value optimized out>) at Error.cpp:69 #3 0x00000000007d277c in Error::assertionFailed (fileName=0x4 <Address 0x4 out of bounds>, line=484) at Error.cpp:76 #4 0x00000000007f5057 in RecordVersion::thaw (this=0x2b3db5571ad0) at RecordVersion.cpp:300 #5 0x00000000007f52c3 in RecordVersion::fetchVersion (this=0x2b3db5571ad0, trans=0x2b3db53b7f58) at RecordVersion.cpp:106 #6 0x000000000079eef2 in StorageDatabase::nextRow (this=<value optimized out>, storageTable=0x2b3db528e610, recordNumber=6694, lockForUpdate=false) at StorageDatabase.cpp:236 #7 0x00000000007a2821 in StorageTable::next (this=0x0, recordNumber=4, lockForUpdate=255) at StorageTable.cpp:126 #8 0x000000000079bd82 in StorageInterface::rnd_next (this=0xfaeef8, buf=0xfaf130 "����\005") at ha_falcon.cpp:470 #9 0x00000000006e8f93 in rr_sequential (info=0xfc8058) at records.cc:363 #10 0x000000000066731c in join_init_read_record (tab=0xfc7fd8) at sql_select.cc:11480 #11 0x000000000067610b in sub_select (join=0x2aaab00119a8, join_tab=0x1e4, end_of_records=4) at sql_select.cc:10830 #12 0x000000000067cf09 in do_select (join=0x2aaab00119a8, fields=0xf4b120, table=0x0, procedure=0x0) at sql_select.cc:10594 #13 0x0000000000685f56 in JOIN::exec (this=0x2aaab00119a8) at sql_select.cc:2099 #14 0x0000000000686b92 in mysql_select (thd=0xf49618, rref_pointer_array=0xf4b200, tables=0xfc71c0, wild_num=0, fields=<value optimized out>, conds=0xfc7660, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2148813312, result=0xfc7898, unit=0xf4ac50, select_lex=0xf4b018) at sql_select.cc:2275 #15 0x0000000000687094 in handle_select (thd=0xf49618, lex=0xf4abb0, result=0xfc7898, setup_tables_done_option=0) at sql_select.cc:258 On an other run I got an assertion on the second run: ./falcon_concurrent_blob_updates -r660 -s1 ./falcon_concurrent_blob_updates -r660 -s1 Exception: assertion failed at line 560 in file Section.cpp Exception during cleanupRecords of table TEST.T1: assertion failed at line 560 in file Section.cpp Transactions: 3103 committed, 128 rolled back, 0 active, 2 post-commit Activity on falcon_master.fts: 4 fetches, 4 reads, 13 writes Activity on /data/mysql-sap/falcon_user.fts: 218923 fetches, 1239 reads, 23576 writes SerialLog: 0 reads, 1741 writes, 742 transactions, 741 blocks, 2 windows Stalled threads Thread 2ad095010690 (1116502336) sleep=0, grant=0, locks=1, who 0, javaThread 0 prior=cccccccccccccccc, next=cccccccccccccccc, parent=0 Stalled synchronization objects: ------------------------------------ Stalled threads Thread 2ad094fc0088 (1074272576) sleep=0, grant=0, locks=1, who 0, javaThread 0 prior=cccccccccccccccc, next=cccccccccccccccc, parent=0 Pending Section::insertStub state 0 (1) syncObject 0x2ad095010350 Thread 2ad095010690 (1116502336) sleep=1, grant=0, locks=1, who 0, javaThread 0 prior=cccccccccccccccc, next=cccccccccccccccc, parent=0 Stalled synchronization objects: SyncObject 2ad095010350: state -1, monitor 0, waiters 1 Exclusive thread 95010690 (1116502336), type 1; Waiting thread 94fc0088 (1074272576), type 1; Section::insertStub ------------------------------------
[26 Jun 2007 19:01]
Ann Harrison
This looks a lot like 29081 - a crash after failing to find a buffer. Does it still happen?
[26 Jun 2007 19:47]
Hakan Küçükyılmaz
First run of ./falcon_concurrent_blob_updates -r660 -s1 succeeds now without crash but consequent run is not possible due to hang: Exception: assertion failed at line 560 in file Section.cpp Exception during cleanupRecords of table TEST.T1: assertion failed at line 560 in file Section.cpp Transactions: 1198 committed, 48 rolled back, 2 active, 1 post-commit SerialLog: 0 reads, 648 writes, 171 transactions, 612 completed, 171 blocks, 1 windows Activity on falcon_master.fts: 4 fetches, 4 reads, 0 writes Activity on /data/mysql-sap/falcon_user.fts: 60490 fetches, 405 reads, 7386 writes Stalled threads Thread 2b1b63dddc60 (1116502336) sleep=0, grant=0, locks=1, who 0, javaThread 0 prior=cccccccccccccccc, next=cccccccccccccccc, parent=0 Stalled synchronization objects: ------------------------------------ Stalled threads Thread 2b1b63d68088 (1074272576) sleep=0, grant=0, locks=1, who 0, javaThread 0 prior=cccccccccccccccc, next=cccccccccccccccc, parent=0 Stalled synchronization objects: ------------------------------------ Transactions: 0 committed, 0 rolled back, 2 active, 1 post-commit SerialLog: 0 reads, 1 writes, 2 transactions, 0 completed, 3 blocks, 1 windows Activity on falcon_master.fts: 0 fetches, 0 reads, 2 writes Activity on /data/mysql-sap/falcon_user.fts: 6 fetches, 0 reads, 24 writes Transactions: 0 committed, 0 rolled back, 2 active, 1 post-commit SerialLog: 0 reads, 1 writes, 1 transactions, 0 completed, 3 blocks, 1 windows Transactions: 0 committed, 0 rolled back, 2 active, 1 post-commit SerialLog: 0 reads, 1 writes, 0 transactions, 0 completed, 0 blocks, 1 windows Transactions: 0 committed, 0 rolled back, 2 active, 1 post-commit Transactions: 0 committed, 0 rolled back, 2 active, 1 post-commit Transactions: 0 committed, 0 rolled back, 2 active, 1 post-commit Transactions: 0 committed, 0 rolled back, 2 active, 1 post-commit
[26 Jun 2007 20:16]
Hakan Küçükyılmaz
Now tried with default settings of falcon_concurrent_blog_updates. 1) It succeeds, 2) it hangs, 3) or it crashes. Backtrace of a crash: akan@lu0011:~/work/mysql/falcon/falcon_concurrent_blob_updates$ ./falcon_concurrent_blob_updates Running initializations ... Pre-generating 16777216 bytes of random data. About to spawn 2 threads 1 2 Completed spawning new database worker threads. Runtime is 500. Seed value is 303. Isolation level is READ-COMMITTED. Testcase is now running, so watch for error output. Program received signal SIGSEGV, Segmentation fault. [Switching to LWP 19809] 0x00000000007a8515 in Table::expungeRecordVersions (this=0x2b6a0b692660, record=0x1f9bcd8) at Table.cpp:1592 1592 for (Record *rec = prior; rec; rec = rec->getPriorVersion()) (gdb) bt #0 0x00000000007a8515 in Table::expungeRecordVersions (this=0x2b6a0b692660, record=0x1f9bcd8) at Table.cpp:1592 #1 0x00000000007f66d5 in RecordVersion::scavenge (this=0x1f9bcd8, oldestActive=<value optimized out>) at RecordVersion.cpp:150 #2 0x00000000007f44c3 in RecordLeaf::retireRecords (this=<value optimized out>, table=<value optimized out>, base=<value optimized out>, age=-1, oldestActiveTransaction=26014) at RecordLeaf.cpp:126 #3 0x000000000083b581 in RecordGroup::retireRecords (this=<value optimized out>, table=0x2b6a0b692660, base=<value optimized out>, age=-1, oldestActiveTransaction=26014) at RecordGroup.cpp:122 #4 0x000000000083b581 in RecordGroup::retireRecords (this=<value optimized out>, table=0x2b6a0b692660, base=<value optimized out>, age=-1, oldestActiveTransaction=26014) at RecordGroup.cpp:122 #5 0x00000000007a7573 in Table::retireRecords (this=0x2b6a0b692660, age=-1, oldestActiveTransaction=26014) at Table.cpp:1515 #6 0x00000000007a75ec in Table::cleanupRecords (this=0x1fa41d8, threshold=33144024, oldestActiveTransaction=33144024) at Table.cpp:2299 #7 0x00000000007c52f2 in Database::cleanupRecords (this=0x2b6a0b443b98, threshold=-1, oldestActiveTransaction=26014) at Database.cpp:2062 #8 0x00000000007c580f in Database::retireRecords (this=0x2b6a0b443b98) at Database.cpp:1749 #9 0x00000000007c5910 in Database::scavenge (this=0x2b6a0b443b98) at Database.cpp:1651 #10 0x00000000007ff4b3 in Scavenger::scavenge (this=<value optimized out>) at Scavenger.cpp:58 #11 0x00000000007ff4ca in Scavenger::execute (this=0x1fa41d8, scheduler=0x1f9bcd8) at Scavenger.cpp:68 #12 0x000000000080034a in Scheduler::schedule (this=0x2b6a0b64e8a8) at Scheduler.cpp:136 #13 0x0000000000800443 in Scheduler::schedule (lpParameter=0x1fa41d8) at Scheduler.cpp:158 #14 0x00000000007ae54e in Thread::thread (this=0x2b6a0b68b6c8) at Thread.cpp:159 #15 0x00000000007ae6bf in Thread::thread (parameter=0x1fa41d8) at Thread.cpp:138
[27 Jun 2007 17:24]
Jim Starkey
Problem turned out to be a short window in which a SerialLogWindow had an incorrect "currentLength", preventing a successful record "thaw" operation.
[28 Jun 2007 15:53]
Hakan Küçükyılmaz
Running falcon_concurrent_blob_updates succeeds in most cases and sometimes it hangs due to a deadlock. In one of twenty runs I had a crash: ./falcon_concurrent_blob_updates -n5 -r540 Program received signal SIGSEGV, Segmentation fault. [Switching to LWP 9224] 0x00000000007a8515 in Table::expungeRecordVersions (this=0x2b0a8a9d7870, record=0x2aaab113dab8) at Table.cpp:1592 1592 for (Record *rec = prior; rec; rec = rec->getPriorVersion()) (gdb) bt #0 0x00000000007a8515 in Table::expungeRecordVersions (this=0x2b0a8a9d7870, record=0x2aaab113dab8) at Table.cpp:1592 #1 0x00000000007f6725 in RecordVersion::scavenge (this=0x2aaab113dab8, oldestActive=<value optimized out>) at RecordVersion.cpp:150 #2 0x00000000007f44c3 in RecordLeaf::retireRecords (this=<value optimized out>, table=<value optimized out>, base=<value optimized out>, age=-1, oldestActiveTransaction=43309) at RecordLeaf.cpp:126 #3 0x000000000083b641 in RecordGroup::retireRecords (this=<value optimized out>, table=0x2b0a8a9d7870, base=<value optimized out>, age=-1, oldestActiveTransaction=43309) at RecordGroup.cpp:122 #4 0x000000000083b641 in RecordGroup::retireRecords (this=<value optimized out>, table=0x2b0a8a9d7870, base=<value optimized out>, age=-1, oldestActiveTransaction=43309) at RecordGroup.cpp:122 #5 0x00000000007a7573 in Table::retireRecords (this=0x2b0a8a9d7870, age=-1, oldestActiveTransaction=43309) at Table.cpp:1515 #6 0x00000000007a75ec in Table::cleanupRecords (this=0x2aaab113bc58, threshold=-1324098888, oldestActiveTransaction=2970868408) at Table.cpp:2299 #7 0x00000000007c52f2 in Database::cleanupRecords (this=0x2b0a8a787b98, threshold=-1, oldestActiveTransaction=43309) at Database.cpp:2062 #8 0x00000000007c580f in Database::retireRecords (this=0x2b0a8a787b98) at Database.cpp:1749 #9 0x00000000007c5910 in Database::scavenge (this=0x2b0a8a787b98) at Database.cpp:1651 #10 0x00000000007ff503 in Scavenger::scavenge (this=<value optimized out>) at Scavenger.cpp:58 #11 0x00000000007ff51a in Scavenger::execute (this=0x2aaab113bc58, scheduler=0x2aaab113dab8) at Scavenger.cpp:68 #12 0x000000000080039a in Scheduler::schedule (this=0x2b0a8a9928a8) at Scheduler.cpp:136 #13 0x0000000000800493 in Scheduler::schedule (lpParameter=0x2aaab113bc58) at Scheduler.cpp:158 #14 0x00000000007ae54e in Thread::thread (this=0x2b0a8a9cf6c8) at Thread.cpp:159 #15 0x00000000007ae6bf in Thread::thread (parameter=0x2aaab113bc58) at Thread.cpp:138
[6 Jul 2007 13:39]
Hakan Küçükyılmaz
Hits assertion in 5 out of 7 runs of ./falcon_concurrent_blob_updates -r540 -s963 I get two different assertions. 0x00002b774aed4fed in raise () from /lib/libpthread.so.0 (gdb) bt #0 0x00002b774aed4fed in raise () from /lib/libpthread.so.0 #1 0x00000000007fc7da in Error::debugBreak () at Error.cpp:92 #2 0x00000000007fc8d0 in Error::error (string=<value optimized out>) at Error.cpp:69 #3 0x00000000007fc930 in Error::assertionFailed (fileName=0x4 <Address 0x4 out of bounds>, line=20047) at Error.cpp:76 #4 0x000000000082bb1c in Section::deleteLine (this=0x2b774c788900, bdb=0x2b774c590e98, line=<value optimized out>, sectionPageNumber=7, transId=0, locatorPage=0x2aaaaac9e000, locatorLine=98) at Section.cpp:561 #5 0x000000000082cf19 in Section::updateRecord (this=0x2b774c788900, recordNumber=<value optimized out>, stream=0x0, transId=0, earlyWrite=false) at Section.cpp:483 #6 0x00000000007f4e19 in Dbb::updateRecord (this=0x2b774c784620, section=0x4e44, recordId=20047, stream=0x4, transaction=<value optimized out>, earlyWrite=false) at Dbb.cpp:325 and Program received signal SIGILL, Illegal instruction. [Switching to Thread 1116236096 (LWP 20015)] 0x00002ab9d089bfed in raise () from /lib/libpthread.so.0 (gdb) bt #0 0x00002ab9d089bfed in raise () from /lib/libpthread.so.0 #1 0x00000000007fc7da in Error::debugBreak () at Error.cpp:92 #2 0x00000000007fc8d0 in Error::error (string=<value optimized out>) at Error.cpp:69 #3 0x00000000007fc930 in Error::assertionFailed (fileName=0x4 <Address 0x4 out of bounds>, line=20015) at Error.cpp:76 #4 0x000000000082bb1c in Section::deleteLine (this=0x2ab9d214f900, bdb=0x2ab9d1f44b80, line=<value optimized out>, sectionPageNumber=7, transId=0, locatorPage=0x2aaaaabeb000, locatorLine=159) at Section.cpp:561 #5 0x000000000082cf19 in Section::updateRecord (this=0x2ab9d214f900, recordNumber=<value optimized out>, stream=0x0, transId=0, earlyWrite=false) at Section.cpp:483 #6 0x00000000007f4e19 in Dbb::updateRecord (this=0x2ab9d214b5e0, section=0x4e24, recordId=20015, stream=0x4, transaction=<value optimized out>, earlyWrite=false) at Dbb.cpp:325 #7 0x00000000007d0f21 in Table::expungeBlob (this=0x2ab9d2151268, blob=<value optimized out>) at Table.cpp:1717
[7 Jul 2007 14:55]
Hakan Küçükyılmaz
Not fixed. 7 out of 10 runs failed. Hitting always same assertion. Please note that we see same assertion as in Bug#29613. (gdb) f 4 #4 0x000000000082bbec in Section::deleteLine (this=0x2ba197508828, bdb=0x2ba19731d248, line=<value optimized out>, sectionPageNumber=7, transId=0, locatorPage=0x2aaaaaddc000, locatorLine=366) at Section.cpp:561 561 ASSERT(false); (gdb) l 556 return spaceAvailable; 557 } 558 559 for (int n = 0; n <= locatorPage->maxLine; ++n) 560 if (n != locatorLine && locatorPage->elements[n].page == bdb->pageNumber) 561 ASSERT(false); 562 563 if (!dbb->serialLog->recovering) 564 dbb->serialLog->logControl->sectionLine.append(dbb, sectionPageNumber, bdb->pageNumber); 565 (gdb) p n $1 = -1 (gdb) p locatorLine $2 = 366 Program received signal SIGILL, Illegal instruction. [Switching to Thread 1116236096 (LWP 17336)] 0x00002ba195c54fed in raise () from /lib/libpthread.so.0 (gdb) bt #0 0x00002ba195c54fed in raise () from /lib/libpthread.so.0 #1 0x00000000007fc85a in Error::debugBreak () at Error.cpp:92 #2 0x00000000007fc950 in Error::error (string=<value optimized out>) at Error.cpp:69 #3 0x00000000007fc9b0 in Error::assertionFailed (fileName=0x4 <Address 0x4 out of bounds>, line=17336) at Error.cpp:76 #4 0x000000000082bbec in Section::deleteLine (this=0x2ba197508828, bdb=0x2ba19731d248, line=<value optimized out>, sectionPageNumber=7, transId=0, locatorPage=0x2aaaaaddc000, locatorLine=366) at Section.cpp:561 #5 0x000000000082cfe9 in Section::updateRecord (this=0x2ba197508828, recordNumber=<value optimized out>, stream=0x0, transId=0, earlyWrite=false) at Section.cpp:483 #6 0x00000000007f4e99 in Dbb::updateRecord (this=0x2ba197504800, section=0x43ad, recordId=17336, stream=0x4, transaction=<value optimized out>, earlyWrite=false) at Dbb.cpp:325 #7 0x00000000007d0f51 in Table::expungeBlob (this=0x2ba19750a190, blob=<value optimized out>) at Table.cpp:1717 #8 0x00000000007d10c2 in Table::garbageCollect (this=0x2ba19750a190, leaving=0x160de38, staying=0x2ba1975efce0, transaction=<value optimized out>, quiet=<value optimized out>) at Table.cpp:1752 #9 0x00000000007d1159 in Table::expungeRecordVersions (this=0x2ba19750a190, record=0x2ba1975efce0) at Table.cpp:1600 #10 0x000000000081f585 in RecordVersion::scavenge (this=0x4, oldestActive=<value optimized out>) at RecordVersion.cpp:150 #11 0x000000000081d293 in RecordLeaf::retireRecords (this=<value optimized out>, table=<value optimized out>, base=<value optimized out>, age=-1, oldestActiveTransaction=36938) at RecordLeaf.cpp:126 #12 0x0000000000864a91 in RecordGroup::retireRecords (this=<value optimized out>, table=0x2ba19750a190, base=<value optimized out>, age=-1, oldestActiveTransaction=36938) at RecordGroup.cpp:122 #13 0x0000000000864a91 in RecordGroup::retireRecords (this=<value optimized out>, table=0x2ba19750a190, base=<value optimized out>, age=-1, oldestActiveTransaction=36938) at RecordGroup.cpp:122 #14 0x00000000007d0193 in Table::retireRecords (this=0x2ba19750a190, age=-1, oldestActiveTransaction=36938) at Table.cpp:1519
[7 Jul 2007 15:07]
Hakan Küçükyılmaz
Please note that the failure is happening with ./falcon_concurrent_blob_updates -r540 -s963
[9 Jul 2007 9:20]
Hakan Küçükyılmaz
Reproduced it on fluffy. SEED=1 while ( [ $? -eq 0 ] ); do ./falcon_concurrent_blob_updates -r540 -s$SEED; SEED=$(($SEED + 3)); done ... ... ... Completed spawning new database worker threads. Runtime is 540. Seed value is 16. Isolation level is READ-COMMITTED. Testcase is now running, so watch for error output. Waiting for worker threads to finish ... (gdb) f 4 #4 0x00000000008ef475 in ~Record (this=0x2aab1c873000) at Record.cpp:129 129 ASSERT(!active); (gdb) p active $1 = 1 '\001' Backtrace: Program received signal SIGILL, Illegal instruction. [Switching to Thread 1085045088 (LWP 10547)] 0x00002aaaab16643a in raise () from /lib64/tls/libc.so.6 (gdb) bt #0 0x00002aaaab16643a in raise () from /lib64/tls/libc.so.6 #1 0x00000000008b7338 in Error::debugBreak () at Error.cpp:92 #2 0x00000000008b744c in Error::error (string=0xc45508 "assertion failed at line %d in file %s\n") at Error.cpp:69 #3 0x00000000008b74e3 in Error::assertionFailed (fileName=0xc4a7f0 "Record.cpp", line=129) at Error.cpp:76 #4 0x00000000008ef475 in ~Record (this=0x2aab1c873000) at Record.cpp:129 #5 0x00000000008f08cf in ~RecordVersion (this=0x2aab1c873000) at RecordVersion.cpp:65 #6 0x00000000008ee7f6 in Record::release (this=0x2aab1c873000) at Record.cpp:517 #7 0x0000000000880198 in Transaction::commitRecords (this=0x2aaaabb9a7a0) at Transaction.cpp:697 #8 0x0000000000883369 in Transaction::commit (this=0x2aaaabb9a7a0) at Transaction.cpp:288 #9 0x0000000000899d83 in Connection::commit (this=0x2aaafed92fc8) at Connection.cpp:269 #10 0x00000000008623db in StorageConnection::commit (this=0x2aaafec88850) at StorageConnection.cpp:153 #11 0x000000000086683f in StorageHandler::commit (this=0x2aaaab8d6048, mySqlThread=0x1094278) at StorageHandler.cpp:211 #12 0x000000000085ab29 in StorageInterface::commit (hton=0x102d878, thd=0x1094278, all=true) at ha_falcon.cpp:966 #13 0x000000000078bcff in ha_commit_one_phase (thd=0x1094278, all=true) at handler.cc:779 #14 0x000000000078c414 in ha_commit_trans (thd=0x1094278, all=true) at handler.cc:749 #15 0x0000000000677baf in end_trans (thd=0x1094278, completion=COMMIT) at sql_parse.cc:555 #16 0x0000000000684084 in mysql_execute_command (thd=0x1094278) at sql_parse.cc:3557 #17 0x0000000000686e07 in mysql_parse (thd=0x1094278, inBuf=0x2aaaf7502a08 "COMMIT", length=6, found_semicolon=0x40ac6fa0) at sql_parse.cc:5381 #18 0x0000000000687a09 in dispatch_command (command=COM_QUERY, thd=0x1094278, packet=0x2aaaf7507ad9 "COMMIT", packet_length=7) at sql_parse.cc:907 #19 0x0000000000688d8e in do_command (thd=0x1094278) at sql_parse.cc:669 #20 0x000000000067711e in handle_one_connection (arg=0x109e058) at sql_connect.cc:1091 #21 0x00002aaaab46cfa5 in start_thread () from /lib64/tls/libpthread.so.0 #22 0x00002aaaab1f4cb2 in clone () from /lib64/tls/libc.so.6
[13 Jul 2007 9:19]
Hakan Küçükyılmaz
Still hitting assertion Program received signal SIGILL, Illegal instruction. [Switching to Thread 1116236096 (LWP 14686)] 0x00002ae972d77fed in raise () from /lib/libpthread.so.0 (gdb) bt #0 0x00002ae972d77fed in raise () from /lib/libpthread.so.0 #1 0x00000000008e55b4 in Error::debugBreak () at Error.cpp:92 #2 0x00000000008e56c8 in Error::error (string=0xc73288 "assertion failed at line %d in file %s\n") at Error.cpp:69 #3 0x00000000008e575f in Error::assertionFailed (fileName=0xc7aa8e "Section.cpp", line=561) at Error.cpp:76 #4 0x0000000000933568 in Section::deleteLine (this=0x2ae97462b990, bdb=0x2ae97440c220, line=14, sectionPageNumber=7, transId=0, locatorPage=0x2aaaaac02000, locatorLine=167) at Section.cpp:561 #5 0x0000000000935519 in Section::updateRecord (this=0x2ae97462b990, recordNumber=167, stream=0x0, transId=0, earlyWrite=true) at Section.cpp:486 #6 0x00000000008db189 in Dbb::updateRecord (this=0x2ae974627658, section=0x2ae97462b990, recordId=167, stream=0x0, transaction=0x0, earlyWrite=true) at Dbb.cpp:325 #7 0x00000000008a16f2 in Table::expungeBlob (this=0x2ae97462d2f8, blob=0x42885b70) at Table.cpp:1723 #8 0x00000000008a719a in Table::garbageCollect (this=0x2ae97462d2f8, leaving=0x2aaab0512ad8, staying=0x2aaab0512da8, transaction=0x0, quiet=false) at Table.cpp:1758 #9 0x00000000008a7283 in Table::expungeRecordVersions (this=0x2ae97462d2f8, record=0x2aaab0512da8) at Table.cpp:1607 #10 0x000000000091f118 in RecordVersion::scavenge (this=0x2aaab0512da8, oldestActive=33052) at RecordVersion.cpp:150 #11 0x000000000091b535 in RecordLeaf::retireRecords (this=0x2ae9746af468, table=0x2ae97462d2f8, base=186, age=-1, oldestActiveTransaction=33052) at RecordLeaf.cpp:167 #12 0x0000000000989fbd in RecordGroup::retireRecords (this=0x2aaab04081d0, table=0x2ae97462d2f8, base=1, age=-1, oldestActiveTransaction=33052) at RecordGroup.cpp:122 #13 0x0000000000989fbd in RecordGroup::retireRecords (this=0x2aaab0407e80, table=0x2ae97462d2f8, base=0, age=-1, oldestActiveTransaction=33052) at RecordGroup.cpp:122 #14 0x000000000089ff66 in Table::retireRecords (this=0x2ae97462d2f8, age=-1, oldestActiveTransaction=33052) at Table.cpp:1525 #15 0x00000000008a005a in Table::cleanupRecords (this=0x2ae97462d2f8, threshold=-1, oldestActiveTransaction=33052) at Table.cpp:2313
[13 Jul 2007 12:50]
Hakan Küçükyılmaz
Latest version
Attachment: falcon_concurrent_blob_updates.tar.gz (application/x-gzip, text), 3.62 KiB.
[16 Jul 2007 14:32]
Kevin Lewis
This bug has now morphed from the crash in the Record destructor to an ASSERT in Section::deleteLine(). The other problem has apparently been fixed. What Hakan neglected to mention with the previous call stack is that he was able to reproduce this crash with on both fluffy and his laptop by running falcon_concurrent_blob_updates repeatedly it in a loop with a different seed each time. On both machines, the seed value that failed would not cause the assert if run once. So this bug is a timing issue with multiple threads and difficult to reproduce. Hakan ran it on his laptop for 1.5 hours and on Fluffy for 11 iterations. Coincidentaly, this is the same assert that was being hit with the JVER test reported in Bug#29613. For that bug, JVer was run serveral times before the assert was hit, and the bug was closed because the problem apparently went away with a patch tested on July 12. Considering the difficulty in reproducing this bug, and the beta timeframe, we should consider trying to fix this after beta.
[16 Jul 2007 15:02]
Hakan Küçükyılmaz
For the record: It needs one iteration with seed value of 963 on my laptop and eleven itarations starting with seed value of 1 and incrementing by 3 on fluffy.
[25 Jul 2007 10:58]
Hakan Küçükyılmaz
Original crash does not happen anymore. New failure will be tracked in new bug report.
[26 Jul 2007 9:22]
MC Brown
Internal change only - no documentation needed.