| Bug #30578 | Falcon: assertion in PageWriter::removeElement | ||
|---|---|---|---|
| Submitted: | 22 Aug 2007 20:42 | Modified: | 31 Aug 2007 12:13 |
| Reporter: | Calvin Sun | Email Updates: | |
| Status: | Closed | ||
| Category: | Server: Falcon | Severity: | S2 (Serious) |
| Version: | 6.0.2 | OS: | Any |
| Assigned to: | Kevin Lewis | Target Version: | |
[27 Aug 2007 7:16]
Hakan Küçükyılmaz
Verified with latest changeset
0x00002abf40b18fcb in raise () from /lib/libpthread.so.0
(gdb) bt
#0 0x00002abf40b18fcb in raise () from /lib/libpthread.so.0
#1 0x00000000008e7a88 in Error::debugBreak () at Error.cpp:92
#2 0x00000000008e7b9c in Error::error (string=0xc774a8 "assertion failed at line %d in file %s\n")
at Error.cpp:69
#3 0x00000000008e7c33 in Error::assertionFailed (fileName=0xc7c32d "PageWriter.cpp", line=238) at Error.cpp:76
#4 0x000000000091aa62 in PageWriter::removeElement (this=0x2abf461c46f8, element=0x2abf461d03e8)
at PageWriter.cpp:238
#5 0x000000000091b09b in PageWriter::writer (this=0x2abf461c46f8) at PageWriter.cpp:177
#6 0x000000000091b15b in PageWriter::writer (arg=0x2abf461c46f8) at PageWriter.cpp:158
#7 0x00000000008acf01 in Thread::thread (this=0x2abf4638d8f0) at Thread.cpp:161
#8 0x00000000008ad11b in Thread::thread (parameter=0x2abf4638d8f0) at Thread.cpp:140
#9 0x00002abf40b11317 in start_thread () from /lib/libpthread.so.0
#10 0x00002abf418e0b1d in clone () from /lib/libc.so.6
#11 0x0000000000000000 in ?? ()
[27 Aug 2007 7:36]
Hakan Küçükyılmaz
To repeat, please run falcon_concurrent_blob_update in a loop with an increasing seed value.
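SEED=1
# Rerun with a new seed until a run fails. Testing the command directly avoids
# checking $? after the SEED assignment, which always succeeds.
while ./falcon_concurrent_blob_updates -r540 -s$SEED
do
SEED=$(($SEED + 3))
done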
[27 Aug 2007 14:58]
Kevin Lewis
Hakan reports that this latest assert came after running the test overnight, 'way more than 2 hours', on his 2-CPU laptop. So it still happens, but it is not yet easily reproducible.
[30 Aug 2007 4:26]
Kevin Lewis
The assertion in PageWriter::removeElement indicates that a DirtyPage element that should be in the DirtLinen hash table is no longer there; the boolean 'hit' is false. This happens because there is a gap in locking the PageWriter::syncObject between getting a pointer to the first element in PageWriter::writer and calling removeElement. During that gap, a foreground thread calls PageWriter::pageWritten and removes the element instead. The fix is to hold an exclusive lock from the time a pointer to an element is read until the end of the call to removeElement().
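For readers unfamiliar with this kind of race, here is a minimal sketch of the locking pattern described above. It is not the Falcon source: DirtyPageSet, takeFirst and drainConcurrently are hypothetical names, std::mutex stands in for an exclusive lock on PageWriter::syncObject, and a std::unordered_set stands in for the hash table. The point it illustrates is that the writer path picks its element and removes it under one continuous lock hold, so a concurrent pageWritten call cannot remove the element in between.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <unordered_set>

// Hypothetical stand-in for the PageWriter bookkeeping; not Falcon code.
class DirtyPageSet {
public:
    void add(int pageNumber) {
        std::lock_guard<std::mutex> guard(syncObject);
        pages.insert(pageNumber);
    }

    // Foreground path: the page was written elsewhere, so drop its entry.
    // Returns true only if this call removed the entry.
    bool pageWritten(int pageNumber) {
        std::lock_guard<std::mutex> guard(syncObject);
        return pages.erase(pageNumber) == 1;
    }

    // Writer path: read the first element AND remove it while holding the
    // same lock, so there is no gap for another thread to remove it first.
    bool takeFirst(int& pageNumber) {
        std::lock_guard<std::mutex> guard(syncObject);
        if (pages.empty())
            return false;
        auto first = pages.begin();
        pageNumber = *first;
        pages.erase(first);  // cannot miss: still under the same lock
        return true;
    }

    std::size_t size() {
        std::lock_guard<std::mutex> guard(syncObject);
        return pages.size();
    }

private:
    std::mutex syncObject;           // assumed analogue of PageWriter::syncObject
    std::unordered_set<int> pages;   // assumed analogue of the dirty page hash
};

// Runs a writer thread and a foreground thread against the same pages and
// returns how many removals succeeded in total.
int drainConcurrently(int pageCount) {
    DirtyPageSet set;
    for (int i = 0; i < pageCount; ++i)
        set.add(i);

    int byWriter = 0;      // touched only by the writer thread until join
    int byForeground = 0;  // touched only by the foreground thread until join

    std::thread writer([&] {
        int page;
        while (set.takeFirst(page))
            ++byWriter;
    });
    std::thread foreground([&] {
        for (int i = 0; i < pageCount; ++i)
            if (set.pageWritten(i))
                ++byForeground;
    });

    writer.join();
    foreground.join();
    assert(set.size() == 0);  // the writer only stops once the set is empty
    return byWriter + byForeground;
}
```

Calling drainConcurrently(1000) should return exactly 1000 under these assumptions: each page is removed once, by whichever thread gets the lock first, and neither thread ever tries to remove an element it merely observed earlier.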
[30 Aug 2007 9:46]
Hakan Küçükyılmaz
I ran falcon_concurrent_blob_update with an increasing seed value overnight for over 14 hours. No crash.
[30 Aug 2007 14:38]
Hakan Küçükyılmaz
For the documenter: Fixed in mysql-6.0.3-alpha.
[31 Aug 2007 12:13]
MC Brown
Internal fix only, no changelog entry required.

Description:
Assertion failure reported by Christoffer Hall:
Program received signal SIGILL, Illegal instruction.
[Switching to Thread -1309652080 (LWP 8358)]
0xffffe410 in __kernel_vsyscall ()
(gdb) bt
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb7e694f1 in raise () from /lib/tls/i686/cmov/libpthread.so.0
#2 0x084817c4 in Error::error (string=0x87e5f74 "assertion failed at line %d in file %s\n") at Error.cpp:92
#3 0x08481830 in Error::assertionFailed (fileName=0x87e9ca5 "PageWriter.cpp", line=237) at Error.cpp:76
#4 0x084a70f4 in PageWriter::removeElement (this=0xb7096c80, element=0xb72ddcf0) at PageWriter.cpp:237
#5 0x084a733e in PageWriter::writer (this=0xb7096c80) at PageWriter.cpp:177
#6 0x0844faf1 in Thread::thread (this=0xb709ca30) at Thread.cpp:161
#7 0x0844fcd2 in Thread::thread (parameter=0xb709ca30) at Thread.cpp:140
#8 0xb7e6131b in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#9 0xb7d6857e in clone () from /lib/tls/i686/cmov/libc.so.6
(gdb)
How to repeat:
Hakan got it running Falcon Concurrent BLOB Update after 83 minutes of looping 9 minutes runs with increasing seed value.
Suggested fix:
no assertion failure