Bug #38970 | Crash in function called from falcon_init when running test cases | ||
---|---|---|---|
Submitted: | 22 Aug 2008 17:45 | Modified: | 15 May 2009 12:52 |
Reporter: | Sven Sandberg | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S1 (Critical) |
Version: | 6.0-rpl | OS: | Any |
Assigned to: | Vladislav Vaintroub | CPU Architecture: | Any |
Tags: | 6.0-rpl-green, core, crash, F_ERROR HANDLING, replication, test failure |
[22 Aug 2008 17:45]
Sven Sandberg
[22 Aug 2008 17:47]
Sven Sandberg
list of stack traces for five crashes
Attachment: stack-traces (application/octet-stream, text), 31.18 KiB.
[22 Aug 2008 18:11]
Sven Sandberg
Correction: the core dumps happened not only during the startup of slave servers but also during the startup of master servers.
[25 Aug 2008 2:11]
Kevin Lewis
This looks less like a Falcon issue than a disk or I/O system issue. The errors are of two kinds. The first is IO::writePages, IO.cpp:338 ("write error on page %d (%d/%d/%d) of \"%s\": %s (%d)"), which sets a fatal error flag. After that, the second kind occurs in other threads: IO::writePages, IO.cpp:308, "can't continue after fatal error". Since the error is not consistent, my guess is that the problem is not in the way the file is opened.
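The two-error pattern described above can be sketched as a latch: the first failed write sets a flag, and every later write fails on the flag alone. This is only an illustration under stated assumptions. `PageWriter`, `writePage`, `tryWrite`, and the `deviceOk` parameter are hypothetical names, not Falcon's IO class, and the real code may differ.

```cpp
#include <atomic>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an I/O layer; this is NOT Falcon's IO class.
class PageWriter {
public:
    // `deviceOk` simulates whether the underlying write would succeed.
    void writePage(int pageNumber, bool deviceOk) {
        // Once any write has failed, refuse all further writes.
        if (fatalError_.load())
            throw std::runtime_error("can't continue after fatal error");
        if (!deviceOk) {
            fatalError_.store(true);  // latch the failure for all callers
            throw std::runtime_error("write error on page " +
                                     std::to_string(pageNumber));
        }
    }

    bool hasFatalError() const { return fatalError_.load(); }

private:
    std::atomic<bool> fatalError_{false};
};

// Hypothetical helper: returns "ok" or the error text of the failed write.
std::string tryWrite(PageWriter& writer, int pageNumber, bool deviceOk) {
    try {
        writer.writePage(pageNumber, deviceOk);
        return "ok";
    } catch (const std::runtime_error& e) {
        return e.what();
    }
}
```

With this shape, only the caller that hits the real failure sees the "write error" text; any caller arriving afterwards sees "can't continue after fatal error", which would be consistent with the split between the two kinds of stack traces.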
[25 Aug 2008 17:12]
Vladislav Vaintroub
Found this in the stack traces: [Falcon] Error: write error on page 0 (4096/4096/4) of "/home/sven/bzr/merge/6.0-rpl_from_5.1-rpl/mysql-test/var/2/mysqld.2/data/falcon_master.fts": Input/output error (5). So pwrite is getting an I/O error, errno 5. I have no idea how this can be. A wild guess is that file descriptor 5 was opened by Falcon, then closed by somebody else, and we are doing pwrite on a socket or similar.
[25 Aug 2008 17:13]
Vladislav Vaintroub
File descriptor 5 should really be file descriptor 4 in the previous comment.
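As a reference point for the descriptor hypothesis, the following POSIX sketch shows how a failed pwrite reports its cause through errno. The helper names `pwriteErrno` and `pwriteErrnoOnPipe` are hypothetical. The sketch covers only an invalid descriptor and a pipe; it does not attempt to reproduce the errno 5 seen in the trace.

```cpp
#include <cerrno>
#include <unistd.h>

// Hypothetical helper: attempts a one-byte pwrite at offset 0 and returns
// 0 on success, or the errno value reported by the failed call.
int pwriteErrno(int fd) {
    const char byte = 0;
    errno = 0;
    ssize_t written = pwrite(fd, &byte, 1, 0);
    return written < 0 ? errno : 0;
}

// Hypothetical helper: returns the errno from pwrite on the write end of a
// pipe, which is not seekable. Returns -1 if the pipe cannot be created.
int pwriteErrnoOnPipe() {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    int err = pwriteErrno(fds[1]);  // captured before the descriptors close
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

On Linux, `pwriteErrno(-1)` reports EBADF and `pwriteErrnoOnPipe()` reports ESPIPE.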
[13 Nov 2008 18:52]
Sven Sandberg
Note that only crash 3 and crash 4 contain the text "Error: write error on page 0 (4096/4096/4)..." Crash 1, 2, and 5 instead contain the text "can't continue after fatal error" in the stack trace. I just reproduced crash 1/2/5 on my local machine in 6.0-rpl, which has BUG#39458 fixed. So this is not a symptom of BUG#39458.
[13 Nov 2008 18:56]
Sven Sandberg
stack trace from crash 6
Attachment: stacktrace (application/octet-stream, text), 7.86 KiB.
[13 Nov 2008 20:22]
Kevin Lewis
Sven, I assume that you now consider this bug verified. Has it happened again since Aug 25? Assigning this to Vlad to look into. If this is a one-time problem, we may have to mark it Can't Repeat.
[14 Nov 2008 8:45]
Sven Sandberg
Kevin, yes, crash 6 happened yesterday with a 6.0-rpl tree. With 6.0-rpl it usually happens at least a couple of times each time I run the suite. Let me know if I should try to repeat it with 6.0 main.
[14 Nov 2008 11:24]
Vladislav Vaintroub
Sven, please try to reproduce with main. It is not a usual crash, and it looks very much as if somebody (pointing to rpl ;)) is closing file descriptors that belong to Falcon.
[18 Dec 2008 12:33]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/61964
2945 Vladislav Vaintroub 2008-12-18
Bug #38970 Crash in function called from falcon_init when running test cases
Problem: Upon encountering I/O errors, Falcon crashes with an assert.
Solution: Instead of asserting, throw an exception. This allows more graceful error handling during Falcon startup. The error text will be written into the error log and Falcon will not load.
[18 Dec 2008 13:04]
Vladislav Vaintroub
Pushed into falcon-team
[13 Feb 2009 7:25]
Bugs System
Pushed into 6.0.10-alpha (revid:alik@sun.com-20090211182317-uagkyj01fk30p1f8) (version source revid:hky@sun.com-20081218223730-ujuygclo2fezfurq) (merge vers: 6.0.9-alpha) (pib:6)
[15 May 2009 12:52]
MC Brown
A note has been added to the 6.0.10 changelog: When the Falcon storage engine encountered an I/O error, mysqld would crash. Errors now raise an exception, which is reported to the error log and Falcon will fail to initialize.