| Bug #34823 | fsync() occasionally returns ENOLCK and causes InnoDB to restart mysqld | ||
|---|---|---|---|
| Submitted: | 26 Feb 2008 0:52 | Modified: | 7 Apr 2008 19:30 |
| Reporter: | Alex Rickabaugh | ||
| Status: | Closed | ||
| Category: | Server: InnoDB | Severity: | S3 (Non-critical) |
| Version: | 5.1.23-rc | OS: | FreeBSD |
| Assigned to: | Vasil Dimov | Target Version: | |
| Triage: | D2 (Serious) | ||
[26 Feb 2008 0:52]
Alex Rickabaugh
[26 Feb 2008 0:54]
Alex Rickabaugh
UNVERIFIED DO NOT USE ON PRODUCTION MAY CORRUPT DATA - Test patch to retry the fsync() in a loop on ENOLCK.
Attachment: enolck-retry-loop.patch (text/x-patch), 3.98 KiB.
[26 Feb 2008 0:59]
Alex Rickabaugh
I've attached a patch I made which modifies os_file_flush() to retry the fsync() in a loop on ENOLCK, indefinitely. DO NOT USE THIS PATCH ON PRODUCTION SYSTEMS, IT MAY CORRUPT YOUR DATA! I don't know enough about the inner workings of MySQL to tell whether this approach runs the risk of corrupting the on-disk data files. That said, it does seem to be working well in our environment (with safeguards to back up the database files). Every so often the logs show that an ENOLCK has occurred, but our site stays up. I am not posting this for inclusion in the MySQL source; it is merely intended to show how the problem can be worked around by retrying the fsync() on ENOLCK.
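For readers without the attachment, the idea of retrying on ENOLCK might be sketched roughly as follows. This is not the attached patch and not InnoDB's os_file_flush(): the names flush_retry_on_enolck, flush_fn_t and fake_flush are hypothetical, and the retry limit and the pause between attempts are assumptions added for illustration (the patch described above retries indefinitely). The flush call is passed in as a function pointer only so that the ENOLCK path can be exercised with a stand-in, since the real error is hard to provoke on demand.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical signature matching fsync(2): returns 0 on success,
   -1 on failure with errno set. */
typedef int (*flush_fn_t)(int fd);

/* Illustrative helper (an assumption, not InnoDB code): call flush_fn(fd)
   and retry while it fails with ENOLCK, at most max_retries extra times,
   pausing sleep_usec microseconds (keep below 1000000) between attempts.
   Any other error is returned to the caller immediately. */
static int flush_retry_on_enolck(int fd, flush_fn_t flush_fn,
                                 unsigned max_retries, unsigned sleep_usec)
{
    unsigned attempts = 0;

    for (;;) {
        int ret = flush_fn(fd);

        if (ret == 0) {
            return 0;               /* flushed successfully */
        }
        if (errno != ENOLCK || attempts >= max_retries) {
            return ret;             /* errno still holds flush_fn's error */
        }
        attempts++;
        fprintf(stderr, "flush: ENOLCK, retry %u of %u\n",
                attempts, max_retries);
        usleep(sleep_usec);
    }
}

/* Stand-in for fsync() that fails with ENOLCK a configurable number of
   times and then succeeds; used only to demonstrate the retry path. */
static int fake_failures_left = 2;

static int fake_flush(int fd)
{
    (void)fd;
    if (fake_failures_left > 0) {
        fake_failures_left--;
        errno = ENOLCK;
        return -1;
    }
    return 0;
}
```

With a real descriptor the call would look like flush_retry_on_enolck(fd, fsync, 10, 200000). Whether a retried fsync() is safe for the data files is exactly the open question raised in this comment; the sketch only shows the control flow.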
[27 Feb 2008 13:48]
Heikki Tuuri
Alex, a problem with fsync() is that on some systems it does not flush files to nonvolatile storage, though it should. We have traditionally recommended not running MySQL or InnoDB on NFS, because there may be bugs in fsync() there, as well as other file system bugs. I am assigning this bug report to Vasil. He should check whether we can add the patch to InnoDB while preserving InnoDB's portability. Thank you, Heikki
[14 Mar 2008 9:07]
Vasil Dimov
Alex, you say: "We have also tested this with a local disk to eliminate NFS and ZFS as suspects." Were you using UFS2 in this case? Can you paste the line from the `mount` output that corresponds to the filesystem where the MySQL/InnoDB files were located? Another thing: according to the fsync(2) man page http://www.freebsd.org/cgi/man.cgi?query=fsync&sektion=2&apropos=0&manpath=FreeBSD+7.0-sta... fsync() does not return ENOLCK (77). This could be a bug in the man page. You write in your patch: "The sages of the internet have said that code should just retry". Can you post the relevant URL? Thanks!
[14 Mar 2008 12:35]
Vasil Dimov
This FreeBSD problem report may be relevant: http://www.freebsd.org/cgi/query-pr.cgi?pr=86944
[19 Mar 2008 17:34]
Vasil Dimov
The solution is to retry the operation; here is the patch.
Attachment: bug34823.diff (application/octet-stream, text), 2.10 KiB.
[31 Mar 2008 16:53]
Bugs System
Pushed into 5.1.24-rc
[1 Apr 2008 20:37]
Paul DuBois
Noted in 5.1.24 changelog. If fsync() returned ENOLCK, InnoDB could treat this as fatal and cause abnormal server termination. InnoDB now retries the operation. Resetting report to Patch queued waiting for push into 6.0.x.
[3 Apr 2008 15:02]
Bugs System
Pushed into 6.0.5-alpha
[7 Apr 2008 19:30]
Paul DuBois
Noted in 6.0.5 changelog.
