Bug #34823 | fsync() occasionally returns ENOLCK and causes InnoDB to restart mysqld | ||
---|---|---|---|
Submitted: | 25 Feb 2008 23:52 | Modified: | 20 Jun 2010 17:15 |
Reporter: | Alex Rickabaugh | | |
Status: | Closed | | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S3 (Non-critical) |
Version: | 5.1.23-rc | OS: | FreeBSD |
Assigned to: | Vasil Dimov | CPU Architecture: | Any |
[25 Feb 2008 23:52]
Alex Rickabaugh
[25 Feb 2008 23:54]
Alex Rickabaugh
UNVERIFIED DO NOT USE ON PRODUCTION MAY CORRUPT DATA - Test patch to retry the fsync() in a loop on ENOLCK.
Attachment: enolck-retry-loop.patch (text/x-patch), 3.98 KiB.
[25 Feb 2008 23:59]
Alex Rickabaugh
I've attached a patch that modifies os_file_flush() to retry the fsync() in a loop, indefinitely, whenever it returns ENOLCK. DO NOT USE THIS PATCH ON PRODUCTION SYSTEMS, IT MAY CORRUPT YOUR DATA! I don't know enough about the inner workings of MySQL to tell whether this approach risks corrupting the on-disk data files. That said, it does seem to be working well in our environment (with safeguards to back up the database files). Every so often the logs show that an ENOLCK has occurred, but our site stays up. I am not posting this for inclusion in the MySQL source; it is merely intended to show how the problem can be worked around by retrying the fsync() on ENOLCK.
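For readers without access to the attachment, the retry idea described above can be sketched roughly as follows. This is not the attached patch and not InnoDB's os_file_flush(). It is a minimal standalone illustration, and the names flush_retry_enolck() and sleep_ms(), the retry limit, and the delay are all assumptions introduced here. Unlike the patch described above, which retries indefinitely, this sketch gives up after a bounded number of attempts.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for fsync() and nanosleep() declarations */
#endif

#include <errno.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: sleep for roughly ms milliseconds. */
static void sleep_ms(long ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/*
 * Hypothetical wrapper (not part of MySQL): call fsync(fd) and retry only
 * when it fails with ENOLCK, at most max_retries extra times, pausing
 * delay_ms milliseconds between attempts.
 *
 * Returns 0 on success. Returns -1 on any other error, or when ENOLCK
 * persists after max_retries retries; errno is the one set by the last
 * fsync() call.
 */
static int flush_retry_enolck(int fd, unsigned max_retries, long delay_ms)
{
    unsigned failures = 0;

    for (;;) {
        if (fsync(fd) == 0) {
            return 0;
        }
        if (errno != ENOLCK || failures >= max_retries) {
            return -1; /* errno still holds the fsync() error */
        }
        failures++;
        fprintf(stderr, "fsync(): ENOLCK, retry %u of %u\n",
                failures, max_retries);
        if (delay_ms > 0) {
            sleep_ms(delay_ms);
        }
    }
}
```

A caller might use something like flush_retry_enolck(fd, 10, 200) and treat a -1 return as it would any other fsync() failure. Whether a bounded or an unbounded loop is appropriate depends on how long the underlying lock shortage is expected to last, which this report does not establish.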
[27 Feb 2008 12:48]
Heikki Tuuri
Alex, a problem with fsync() is that on some systems it does not flush the files to nonvolatile storage, though it should do so. We have traditionally recommended not running MySQL or InnoDB on NFS, because there may be bugs in fsync() there, as well as other file system bugs. I am assigning this bug report to Vasil. He should check if we can add the patch to InnoDB while preserving InnoDB's portability. Thank you, Heikki
[14 Mar 2008 8:07]
Vasil Dimov
Alex, you say: "We have also tested this with a local disk to eliminate NFS and ZFS as suspects." Were you using UFS2 in this case? Can you paste the line from `mount` output that corresponds to the filesystem where the MySQL/InnoDB files were located? Another thing: according to the fsync(2) man page http://www.freebsd.org/cgi/man.cgi?query=fsync&sektion=2&apropos=0&manpath=FreeBSD+7.0-sta... fsync() does not return ENOLCK (77); this could be a bug in the man page. You write in your patch: "The sages of the internet have said that code should just retry". Can you post the relevant URL? Thanks!
[14 Mar 2008 11:35]
Vasil Dimov
This FreeBSD problem report may be relevant: http://www.freebsd.org/cgi/query-pr.cgi?pr=86944
[19 Mar 2008 16:34]
Vasil Dimov
The solution is to retry the operation; here is the patch.
Attachment: bug34823.diff (application/octet-stream, text), 2.10 KiB.
[31 Mar 2008 14:53]
Bugs System
Pushed into 5.1.24-rc
[1 Apr 2008 18:37]
Paul DuBois
Noted in 5.1.24 changelog. If fsync() returned ENOLCK, InnoDB could treat this as fatal and cause abnormal server termination. InnoDB now retries the operation. Resetting report to Patch queued waiting for push into 6.0.x.
[3 Apr 2008 13:02]
Bugs System
Pushed into 6.0.5-alpha
[7 Apr 2008 17:30]
Paul DuBois
Noted in 6.0.5 changelog.
[5 May 2010 15:10]
Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 14:07]
Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[28 May 2010 6:01]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:30]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:57]
Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 23:19]
Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[17 Jun 2010 12:04]
Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:48]
Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:31]
Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)