Bug #34823 fsync() occasionally returns ENOLCK and causes InnoDB to restart mysqld
Submitted: 26 Feb 2008 0:52 Modified: 7 Apr 2008 19:30
Reporter: Alex Rickabaugh
Status: Closed
Category: Server: InnoDB Severity: S3 (Non-critical)
Version: 5.1.23-rc OS: FreeBSD
Assigned to: Vasil Dimov Target Version:
Triage: D2 (Serious)

[26 Feb 2008 0:52] Alex Rickabaugh
Description:
Running MySQL 5.1.23-rc on 8-core FreeBSD 7.0-RC2 (2 quad core amd64 processors), 16 GB
RAM, hosting a database for a large community website.

Server is booted disklessly and mounts / over NFS. The NFS share is a ZFS filesystem
running on another FreeBSD server. We have also tested this with a local disk to
eliminate NFS and ZFS as suspects. The problem still occurs with MySQL data files on a
local disk, albeit less frequently.

What I've been able to gather is:

In InnoDB's os_file_flush(), InnoDB calls fsync(), and FreeBSD returns an ENOLCK error.
Any error in os_file_flush() is treated as fatal, and MySQL restarts with the following
log messages:

InnoDB: Error: the OS said file flush did not succeed
InnoDB: Operating system error number 77 in a file operation.
InnoDB: Error number 77 means 'No locks available'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/operating-system-error-codes.html
InnoDB: File operation call: 'flush'.
InnoDB: Cannot continue operation.
mysqld_safe mysqld restarted

How to repeat:
Problem occurs intermittently without a (known) trigger to reproduce it. It does seem to
occur every few days. As mentioned in the description, moving the MySQL data files to a
local disk did seem to reduce the frequency, though the problem still occurred.

Suggested fix:
I doubt this is a bug in MySQL or the operating system. I'm submitting it here in the
hope that MySQL's handling of ENOLCK can be made more robust (for example, by retrying
the fsync() on ENOLCK), or that someone more familiar with the ENOLCK condition knows
which settings in FreeBSD to tweak to avoid it.
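
For illustration, the retry idea suggested above might be sketched roughly as follows.
This is not code from InnoDB or from any patch attached to this report. The names
flush_retry_enolck, flush_fn and fake_flush, the retry cap, and the delay are all
hypothetical, chosen only to show the shape of the loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical type: any function with fsync()'s signature. */
typedef int (*flush_fn)(int fd);

/*
 * Call flush(fd) until it succeeds, fails with an error other than ENOLCK,
 * or max_retries retries have been used up. Returns the result of the last
 * flush() call; errno is left as that call set it. The retry cap and the
 * delay are assumptions made for this sketch.
 */
static int flush_retry_enolck(flush_fn flush, int fd, unsigned max_retries,
                              unsigned delay_us, unsigned *retries_out)
{
    unsigned retries = 0;
    int ret;

    assert(flush != NULL);

    for (;;) {
        ret = flush(fd);
        if (ret == 0 || errno != ENOLCK || retries >= max_retries) {
            break;  /* success, a different error, or out of retries */
        }
        retries++;
        if (delay_us > 0) {
            /* Back off briefly before asking the OS again. */
            struct timespec ts;
            ts.tv_sec = (time_t)(delay_us / 1000000u);
            ts.tv_nsec = (long)(delay_us % 1000000u) * 1000L;
            nanosleep(&ts, NULL);
        }
    }

    if (retries_out != NULL) {
        *retries_out = retries;
    }
    return ret;
}

/*
 * Stand-in for fsync(), used only to exercise the loop without a real
 * filesystem: fails with ENOLCK while fake_failures_left > 0, then succeeds.
 */
static int fake_failures_left = 0;

static int fake_flush(int fd)
{
    (void)fd;
    if (fake_failures_left > 0) {
        fake_failures_left--;
        errno = ENOLCK;
        return -1;
    }
    return 0;
}
```

With a real descriptor the call would look like flush_retry_enolck(fsync, fd, 10, 200000, NULL).
Whether retrying is safe for the data files is a separate question that this sketch does
not answer.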
[26 Feb 2008 0:54] Alex Rickabaugh
UNVERIFIED DO NOT USE ON PRODUCTION MAY CORRUPT DATA - Test patch to retry the fsync() in
a loop on ENOLCK.

Attachment: enolck-retry-loop.patch (text/x-patch), 3.98 KiB.

[26 Feb 2008 0:59] Alex Rickabaugh
I've attached a patch I made which modifies os_file_flush() to retry the fsync() in a loop
on ENOLCK, indefinitely.

DO NOT USE THIS PATCH ON PRODUCTION SYSTEMS, IT MAY CORRUPT YOUR DATA! I don't know
enough about the inner workings of MySQL to tell if this approach runs the risk of
corrupting the on-disk data files.

That said, it does seem to be working well in our environment (with safeguards to back up
the database files). Every so often the logs show an ENOLCK has occurred, but our site
stays up.

I am not posting this for inclusion into the MySQL source; it's merely intended to show
how the problem can be worked around by retrying the fsync() on ENOLCK.
[27 Feb 2008 13:48] Heikki Tuuri
Alex,

a problem in fsync() is that on some systems it does not flush the files to nonvolatile
storage, though it should do so.

We have traditionally recommended not running MySQL or InnoDB on NFS, because there
may be bugs in fsync() there, and also other file system bugs.

I am assigning this bug report to Vasil. He should check if we can add the patch to
InnoDB, preserving InnoDB's portability.

Thank you,

Heikki
[14 Mar 2008 9:07] Vasil Dimov
Alex, you say:

"We have also tested this with a local disk to eliminate
NFS and ZFS as suspects."

Were you using UFS2 in this case? Can you paste the line from `mount` output that
corresponds to the filesystem where the MySQL/InnoDB files were located?

Another thing: according to the fsync(2) man page
http://www.freebsd.org/cgi/man.cgi?query=fsync&sektion=2&apropos=0&manpath=FreeBSD+7.0-sta...
fsync() does not return ENOLCK (77); this could be a bug in the man page. You write in your
patch:

"The sages of the internet have said that code should just retry"

Can you post the relevant URL?

Thanks!
[14 Mar 2008 12:35] Vasil Dimov
This FreeBSD problem report may be relevant:

http://www.freebsd.org/cgi/query-pr.cgi?pr=86944
[19 Mar 2008 17:34] Vasil Dimov
The solution is to retry the operation; here is the patch.

Attachment: bug34823.diff (application/octet-stream, text), 2.10 KiB.

[31 Mar 2008 16:53] Bugs System
Pushed into 5.1.24-rc
[1 Apr 2008 20:37] Paul DuBois
Noted in 5.1.24 changelog.

If fsync() returned ENOLCK, InnoDB could treat this as fatal and
cause abnormal server termination. InnoDB now retries the operation.

Resetting report to "Patch queued" while waiting for push into 6.0.x.
[3 Apr 2008 15:02] Bugs System
Pushed into 6.0.5-alpha
[7 Apr 2008 19:30] Paul DuBois
Noted in 6.0.5 changelog.