Bug #45892 Innodb calls fsync for writes with innodb_flush_method=O_DIRECT
Submitted: 2 Jul 2009 0:24
Modified: 12 Mar 2013 22:52
Reporter: Mark Callaghan
Status: Closed
Category: MySQL Server: InnoDB storage engine
Severity: S5 (Performance)
Version: 5.0, 5.1
OS: Any
Assigned to: Inaam Rana
CPU Architecture: Any
Tags: fsync, innodb

[2 Jul 2009 0:24] Mark Callaghan
InnoDB calls fsync after writes to the datafile when innodb_flush_method=O_DIRECT. These fsync calls are not needed. I think InnoDB holds a mutex while this is done, so it may cause more contention.

How to repeat:
Read http://mysqlha.blogspot.com/2009/06/buffered-versus-direct-io-for-innodb.html
[2 Jul 2009 7:51] Sveta Smirnova
Thank you for the report.

Verified as described using strace -f.
[2 Jul 2009 15:58] MySQL Verification Team
Mark, Inaam,

Heikki left this code in on purpose. He found that some operating systems still require the fsync() call even though they support O_DIRECT.

I have stress tested O_DIRECT on many newer versions of Linux, and InnoDB recovered successfully from all types of crashes and power-offs.

The problematic operating systems are some exotic versions, although I do not remember exactly which. I think Heikki will remember them.
[30 Mar 2010 14:56] Harrison Fisk
Some testing by Domas has shown that some filesystems (XFS) do not sync metadata without the fsync. If the metadata changes, you still need to call fsync (or open the file with O_SYNC).

For example, if a file grows while O_DIRECT is enabled, the data is still written to the new part of the file. However, because the metadata does not reflect the new file size, the tail portion can be lost in the event of a crash.


Continue to use fsync when important metadata changes or use O_SYNC in addition to O_DIRECT.
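
A minimal sketch of the "fsync only when metadata changes" idea, assuming Linux and taking file growth as the only metadata change that matters. This is not InnoDB code: the helper names open_direct() and write_block_sync_if_extended() and the 4096-byte BLOCK_SIZE are hypothetical, chosen for illustration. Whether skipping the fsync for in-place overwrites is actually safe depends on the filesystem.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed block size; O_DIRECT needs the buffer, offset and length
 * aligned to the device's logical block size, which may differ. */
#define BLOCK_SIZE 4096

/* Hypothetical helper: open a file for direct I/O. Falls back to
 * buffered I/O if the filesystem rejects O_DIRECT (tmpfs does). */
static int open_direct(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_DIRECT, 0600);
    if (fd < 0)
        fd = open(path, O_RDWR | O_CREAT, 0600);
    return fd;
}

/* Hypothetical helper: write one aligned block at `offset` and call
 * fsync() only if the write extended the file (size metadata changed).
 * Returns 1 if fsync was issued, 0 if it was skipped, -1 on error. */
static int write_block_sync_if_extended(int fd, const void *buf, off_t offset)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;

    if (pwrite(fd, buf, BLOCK_SIZE, offset) != (ssize_t)BLOCK_SIZE)
        return -1;

    /* The file grew: flush so the new size reaches disk as well. */
    if (offset + BLOCK_SIZE > st.st_size)
        return fsync(fd) == 0 ? 1 : -1;

    /* Overwrite within the existing size: no size change, fsync skipped. */
    return 0;
}
```

The caller is expected to pass a buffer allocated with posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) and an offset that is a multiple of BLOCK_SIZE. The alternative mentioned above, opening with O_SYNC in addition to O_DIRECT, avoids the size check but pays the synchronous cost on every write.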
[19 Feb 2013 0:00] John Russell
Added option and associated wording to:


(Might take a day or so to appear.)  Might still tweak the wording based on discussions with dev team, but in any case closing the bug now.
[19 Feb 2013 0:04] John Russell
Added to changelog for 5.6.7: 

A new setting O_DIRECT_NO_FSYNC was added to the innodb_flush_method
configuration option. This setting is similar to O_DIRECT, but omits
the subsequent fsync() call. Suitable for some filesystems but not
[4 Mar 2013 15:27] Mark Callaghan
I don't think this has been fixed given the warning that it won't work for XFS.


An alternative setting is O_DIRECT_NO_FSYNC: it uses the O_DIRECT flag during flushing I/O, but skips the fsync() system call afterwards. This setting is suitable for some types of filesystems but not others. For example, it is not suitable for XFS. If you are not sure whether the filesystem you use requires an fsync(), for example to preserve all file metadata, use O_DIRECT instead.
[5 Mar 2013 18:34] Mark Callaghan
More perf results that show the benefit of using O_DIRECT_NO_FSYNC
[6 Mar 2013 22:04] Mark Callaghan
Removing the call to sleep in fil_flush would make this much less of a big deal
[12 Mar 2013 22:52] Mark Callaghan
Making fil_flush() more efficient would help, but not as much as removing the fsync calls: