Bug #61667 MySQL/InnoDB overwrites ibdata files after NFS (un/re)mount of datadir
Submitted: 28 Jun 2011 9:43
Modified: 28 Jun 2011 13:31
Reporter: David Gabriel
Status: Closed
Category: MySQL Server: InnoDB storage engine
Severity: S2 (Serious)
Version: 5.1, 5.5
OS: Linux
Assigned to:
CPU Architecture: Any
Tags: nfs innodb

[28 Jun 2011 9:43] David Gabriel
Description:
Ubuntu 10.04 LTS AMD64
Distro Packages:
mysql  Ver 14.14 Distrib 5.1.41, for debian-linux-gnu (x86_64) using readline 6.1

Note: this error is reproducible with MySQL 5.5 too

MySQL overwrites the ibdata files when the datadir is mounted via NFS. Running mysqld under strace shows the following call when InnoDB starts up:

open("/var/lib/mysql/ibdata1", O_RDWR|O_CREAT|O_EXCL, 0660)

This call should return -1 (with errno set to EEXIST) if the file exists; otherwise the file is created. However, InnoDB seems to treat the existing ibdata files as missing and overwrites them:

/var/log/mysql/mysql.log
InnoDB: The first specified data file ./ibdata1 did not exist:
InnoDB: a new database to be created!
110624 15:54:35 InnoDB: Setting file ./ibdata1 size to 10 MB
InnoDB: Database physically writes the file full: wait...
110624 15:54:35 InnoDB: Log file ./ib_logfile0 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile0 size to 64 MB
InnoDB: Database physically writes the file full: wait...
110624 15:54:36 InnoDB: Log file ./ib_logfile1 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile1 size to 64 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Doublewrite buffer not found: creating new
InnoDB: Doublewrite buffer created
InnoDB: Creating foreign key constraint system tables
InnoDB: Foreign key constraint system tables created
110624 15:54:37 InnoDB: Started; log sequence number 0 0
110624 15:54:37 [Note] Event Scheduler: Loaded 0 events
110624 15:54:37 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.1.41-3ubuntu12.10-log' socket: '/var/run/mysqld/mysqld.sock' port: 3306 (Ubuntu)

which in turn renders every InnoDB database inoperable, producing errors like:

110624 15:54:38 [ERROR] Cannot find or open table database/tablr from
the internal data dictionary of InnoDB though the .frm file for the
table exists. [...]

This error can be reproduced with NetApp (WAFL), Solaris (ZFS) and Linux NFS (ext3) servers exporting the datadir, so it does not seem to be vendor-related.

A range of NFS and InnoDB options (including locking) was also tested, with no change in behaviour.
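
The exclusive-create behaviour can also be checked without MySQL. The following is a minimal sketch in Python, not what InnoDB itself runs: it issues the same open() flags as the strace line above against an arbitrary probe path. The function name `create_exclusively` and the probe file name in the usage note are hypothetical.

```python
import os


def create_exclusively(path, mode=0o660):
    """Try to create `path` with O_RDWR|O_CREAT|O_EXCL.

    Returns True if the file was created by this call and False if the
    open failed with EEXIST because the file already existed.
    """
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
    try:
        fd = os.open(path, flags, mode)
    except FileExistsError:
        # Expected outcome whenever the file is already present.
        return False
    os.close(fd)
    return True
```

Calling `create_exclusively("/var/lib/mysql/o_excl_probe")` once before the unmount should return True. Calling it again after the remount should return False; a second True would suggest that the client reports an existing file as newly created, which matches the symptom described here.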

How to repeat:
 - export an NFS share to the database server
 - mount the share on the database server
 - create the MySQL datadir on the share
 - create a database with the InnoDB engine in the datadir
 - perform a random query to test functionality
 - stop mysql
 - umount the datadir (or reboot the server)
 - mount the datadir
 - perform a random query to test functionality -> InnoDB no longer works; the ibd* files have been overwritten

Suggested fix:
The open() flag combination 'O_CREAT|O_EXCL' does not seem to work reliably over NFS:

http://lwn.net/Articles/251004/

Perhaps there is another way to test for existing files that also works reliably on NFS, or a way to test specifically for NFS datadirs?
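
As a rough illustration of both ideas, here is a hedged Python sketch (Linux only). It is not InnoDB code, and all function names are hypothetical. `filesystem_type` looks up the mount that contains a path in /proc/mounts, `is_on_nfs` checks whether that type starts with "nfs", and `data_file_present` uses stat() as a second existence check before any create attempt. Whether a stat() check would have avoided the behaviour in this report is untested, and a check followed by a create is not atomic.

```python
import os


def filesystem_type(path, mounts_file="/proc/mounts"):
    """Return the filesystem type of the mount containing `path`, or None.

    Escaped characters in mount points (such as \\040 for a space) are
    not decoded in this sketch.
    """
    real = os.path.realpath(path)
    best_mount, best_type = "", None
    with open(mounts_file) as mounts:
        for line in mounts:
            fields = line.split()
            if len(fields) < 3:
                continue
            mount_point, fs_type = fields[1], fields[2]
            prefix = mount_point.rstrip("/") + "/"
            if real == mount_point or real.startswith(prefix):
                # Prefer the longest matching mount point; on a tie the
                # later entry wins because it was mounted on top.
                if len(mount_point) >= len(best_mount):
                    best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_nfs(path, mounts_file="/proc/mounts"):
    """Return True if `path` lives on a mount whose type starts with 'nfs'."""
    fs_type = filesystem_type(path, mounts_file)
    return fs_type is not None and fs_type.startswith("nfs")


def data_file_present(path):
    """Check for the file with stat() instead of relying on O_EXCL alone."""
    try:
        os.stat(path)
    except FileNotFoundError:
        return False
    return True
```

A startup script could, for example, refuse to start the server when `is_on_nfs("/var/lib/mysql")` is true and `data_file_present("/var/lib/mysql/ibdata1")` is false even though the datadir is expected to be populated.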
[28 Jun 2011 10:04] Davi Arnaut
The man page says:

"On NFS, O_EXCL is only supported when using NFSv3 or later on kernel 2.6 or later. In NFS environments where O_EXCL support is not provided, programs that rely on it for performing locking tasks will contain a race condition."

Note that the problem described there is a race condition: for it to occur, some other MySQL instance (or some other program) would have to be trying to create that file at the same time.

If this is not the case, this looks like an NFS implementation bug such as http://lkml.org/lkml/2011/1/12/195
[28 Jun 2011 10:08] David Gabriel
Thank you for the link. I might have hit a Linux regression here. I will test with another kernel and report back.
[28 Jun 2011 13:31] David Gabriel
Sorry to have bothered you. This seems to be a Linux issue that is fixed in 2.6.38+.