Bug #9381 InnoDB unlocks data file too late (after PID file is removed)
Submitted: 24 Mar 2005 13:47 Modified: 8 Apr 2005 12:42
Reporter: Guilhem Bichot Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:4.1 OS:Linux (linux)
Assigned to: Marko Mäkelä CPU Architecture:Any

[24 Mar 2005 13:47] Guilhem Bichot
Description:
This is the probable reason for the recurrent failure of autobuild, like this one:
050322  0:47:25 [Note] /nfstmp1/guilhem/mysql-5.0-prod/sql/mysqld: ready for connections.
Version: '5.0.3-alpha-valgrind-max-debug-log'  socket: '/nfstmp1/guilhem/mysql-5.0-prod/mysql-test/var/tmp/master.sock'  por
t: 9306  Source distribution
050322  0:47:27 [Note] Got signal 15 to shutdown mysqld
050322  0:47:27 [Note] /nfstmp1/guilhem/mysql-5.0-prod/sql/mysqld: Normal shutdown

CURRENT_TEST: warnings
050322  0:47:28  InnoDB: Starting shutdown...
InnoDB: Unable to lock ./ibdata1, error: 11
InnoDB: Check that you do not already have another mysqld process
InnoDB: using the same InnoDB data or log files.
InnoDB: Error in opening ./ibdata1
050322  0:47:28  InnoDB: Operating system error number 11 in a file operation.
InnoDB: Error number 11 means 'Resource temporarily unavailable'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/mysql/en/Operating_System_error_codes.html
InnoDB: Could not open or create data files.
InnoDB: If you tried to add new data files, and it failed here,
InnoDB: you should now edit innodb_data_file_path in my.cnf back
InnoDB: to what it was, and remove the new ibdata files InnoDB created
InnoDB: in this failed attempt. InnoDB only wrote those files full of
InnoDB: zeros, but did not yet use them in any way. But be careful: do not
InnoDB: remove old data files which contain your precious data!
050322  0:47:28 [ERROR] Can't init databases
050322  0:47:28 [ERROR] Aborting

InnoDB locks datafile (ibdata1) when it starts (using fcntl()), and does not explicitely unlock it, which means it is unlocked when the mysqld process terminates, i.e. after the PID file is removed.
So if you send mysqld a shutdown request (kill -TERM or mysqladmin shutdown), wait for PID file to be gone, and restart mysqld immediately, there is a chance that the old mysqld is somewhere between removal of PID and disappearance, so this old mysqld would still have the lock, and the new mysqld would fail to lock. The machine which exhibits the problem has 2 CPUs and is Linux.
A solution is to modify innobase_end() to explicitely unlock the datafile before returning. The only question is can it safely do it, as there could be (up to the InnoDB masters to say) InnoDB threads which may not be dead yet when innobase_end() returns, so innobase_end() may not allow itself to unlock the datafile (maybe an InnoDB thread may want to write one last thing to it?).

How to repeat:
race condition, but quite frequent (once per week) on autobuild.
[4 Apr 2005 19:43] Heikki Tuuri
Marko has a patch pending for this.
[5 Apr 2005 13:47] Marko Mäkelä
Heikki has approved my patch, and I'll push it soon to 4.1 and 5.0.
This looks like a Linux bug, because InnoDB 4.1 and 5.0 do close all data files in logs_empty_and_mark_files_at_shutdown(). Maybe Linux does not tell the NFS server to release the locks immediately when the files are closed? (I saw that you might have had the data files on NFS, which is generally a bad idea.)
[5 Apr 2005 13:58] Guilhem Bichot
Thanks for working on it!
Yes I was on NFS, but I don't know if the nightly autobuild is too.
You say InnoDB "do close all data files" but close does not mean fnctl() (to unlock), does it?
[6 Apr 2005 12:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/23714
[6 Apr 2005 13:02] Marko Mäkelä
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Guilhem,
according to the fcntl() documentation, the lock should be released when the file is closed, so I didn't want to add fcntl() calls for unlocking. I've now pushed the fix to the 4.1 tree. I don't think it will make it to 5.0.4.
[7 Apr 2005 12:51] Guilhem Bichot
Hello Marko,
yes you are right, if it closes then it unlocks. Still I would like to be sure MySQL/InnoDB does nothing wrong before considering it's a bug in Linux. Please, in which line is "ibdata1" closed? Thanks!
[7 Apr 2005 12:59] Marko Mäkelä
Guilhem,
It's the fil_close_all_files() call in log0log.c, logs_empty_and_mark_files_at_shutdown(). The files will be left open in innobase_very_fast_shutdown. When debugging this, I added debug printout to os_file_open() and os_file_close().
[7 Apr 2005 20:32] Guilhem Bichot
Right, I verified with a test program that as soon as InnoDB closes ibdata1 (in the function you showed), the test program could fcntl the file. I will get info on what type of partition is the one used on the failing machine (does not look like NFS, though, from what 'mount' says). Thanks for the bugfix!
[8 Apr 2005 12:42] Guilhem Bichot
Confirmation: the partition where the problem happens some nights, is a local RAID array, not NFS.
[8 Apr 2005 14:00] Marko Mäkelä
Guilhem,
I think I may have got the error once when I was testing the fix on hundin. It appears to be a PCI-based SCSI RAID array (73 GB), a single reiserfs partition.