MySQL Bugs: #13513: InnoDB crash recovery fails after power failure (on large databases only?)

Bug #13513	InnoDB crash recovery fails after power failure (on large databases only?)
Submitted:	27 Sep 2005 2:37	Modified:	27 Sep 2005 6:52
Reporter:	Ian Kilgore	Email Updates:
Status:	Not a Bug	Impact on me:	None
Category:	MySQL Server	Severity:	S3 (Non-critical)
Version:	4.1.11a-4	OS:	Linux (Debian (sarge))
Assigned to:		CPU Architecture:	Any

Description:
After a power failure, InnoDB's crash recovery consistently fails to get the database is a usable state.  mysqld either hangs, or segfaults. This has been tested using reiserfs, ext3, and a raw device for the tablespace.  Also tried with binlogs on, and off.  mysql 4.0 works as expected (crash recovery runs, database starts up in a usable state after power failure).

Example from syslog follows:

Sep  8 14:01:08 localhost mysqld_safe[2592]: started
Sep  8 14:01:08 localhost mysqld[2595]: 050908 14:01:08  InnoDB:
Database was not shut down normally!
Sep  8 14:01:08 localhost mysqld[2595]: InnoDB: Starting crash recovery.
Sep  8 14:01:08 localhost mysqld[2595]: InnoDB: Reading tablespace
information from the .ibd files...
Sep  8 14:01:08 localhost mysqld[2595]: InnoDB: Restoring possible
half-written data pages from the doublewrite
Sep  8 14:01:08 localhost mysqld[2595]: InnoDB: buffer...
Sep  8 14:01:08 localhost mysqld[2595]: 050908 14:01:08  InnoDB:
Starting log scan based on checkpoint at
Sep  8 14:01:08 localhost mysqld[2595]: InnoDB: log sequence number 0
683773405.
Sep  8 14:01:08 localhost mysqld[2595]: InnoDB: Doing recovery: scanned
up to log sequence number 0 689015808
Sep  8 14:01:08 localhost mysqld[2595]: InnoDB: Doing recovery: scanned
up to log sequence number 0 689688576
Sep  8 14:01:09 localhost mysqld[2595]: InnoDB: 1 transaction(s) which
must be rolled back or cleaned up
Sep  8 14:01:09 localhost mysqld[2595]: InnoDB: in total 3225790 row
operations to undo
Sep  8 14:01:09 localhost mysqld[2595]: InnoDB: Trx id counter is 0 1792
Sep  8 14:01:09 localhost mysqld[2595]: 050908 14:01:09  InnoDB:
Starting an apply batch of log records to the database...
Sep  8 14:01:09 localhost mysqld[2595]: InnoDB: Progress in percents: 0
1 2 3 050908 14:01:09  InnoDB: Error: page 42497 log sequence number 0
690226205
Sep  8 14:01:09 localhost mysqld[2595]: InnoDB: is in the future!
Current system log sequence number 0 689688483.
Sep  8 14:01:09 localhost mysqld[2595]: InnoDB: Your database may be
corrupt.
Sep  8 14:01:09 localhost mysqld[2595]: 050908 14:01:09  InnoDB: Error:
page 42498 log sequence number 0 690709217
Sep  8 14:01:09 localhost mysqld[2595]: InnoDB: is in the future!
Current system log sequence number 0 689688483.
Sep  8 14:01:09 localhost mysqld[2595]: InnoDB: Your database may be
corrupt.
Sep  8 14:01:09 localhost mysqld[2595]: 050908 14:01:09  InnoDB: Error:
page 42499 log sequence number 0 690683405
Sep  8 14:01:09 localhost mysqld[2595]: InnoDB: is in the future!
Current system log sequence number 0 689688483.
Sep  8 14:01:09 localhost mysqld[2595]: InnoDB: Your database may be
corrupt.

[ ... ]

Sep  8 14:01:23 localhost mysqld[2595]: 98 99 
Sep  8 14:01:23 localhost mysqld[2595]: InnoDB: Apply batch completed
Sep  8 14:01:23 localhost mysqld[2595]: InnoDB: Starting rollback of
uncommitted transactions
Sep  8 14:01:23 localhost mysqld[2595]: InnoDB: Rolling back trx with id
0 1393, 3225790 rows to undo
Sep  8 14:01:23 localhost mysqld[2595]: InnoDB: Progress in percents:
1050908 14:01:23  InnoDB: Error: page 15398 log sequence number 0
690059220
Sep  8 14:01:23 localhost mysqld[2595]: InnoDB: is in the future!
Current system log sequence number 0 689735039.
Sep  8 14:01:23 localhost mysqld[2595]: InnoDB: Your database may be
corrupt.
Sep  8 14:01:34 localhost mysqld[2595]:  2mysqld got signal 11;
Sep  8 14:01:34 localhost mysqld[2595]: This could be because you hit a
bug. It is also possible that this binary
Sep  8 14:01:34 localhost mysqld[2595]: or one of the libraries it was
linked against is corrupt, improperly built,
Sep  8 14:01:34 localhost mysqld[2595]: or misconfigured. This error can
also be caused by malfunctioning hardware.
Sep  8 14:01:34 localhost mysqld[2595]: We will try our best to scrape
up some info that will hopefully help diagnose
Sep  8 14:01:34 localhost mysqld[2595]: the problem, but since we have
already crashed, something is definitely wrong
Sep  8 14:01:34 localhost mysqld[2595]: and this may fail.
Sep  8 14:01:34 localhost mysqld[2595]: 
Sep  8 14:01:34 localhost mysqld[2595]: key_buffer_size=0
Sep  8 14:01:34 localhost mysqld[2595]: read_buffer_size=131072
Sep  8 14:01:34 localhost mysqld[2595]: max_used_connections=0
Sep  8 14:01:34 localhost mysqld[2595]: max_connections=100
Sep  8 14:01:34 localhost mysqld[2595]: threads_connected=0
Sep  8 14:01:34 localhost mysqld[2595]: It is possible that mysqld could
use up to 
Sep  8 14:01:34 localhost mysqld[2595]: key_buffer_size +
(read_buffer_size + sort_buffer_size)*max_connections = 217599 K
Sep  8 14:01:34 localhost mysqld[2595]: bytes of memory
Sep  8 14:01:34 localhost mysqld[2595]: Hope that's ok; if not, decrease
some variables in the equation.
Sep  8 14:01:34 localhost mysqld[2595]: 
Sep  8 14:01:34 localhost mysqld[2595]: thd=(nil)
Sep  8 14:01:34 localhost mysqld[2595]: Attempting backtrace. You can
use the following information to find out
Sep  8 14:01:34 localhost mysqld[2595]: where mysqld died. If you see no
messages after this, something went
Sep  8 14:01:34 localhost mysqld[2595]: terribly wrong...
Sep  8 14:01:34 localhost mysqld[2595]: Cannot determine thread,
fp=0xbfffd8ec, backtrace may not be correct.
Sep  8 14:01:34 localhost mysqld[2595]: Stack range sanity check OK,
backtrace follows:
Sep  8 14:01:34 localhost mysqld[2595]: 0x81893bf
Sep  8 14:01:34 localhost mysqld[2595]: 0xffffe420
Sep  8 14:01:34 localhost mysqld[2595]: 0xbfffdda0
Sep  8 14:01:34 localhost mysqld[2595]: 0x83505d9
Sep  8 14:01:34 localhost mysqld[2595]: 0x8363b88
Sep  8 14:01:34 localhost mysqld[2595]: 0x8363fd5
Sep  8 14:01:34 localhost mysqld[2595]: 0x834166a
Sep  8 14:01:34 localhost mysqld[2595]: 0x8340ba4
Sep  8 14:01:34 localhost mysqld[2595]: 0x833b11c
Sep  8 14:01:34 localhost mysqld[2595]: 0x833a9b7
Sep  8 14:01:34 localhost mysqld[2595]: 0x8312001
Sep  8 14:01:34 localhost mysqld[2595]: 0x831125c
Sep  8 14:01:34 localhost mysqld[2595]: 0x837dc39
Sep  8 14:01:34 localhost mysqld[2595]: 0x83ac6e0
Sep  8 14:01:34 localhost mysqld[2595]: 0x82f3156
Sep  8 14:01:34 localhost mysqld[2595]: 0x821bb0e
Sep  8 14:01:34 localhost mysqld[2595]: 0x820cee5
Sep  8 14:01:34 localhost mysqld[2595]: 0x818cc15
Sep  8 14:01:34 localhost mysqld[2595]: 0x8189ab5
Sep  8 14:01:34 localhost mysqld[2595]: 0x48115970
Sep  8 14:01:34 localhost mysqld[2595]: 0x810cef1

Here is the my.cnf:

[client]
port            = 3306
socket          = /var/run/mysqld/mysqld.sock

[mysqld_safe]
socket          = /var/run/mysqld/mysqld.sock
nice            = 0

[mysqld]
#
# * Basic Settings
#
user            = mysql
pid-file        = /var/run/mysqld/mysqld.pid
socket          = /var/run/mysqld/mysqld.sock
port            = 3306
basedir         = /usr
datadir         = /var/lib/mysql
tmpdir          = /tmp
language        = /usr/share/mysql/english
skip-external-locking

bind-address            = 127.0.0.1
#
# * Fine Tuning
#
key_buffer              = 16M
max_allowed_packet      = 16M

thread_stack            = 128K
#
# * Query Cache Configuration
#
query_cache_limit       = 1048576
query_cache_size        = 16777216
query_cache_type        = 1

log-bin                 = /var/log/mysql/mysql-bin.log

max_binlog_size         = 104857600

skip-bdb

[mysqldump]
quick
quote-names
max_allowed_packet      = 16M

[mysql]
#no-auto-rehash # faster start of mysql but no tab completition

[isamchk]
key_buffer              = 16M

How to repeat:
Start a large import, and pull the plug.  Boot the machine, mysql will take a long time trying to start up, then fail.

Suggested fix:
Mysql 4.0 seems to handle this correctly

Hi!

The printout below shows that fsync() did not flush the InnoDB log file to a durable media like it should.

1 2 3 050908 14:01:09 InnoDB: Error: page 42497 log sequence number 0
690226205
Sep 8 14:01:09 localhost mysqld[2595]: InnoDB: is in the future!
Current system log sequence number 0 689688483.
Sep 8 14:01:09 localhost mysqld[2595]: InnoDB: Your database may be
corrupt.

http://dev.mysql.com/doc/mysql/en/innodb-configuration.html
"
Note: InnoDB provides MySQL with a transaction-safe (ACID compliant) storage engine with commit, rollback, and crash recovery capabilities. It cannot do so if the underlying operating system and hardware does not work as advertised. Many operating systems or disk subsystems may delay or reorder write operations in order to improve performance. On some operating systems, the very system call (fsync()) that should wait until all unwritten data for a file has been flushed may actually return before the data has been flushed to stable storage. Because of this, an operating system crash or a power outage may destroy recently committed data, or in the worst case, even corrupt the database because of write operations having been reordered. If data integrity is important to you, you should perform some “pull-the-plug” tests before using anything in production. On Mac OS X 10.3 and later, InnoDB uses a special fcntl() file flush method. Under Linux, it is advisable to disable the write-back cache.

On ATAPI hard disks, a command like hdparm -W0 /dev/hda may work. Beware that some drives or disk controllers may be unable to disable the write-back cache.
"

The fact that crash recovery in 4.0 seems to work may be due to some random effect. In general, crash recovery should not work in any InnoDB version if fsync() is not reliable in the system.

Regards,

Heikki