MySQL Bugs: #53708: InnoDB "not available", restarted MySQL server and can't get MySQL to start

Bug #53708	InnoDB "not available", restarted MySQL server and can't get MySQL to start
Submitted:	17 May 2010 15:18	Modified:	17 May 2010 19:33
Reporter:	Trevor Watson	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S1 (Critical)
Version:	5.0.22	OS:	Linux
Assigned to:		CPU Architecture:	Any
Tags:	5.0.22, can't start, innodb, MySQL

Description:
Over the weekend something happened to the MySQL Server.  I was attempting to create a new table and it told me that the InnoDB engine was not available (though most of the tables in the current schema are InnoDB).  We restarted the server and MySQL did not come back up.

Attempting to resolve the backtrace (using resolve_stack_dump) gives
------------------------------
0xffffe420 _end + -140409584
0xbfffb4cc _end + -1214163524
0x82fc515 btr_pcur_open_on_user_rec + 101
0x82b2b06 dict_load_indexes + 598
0x82b3bd3 dict_load_sys_table + 99
0x82a27cf dict_boot + 2991
0x82a10c3 innobase_start_or_create_for_mysql + 4627
0x8241a5f _Z13innobase_initv + 831
0x8232e2a _Z7ha_initv + 506
0x81869d5 main + 1653
0x400be250 _end + 934118208
0x80fd2d1 _start + 33
-----------------------------

the file /var/lib/mysql/linux.err contains

-----------------------------

100517 11:27:17  mysqld started
100517 11:27:17  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
100517 11:27:17  InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 3 3720965055.
InnoDB: Doing recovery: scanned up to log sequence number 3 3720965055
InnoDB: Page directory corruption: supremum not pointed to
100517 11:27:17  InnoDB: Page dump in ascii and hex (16384 bytes):

 len 16384; hex <lots of zeroes>

100517 11:27:18  InnoDB: Page checksum 1575996416, prior-to-4.0.14-form checksum 1371122432
InnoDB: stored checksum 0, prior-to-4.0.14-form stored checksum 0
InnoDB: Page lsn 0 0, low 4 bytes of lsn at page end 0
InnoDB: Page number (if stored to page already) 0,
InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 0
InnoDB: Page directory corruption: supremum not pointed to
100517 11:27:18  InnoDB: Page dump in ascii and hex (16384 bytes):

 len 16384; hex <lots of zeroes>

100517 11:27:18  InnoDB: Page checksum 1575996416, prior-to-4.0.14-form checksum 1371122432
InnoDB: stored checksum 0, prior-to-4.0.14-form stored checksum 0
InnoDB: Page lsn 0 0, low 4 bytes of lsn at page end 0
InnoDB: Page number (if stored to page already) 0,
InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 0
100517 11:27:18InnoDB: Error: trying to access a stray pointer 0xc03a7ff8
InnoDB: buf pool start is at 0x40398000, end at 0x40b98000
InnoDB: Probable reason is database corruption or memory
InnoDB: corruption. If this happens in an InnoDB database recovery,
InnoDB: you can look from section 6.1 at http://www.innodb.com/ibman.html
InnoDB: how to force recovery.
100517 11:27:18InnoDB: Assertion failure in thread 1075564672 in file ../include/buf0buf.ic line 259
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
InnoDB: about forcing recovery.
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=0
read_buffer_size=131072
max_used_connections=0
max_connections=100
threads_connected=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 217599 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...

Cannot determine thread, fp=0xbfffa60c, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x81842c0
0xffffe420
0xbfffb4cc
0x82fc515
0x82b2b06
0x82b3bd3
0x82a27cf
0x82a10c3
0x8241a5f
0x8232e2a
0x81869d5
0x400be250
0x80fd2d1
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
The manual page at http://www.mysql.com/doc/en/Crashing.html contains

information that should help you find out what is causing the crash.
100517 11:27:18  mysqld ended
-----------------------------

How to repeat:
This error occurs every time we attempt to start our MySQL server.   The only change we made recently to the database was to add the innodb_file_per_table and re-create the database so we could see the actual size of the data in the table.  However, it was running stably before this crash

[mysqld]
log-bin=mysql-bin.log
server-id=1
binlog-do-db=HomeCoder
innodb_file_per_table

Do you see anything suspicious in the OS logs during the time of initial crash? Your data are corrupted, and most likely reason is hardware problem or memory fault.

The log files for the server don't report anything strange (except for it attempting to mount a floppy disk at one point for some reason, gonna have to figure out why it did that).

However, the drives seem to be stable and no errors are reported in the log files or in the drive tools.  Running fsck on the drives reported no errors.

Is there a way to tell which files have been corrupted?  I have a backup of _most_ of the databases on this server, but don't want to reload the entire MySQL schema collection as some of them have tables with almost 1 million records.

Also, I attempted to recover the InnoDB databases by setting  "innodb_force_recovery = 4" in the my.cnf file, but to no avail.

after deleting (well renaming) ib_logfile0, ib_logfile1 and ibdata1 and restarting MySQL, it looks like everything got rebuilt successfully and the MySQL server is up and running again.   Going to do one more reboot and see if everything stays stable.

Thanks for your help.

Deleting and allowing the ibdata files to be rebuilt seems to have resolved the start-up issue.