MySQL Bugs: #16827: Better InnoDB error message if ibdata files omitted from my.cnf

Bug #16827	Better InnoDB error message if ibdata files omitted from my.cnf
Submitted:	27 Jan 2006 5:53	Modified:	19 Jun 2010 17:44
Reporter:	MICHAEL LOFTIS
Status:	Closed
Category:	MySQL Server: InnoDB storage engine	Severity:	S2 (Serious)
Version:	4.1.11	OS:	Linux (Debian 3.0 Woody)
Assigned to:		CPU Architecture:	Any

Description:
Startup, regardless of the innodb_force_recovery value, results in a stack trace.  The resolved trace follows.

root@db0:/var/lib/mysql# resolve_stack_dump -s /tmp/mysqld.sym -n /tmp/mysqld.stack 
InnoDB: Resetting space id's in the doublewrite buffer
InnoDB: Error: trying to access page number 26141 in space 0,
InnoDB: space name ./ibdata1,
InnoDB: which is outside the tablespace bounds.
InnoDB: Byte offset 0, len 16384, i/o type 10
InnoDB: Assertion failure in thread 1024 in file fil0fil.c line 3813
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
InnoDB: about forcing recovery.
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=0
read_buffer_size=131072
max_used_connections=0
max_connections=100
threads_connected=0
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 217599 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xbfffcc38, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x8152451 handle_segfault + 425
0x4004af54 _end + 932968900
0x83885f9 fil_io + 721
0x8358397 buf_read_page_low + 487
0x83592ea buf_read_page + 70
0x8345c03 buf_page_get_gen + 987
0x8340160 trx_undo_mem_create_at_db_start + 200
0x8340bce trx_undo_lists_init + 274
0x833491c trx_rseg_mem_create + 588
0x8334d0e trx_rseg_list_and_array_init + 142
0x833676d trx_sys_init_at_db_start + 837
0x829777e innobase_start_or_create_for_mysql + 2710
0x81d5694 innobase_init__Fv + 1108
0x81c8859 ha_init__Fv + 65
0x81535bd init_server_components__Fv + 1305
0x8153a4c main + 728
0x4011a14f _end + 933817279
0x80f31a1 _start + 33
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do 
resolve it
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.
root@db0:/var/lib/mysql# 

How to repeat:
InnoDB was shut down cleanly under 4.0.24 and then upgraded to 4.1.11, but 4.1.11 refuses to start.  Part of the cause is that the package I used overwrote my my.cnf file, removing our InnoDB configuration.  InnoDB handles this in a REALLY bad way, refusing to start with the cryptic error message above.

Suggested fix:
I'd like to see InnoDB handle this better: print a less cryptic error message, or work out that extents are 'MIA' and suggest that as the cause, or ask whether to continue without them if that is possible (data loss guaranteed, obviously).

Michael,

the error means serious corruption of the InnoDB tablespace. It is trying to initialize the undo log lists at the server startup, but a pointer points outside of the tablespace.
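
To make the arithmetic behind the message concrete, here is a minimal sketch of such a bounds check in Python. It is not InnoDB's code: `page_in_bounds` and the 10 MiB file size are hypothetical, chosen only for illustration. The page number 26141 and the 16384-byte page length are taken from the log above.

```python
PAGE_SIZE = 16384  # bytes; the "len 16384" reported in the log above


def page_in_bounds(page_no, file_sizes_bytes, page_size=PAGE_SIZE):
    """Return True if page_no lies inside the concatenated data files."""
    total_pages = sum(size // page_size for size in file_sizes_bytes)
    return 0 <= page_no < total_pages


# Hypothetical layout: only a 10 MiB ibdata1 is known to the server.
configured = [10 * 1024 * 1024]

requested_page = 26141  # page number from the error message
offset_mib = requested_page * PAGE_SIZE / (1024 * 1024)

print(f"page {requested_page} starts at {offset_mib:.2f} MiB")
print("in bounds:", page_in_bounds(requested_page, configured))
```

With 64 pages per MiB, page 26141 starts roughly 408 MiB into the tablespace, so any set of configured data files smaller than that would put the request out of bounds.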

Please post the COMPLETE UNEDITED .err log from the server, both before and after the upgrade. Please explain in detail how you performed the upgrade.

I have never before seen this error in an upgrade.

Regards,

Heikki

That's it, that's the whole error log.  What happened was that for some reason the particular package I was using silently overwrote the my.cnf file, which then no longer had the complete list of data files the server was supposed to use.  When the server started up it was totally unable to detect this condition, and instead of giving an error message that might help one quickly diagnose the issue, it gave that rather worrying and unhelpful message.

I'll be contacting the package maintainers once I figure out why it overwrote the my.cnf file. That should never happen in Debian, at least not without leaving a .dpkg-old file, which it didn't do; that tells me the package is broken.

The snippet I gave is *all* that it logs.  I can try to go back to before the upgrade but the shutdown was error free and clean, and the issue really is that when an ibdata area was removed accidentally from the configuration, innodb essentially stopped the whole thing from coming up with the given traceback and messages.

Michael,

ok, the problem was that the new my.cnf was unaware of ibdata2 etc. that you had.

Hmm... the solution to situations like this would be to keep information of all ibdata files and ib_logfiles in ibdata1. Then InnoDB would detect that my.cnf is wrong. But a problem is that people can edit my.cnf and move ibdata files to different directories, split ibdata files etc., and that will not get reflected in ibdata1!

But in this particular case we COULD detect that the combined size of ibdata files that are specified in my.cnf is smaller than the tablespace size that InnoDB has stored internally. That is always a very serious error. I will look if we can easily notice this and print a warning to the .err log.
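
A rough sketch of the comparison described above, in Python. This is an illustration only, not the server's implementation: `configured_pages` and `check_tablespace` are hypothetical names, a 16 KiB page size is assumed, and the sketch sums the sizes declared in `innodb_data_file_path`, whereas a real check would have to look at the actual file sizes on disk (an autoextend file may have grown past its declared size).

```python
import re

PAGE_SIZE = 16384  # bytes per page, assuming a 16 KiB page size
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def configured_pages(data_file_path, page_size=PAGE_SIZE):
    """Sum the declared sizes in an innodb_data_file_path value, in pages."""
    total = 0
    for spec in data_file_path.split(";"):
        spec = spec.strip()
        if not spec:
            continue
        # Each entry looks like name:size[:autoextend[:max:size]]
        fields = spec.split(":")
        if len(fields) < 2:
            raise ValueError(f"missing size in {spec!r}")
        match = re.fullmatch(r"(\d+)([KMG]?)", fields[1].upper())
        if match is None:
            raise ValueError(f"cannot parse size in {spec!r}")
        number, unit = match.groups()
        total += int(number) * UNITS.get(unit, 1) // page_size
    return total


def check_tablespace(header_pages, data_file_path):
    """Return an error string if the configured files are too small, else None."""
    have = configured_pages(data_file_path)
    if have < header_pages:
        return (f"tablespace size stored in header is {header_pages} pages, "
                f"but the sum of data file sizes is {have} pages")
    return None


# Hypothetical values: the header records 1920 pages (30 MiB).
print(check_tablespace(1920, "ibdata1:10M:autoextend"))
print(check_tablespace(1920, "ibdata1:10M;ibdata2:20M:autoextend"))
```

The first call reports a mismatch because a single 10M file covers only 640 pages; the second returns None because the two files together cover all 1920 pages.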

Thank you,

Heikki

Michael,

I tested 4.1.16. It does detect if the combined size of ibdata files is smaller than the tablespace size that is stamped in ibdata1. I cannot explain why in your case mysqld did start up. Do you have any theory?

Regards,

Heikki
heikki@127:~/mysql-4.1.16/sql$ ./mysqld
InnoDB: Error: tablespace size stored in header is 1920 pages, but
InnoDB: the sum of data file sizes is 640 pages
InnoDB: Cannot start InnoDB. The tail of the system tablespace is
InnoDB: missing. Have you edited innodb_data_file_path in my.cnf in an
InnoDB: inappropriate way, removing ibdata files from there?
InnoDB: You can set innodb_force_recovery=1 in my.cnf to force
InnoDB: a startup if you are trying to recover a badly corrupt database.
060206 19:38:56 [ERROR] Can't init databases
060206 19:38:56 [ERROR] Aborting

060206 19:38:56 [Note] ./mysqld: Shutdown complete

Michael,

ok, I found the explanation. InnoDB does the check of the tablespace size only at the end of its startup. In your case mysqld crashed before that.
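
The ordering problem can be pictured with a toy model in Python. Everything here is hypothetical (`start`, `read_page`, the page counts other than 26141, 1920 and 640) and greatly simplified; it only illustrates why a size check placed after the first page reads never gets a chance to run when one of those reads already falls outside the configured files.

```python
class OutOfBounds(Exception):
    """Raised when a page beyond the configured data files is requested."""


def read_page(page_no, configured_pages):
    # Stand-in for a page read that asserts on out-of-bounds access.
    if page_no >= configured_pages:
        raise OutOfBounds(f"page {page_no} is outside the tablespace bounds")


def start(header_pages, configured_pages, undo_page, check_first):
    """Return the message a user would see for a given ordering of checks."""
    def size_check():
        if configured_pages < header_pages:
            return "sum of data file sizes is smaller than the header size"
        return None

    try:
        if check_first:
            msg = size_check()
            if msg:
                return msg
        read_page(undo_page, configured_pages)  # undo log initialisation
        return size_check() or "started"
    except OutOfBounds as exc:
        return f"crash: {exc}"


# Undo page lies in a missing file: the late size check is never reached.
print(start(30000, 640, 26141, check_first=False))
# Undo page lies in the remaining file (assumed): the late check still fires.
print(start(1920, 640, 100, check_first=False))
# Checking first would give the clearer message in both situations.
print(start(30000, 640, 26141, check_first=True))
```

The first call models a startup that dies in the page read, and the second models one that survives long enough to reach the size check.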

Hmm... maybe the easiest thing is to change the error message that InnoDB DID print in your case.

Assigning this to Osku. He should add:

"InnoDB: If you get this error at mysqld startup, please check that your my.cnf\n"
"InnoDB: matches the ibdata files that you have in the MySQL server.\n"

This should be fixed in 5.0 and 5.1.

Thank you,

Heikki

Thanks, and sorry about not responding immediately, but yes, this is exactly what I was hoping for.  We here probably won't get caught by it again, but this will help reduce confusion for others.  I have yet to open a Debian bug ticket to fix the root cause (dpkg being told by install scripts to blow away a .cnf file).

Thanks!

Fixed in InnoDB snapshot368; fixes are in 5.0.20.

Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Documented bugfix in 5.0.20 changelog. Closed.

Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.

Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)

Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.

Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)