Bug #31038 | if the server locks up, mysql appears to write gibberish into the data file | ||
---|---|---|---|
Submitted: | 15 Sep 2007 5:29 | Modified: | 24 Sep 2007 17:44 |
Reporter: | Maurice Volaski | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
Version: | 5.0.48 | OS: | Linux (Gentoo) |
Assigned to: | Heikki Tuuri | CPU Architecture: | Any |
Tags: | database page corruption locking up corrupt doublewrite |
[15 Sep 2007 5:29]
Maurice Volaski
[15 Sep 2007 5:30]
Maurice Volaski
Here are the details of the crash:

070915 1:14:01 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Warning: database page corruption or a failed
InnoDB: file read of page 206.
InnoDB: Trying to recover it from the doublewrite buffer.
InnoDB: Dump of the page:
070915 1:14:02 InnoDB: Page checksum 2267073817, prior-to-4.0.14-form checksum 3028325383
InnoDB: stored checksum 10432729, prior-to-4.0.14-form stored checksum 3028325383
InnoDB: Page lsn 0 514076, low 4 bytes of lsn at page end 514076
InnoDB: Page number (if stored to page already) 206,
InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 0
InnoDB: Page may be an index page where index id is 0 24
InnoDB: Also the page in the doublewrite buffer is corrupt.
InnoDB: Cannot continue operation.
InnoDB: You can try to recover the database with the my.cnf
InnoDB: option:
InnoDB: set-variable=innodb_force_recovery=6
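For anyone triaging a similar log, the two checksum lines in the page dump can be compared mechanically. The following is a minimal sketch in Python, assuming the log uses exactly the wording shown in this report; the function name `checksum_report` and the dictionary keys are hypothetical names chosen for illustration, not part of MySQL or InnoDB.

```python
import re

# Patterns for the two checksum lines InnoDB prints in a page dump.
# The wording is taken from the log excerpt in this report (an assumption
# for any other server version).
CALC_RE = re.compile(
    r"Page checksum (\d+), prior-to-4\.0\.14-form checksum (\d+)"
)
STORED_RE = re.compile(
    r"stored checksum (\d+), prior-to-4\.0\.14-form stored checksum (\d+)"
)


def checksum_report(log_text):
    """Compare calculated and stored checksums found in log_text.

    Returns a dict with two booleans, or None if either line is missing.
    """
    calc = CALC_RE.search(log_text)
    stored = STORED_RE.search(log_text)
    if calc is None or stored is None:
        return None
    new_calc, old_calc = (int(v) for v in calc.groups())
    new_stored, old_stored = (int(v) for v in stored.groups())
    return {
        "new_form_match": new_calc == new_stored,
        "old_form_match": old_calc == old_stored,
    }


# The two relevant lines from the dump in this report.
SAMPLE = (
    "070915 1:14:02 InnoDB: Page checksum 2267073817, "
    "prior-to-4.0.14-form checksum 3028325383\n"
    "InnoDB: stored checksum 10432729, "
    "prior-to-4.0.14-form stored checksum 3028325383\n"
)

if __name__ == "__main__":
    print(checksum_report(SAMPLE))
```

Run against the excerpt above, this reports that the new-form calculated and stored checksums differ (2267073817 versus 10432729) while the old-form values agree. The sketch only compares the numbers printed in the log; it does not read the data file or reimplement InnoDB's checksum.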
[17 Sep 2007 11:27]
Heikki Tuuri
Maurice, what Linux version on which hardware are you using? Do you use NFS or some other exotic file system? Page checksum errors on disk are probably caused by bad hardware or OS bugs.

InnoDB is a transactional database. It should survive an OS crash or a power outage.

Can you attach the entire .err log, gzipped? It is often the first error print which is the most interesting one.

Regards,
Heikki
[17 Sep 2007 12:23]
Heikki Tuuri
Here is a large-scale test from CERN about Linux file corruption: http://fuji.web.cern.ch/fuji/talk/2007/kelemen-2007-C5-Silent_Corruptions.pdf
[17 Sep 2007 12:42]
MySQL Verification Team
Please answer Heikki's question. Thanks in advance.
[17 Sep 2007 16:12]
Maurice Volaski
It is Gentoo 64-bit, with kernel 2.6.22-r3 and then, before the last crash, r6. The filesystem is ext3 and it is running on top of drbd, which is a network RAID-1 kernel module. That was version 8.0.5 during the crashes. I am sending the error log, which goes back a few weeks, so you will see several crashes in there. Many of them happened after a dump was taken (3:10 AM timestamps), and perhaps a few at other times. The last one was after the server locked up.
[17 Sep 2007 16:14]
Maurice Volaski
mysql error log
Attachment: mysqld.err.gz (application/x-gzip, text), 67.30 KiB.
[17 Sep 2007 16:49]
Heikki Tuuri
Maurice, my first guess is to suspect the RAID-1 driver.
[24 Sep 2007 17:41]
Maurice Volaski
This bug can be closed. The general consensus on the mailing lists is that this was due to faulty hardware and I indeed confirmed it was a bad PCI riser card.
[24 Sep 2007 17:44]
MySQL Verification Team
Thank you for the feedback.