Bug #18886 InnoDB: Database page corruption on disk or a failed file read of page
Submitted: 7 Apr 2006 14:30 Modified: 5 Jul 2006 14:44
Reporter: Philip Molter Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S1 (Critical)
Version:4.1.16 OS:Linux (Linux (Fedora Core 4, x86_64))
Assigned to: Marko Mäkelä CPU Architecture:Any
Tags: 3ware

[7 Apr 2006 14:30] Philip Molter
Description:
I have a 3.23.58 system replicating to two 4.1.16 systems (this 4.1.16 system contains one patch, the one for the slow update performance bug fixed in 4.1.18).  Both of the slave systems received an update command last night which triggered both of them to crash with a page corruption error in InnoDB.  I will attach logs for each system.  You'll see that the corruption for both databases is identical.  That master system also slaves to another 3.23.58 system.  The older slave did not have any errors.  Dumping the data with select * into outfile recreating the table, and reloading the data allowed the slaving to continue without issue.  The data for both of these systems was created by doing a mysqldump of the master, then importing that dump via the source command (this allowed us to go from a shared-tablespace setup on the 3.23 master to a per-file tablespace setup on the two new slaves).

Both affected slave systems are dual Opteron 265 machines with 16GB of RAM running Fedora Core 4, kernel 2.6.16-1.2069_FC4 (latest release) built for x86_64.  They have 3ware 9500S 12-drive controllers using Linux software RAID (raid0 of raid1s for the database filesystem).  We have plenty of experience with this hardware setup and have never had hardware-related corruption with it.  The operating system is pretty verbose about reporting possible issues, both with memory errors and disk issues.  Given that it happened identically to two machines, I do not believe the issue is related to hardware.

Here is the relevant information from their MySQL configs:

[mysqld]
set-variable = key_buffer=128M
set-variable = table_cache=800
set-variable = sort_buffer=64M
set-variable = myisam_sort_buffer_size=128M
set-variable = long_query_time=3
set-variable = max_binlog_size=128M
set-variable = max_connections=500
set-variable = thread_cache_size=8
set-variable = thread_concurrency=8

delay-key-write-for-all-tables

default-table-type=InnoDB
innodb_data_file_path = ibdata1:2048M;ibdata2:2048M
innodb_file_per_table

set-variable = innodb_mirrored_log_groups=1
set-variable = innodb_log_files_in_group=4
set-variable = innodb_log_file_size=256M
set-variable = innodb_log_buffer_size=32M

innodb_flush_log_at_trx_commit=0
innodb_log_archive=0

set-variable = innodb_buffer_pool_size=1G
set-variable = innodb_additional_mem_pool_size=128M
set-variable = innodb_file_io_threads=4
set-variable = innodb_lock_wait_timeout=60

How to repeat:
Cannot reproduce, but did happen on two machines.

Suggested fix:
A corrupt page certainly shouldn't bring down the entire database process, especially since in a replicated environment, that just leads to repeated crashes as the system comes back up, tries to run the command that caused the problem, then immediately crashes again.
[7 Apr 2006 14:31] Philip Molter
Error log, server #1

Attachment: mysqld.gn.log.1 (application/octet-stream, text), 49.39 KiB.

[7 Apr 2006 14:31] Philip Molter
Error log, server #2

Attachment: mysqld.gn.log.2 (application/octet-stream, text), 49.39 KiB.

[7 Apr 2006 20:59] Heikki Tuuri
Philip,

please post the COMPLETE .err log from both slaves.

This bug report is remarkable, since this shows InnoDB probably corrupts a page just before writing it to the data file, or InnoDB corrupts it just after reading the page in. Or there is an extraordinarily deterministic OS/driver bug that causes corruption.

The lsn that is stamped to the start of the page is slightly higher than the lsn at the end.

InnoDB: Page lsn 9 2872311775, low 4 bytes of lsn at page end 2872310629

InnoDB checks that the lsn at the start and the end match just before writing to the file.

Regards,

Heikki

os0file.c:
        /* Do the i/o with ordinary, synchronous i/o functions: */
        if (slot->type == OS_FILE_WRITE) {
                if (array == os_aio_write_array) {
                        if ((total_len % UNIV_PAGE_SIZE != 0)
                            || (slot->offset % UNIV_PAGE_SIZE != 0)) {
                                fprintf(stderr,
"InnoDB: Error: trying a displaced write to %s %lu %lu, len %lu\n",
                                        slot->name, (ulong) slot->offset_high,
                                        (ulong) slot->offset,
                                        (ulong) total_len);
                                ut_error;
                        }

                        /* Do a 'last millisecond' check that the page end
                        is sensible; reported page checksum errors from
                        Linux seem to wipe over the page end */

                        for (len2 = 0; len2 + UNIV_PAGE_SIZE <= total_len;
                                                len2 += UNIV_PAGE_SIZE) {
                                if (mach_read_from_4(combined_buf + len2
                                                + FIL_PAGE_LSN + 4)
                                    != mach_read_from_4(combined_buf + len2
                                                + UNIV_PAGE_SIZE
                                        - FIL_PAGE_END_LSN_OLD_CHKSUM + 4)) {
                                        ut_print_timestamp(stderr);
                                        fprintf(stderr,
"  InnoDB: ERROR: The page to be written seems corrupt!\n");
                                        fprintf(stderr,
"InnoDB: Writing a block of %lu bytes, currently writing at offset %lu\n",
                                        (ulong)total_len, (ulong)len2);
                                        buf_page_print(combined_buf + len2);
                                        fprintf(stderr,
"InnoDB: ERROR: The page to be written seems corrupt!\n");
                                }
                        }
                }

                ret = os_file_write(slot->name, slot->file, combined_buf,
                                slot->offset, slot->offset_high, total_len);
[7 Apr 2006 21:34] Philip Molter
I can post more (I have 100MB of log file), but it's all just the same stuff I've posted as the server would get restarted by mysqld_safe, start up replication, read in the replicated command, and then crash with the same error.

Is there something specific in the log you're looking for?  I chopped out anything before the error (startup stuff from the last server startup) and anything after the error (Number of processes running now: 0, mysqld restarted, etc.).  I did verify that the error messages that were repeating were the same as that which I already posted.

I do want to correct something.  I said the code that caused the error was an UPDATE command.  That's false.  The error was a DELETE command:

060406 19:21:22 [ERROR] Slave: Error 'Got error -1 from storage engine' on query
. Default database: 'gn'. Query: 'DELETE FROM service_attr WHERE sid='146764' AN
D name='hash_password'', Error_code: 1030.

That line was pulled from a later log that was generated when the server was in recovery mode, hence the '-1 from storage engine' error (this was during recovery).
[7 Apr 2006 22:06] Heikki Tuuri
Philip,

ok, then  post all the .err log before the first error. I am interested in seeing how the InnoDB tablespace was created etc.

Regards,

Heikki
[8 Apr 2006 2:57] Philip Molter
Ahh, okay.  I have that information for only one of the servers.  The other server has since rotated it out.  I'll attach the log from the initial setup through the crash past the restart and up right until the point of the second crash.
[8 Apr 2006 2:58] Philip Molter
Error log, server #1 (full)

Attachment: mysqld.log (application/octet-stream, text), 63.22 KiB.

[11 Apr 2006 0:20] Heikki Tuuri
Philip,

the .err log does not show anything out of the ordinary.

This bug is really strange:

1) it is highly unlikely that any InnoDB memory corruption bug could wipe over the lsn at the start and the end of the page just before it gets written to the data file; there have never been any failure reports of the lsn check just before the write;

2) since the page read repeatedly gave the same error message, it is also unlikely that any InnoDB bug would so deterministically corrupt a page that is read in from a data file;

3) an OS bug / driver bug is unlikely to be so deterministic that it hits two slaves in the exact same way!

The execution in your slaves has been rather deterministic, as the page lsn is the same in both slaves.

Hmm... I could add to 5.0 an lsn check also AFTER a buffer has been written to the data file. If that check passes ok, then it is almost certain that this is an OS / driver bug.

Regards,

Heikki
[3 May 2006 12:27] Heikki Tuuri
Marko,

please add the 'last millisecond check' also AFTER a file write. This should be done in 4.1, 5.0, and 5.1. Simply put this code in os0file.c:

                if (array == os_aio_write_array) {
                        /* Do a 'last millisecond' check that the page end
                        is sensible; reported page checksum errors from
                        Linux seem to wipe over the page end */

                        for (len2 = 0; len2 + UNIV_PAGE_SIZE <= total_len;
                                                len2 += UNIV_PAGE_SIZE) {
                                if (mach_read_from_4(combined_buf + len2
                                                + FIL_PAGE_LSN + 4)
                                    != mach_read_from_4(combined_buf + len2
                                                + UNIV_PAGE_SIZE
                                        - FIL_PAGE_END_LSN_OLD_CHKSUM + 4)) {
                                        ut_print_timestamp(stderr);
                                        fprintf(stderr,
"  InnoDB: ERROR: The page to be written seems corrupt!\n");
                                        fprintf(stderr,
"InnoDB: Writing a block of %lu bytes, currently writing at offset %lu\n",
                                        (ulong)total_len, (ulong)len2);
                                        buf_page_print(combined_buf + len2);
                                        fprintf(stderr,
"InnoDB: ERROR: The page to be written seems corrupt!\n");
                                }
                        }
                }

also after the call of:

                ret = os_file_write(slot->name, slot->file, combined_buf,
                                slot->offset, slot->offset_high, total_len);
Regards,

Heikki
[3 May 2006 12:29] Heikki Tuuri
Of course, the patch does not fix any bug. But it can prove that the problem really is in the OS or hardware.
[24 May 2006 12:32] Jeetendra Mirchandani
Yesterday night 4 of my mysql servers crashed with the same error. Even I use mysql replication, and all my slaves crashed. But interestingly, the first slave (a single server) crashed before I pushed any changes to the master. 

Once I pushed the changes to master, all 4 slaves crashed(It was an UPDATE query). So I dont yet beleive that it was due to a master slave config. I am running in this config since more that 4 months, but this kind of crash has happened for the first time!

I use mysql 4.1.10a
My INNODB config:
 innodb_additional_mem_pool_size : 20971520
 innodb_autoextend_increment : 8
 innodb_buffer_pool_awe_mem_mb : 0
 innodb_buffer_pool_size : 1073741824
 innodb_data_file_path : ibdata1:10M:autoextend
 innodb_data_home_dir :
 innodb_fast_shutdown : ON
 innodb_file_io_threads : 4
 innodb_file_per_table : ON
 innodb_flush_log_at_trx_commit : 0
 innodb_flush_method :
 innodb_force_recovery : 0
 innodb_lock_wait_timeout : 50
 innodb_locks_unsafe_for_binlog : OFF
 innodb_log_arch_dir :
 innodb_log_archive : OFF
 innodb_log_buffer_size : 8388608
 innodb_log_file_size : 5242880
 innodb_log_files_in_group : 2
 innodb_log_group_home_dir : ./
 innodb_max_dirty_pages_pct : 90
 innodb_max_purge_lag : 0
 innodb_mirrored_log_groups : 1
 innodb_open_files : 300
 innodb_table_locks : ON
 innodb_thread_concurrency : 20

Any clues to this problem?
Will upgrading to 4.1.19 solve my problem?

Interestingly again, I am running on AMD Opetron, with 8GB RAM on each.
(Dual Proc AMD Opteron(tm) Processor 250)
All posts I have seen about susch a crash, involve AMD Opetron! Does this suggest something?

Thanks,
Jeetu
[24 May 2006 12:58] Jeetendra Mirchandani
Forgot to mention:
I am running RHEL3 : 2.4.21-31.ELsmp
Will attach logs from one server.
[24 May 2006 13:01] Jeetendra Mirchandani
Not allowed to attach files :(
If required, give me an email address, i'll send you gzipped log.
Jeetu
(jeetum at  gmail dot com)
[20 Jun 2006 13:53] Heikki Tuuri
Jeetendra,

I assume that you did not get corruption in the slaves and it was just an ordinary crash? Then it is probably some deterministic bug in MySQL. An upgrade to 4.1.xx may well solve the problem.

Regards,

Heikki
[20 Jun 2006 14:42] Jeetendra Mirchandani
Yes, it was on a slave. But later since then, I have had crashes on individual boxes (all slaves), in different databases, even the ones that are not replicated. 
I am planning to upgrade to 4.1.20 soon (I am currently using 4.1.10a) . Will keep you all posted if I see the same page corruption problem after upgrading.

Jeetu
[27 Jun 2006 18:50] Philip Molter
Heikki,

Thank you very much for taking the time to look at this.  After some very thorough debugging on our end, it appears as though the corruption was caused by a bug in the firmware on our 3ware card.  I was able to get the corruption to be fairly easily reproduced on our systems, and once I upgraded to a newer firmware release, 2.06.00.009, the problem hasn't shown up again (3 days and counting, despite my best efforts to cause it).

I think the consistency from the initial bug report was a coincidence.  Subsequent tests showed a couple more consistent failures, but also many inconsistent failures, and doing tests with other, non MySQL software and the older firmware release, we were also able to produce similar forms of corruption.
[5 Jul 2006 4:11] Elliot Murphy
Can this safely be closed as 'Not a bug'?
[5 Jul 2006 12:26] Philip Molter
From my standpoint, the bug is fixed.  I'm not sure about the others that have attached messages to this bug.
[5 Jul 2006 12:29] Philip Molter
I should say, 'Not a bug', not 'fixed'.
[5 Jul 2006 12:33] Jeetendra Mirchandani
My problem got solved by upgrading the kernel on my RHEL 3 system.
[5 Jul 2006 14:44] Elliot Murphy
Thanks for the confirmation.
[19 Sep 2006 19:32] Jeetendra Mirchandani
I have documented my side of the problem here:
http://www.jeetu.isa-geek.com/rhel3mysqlbug

Strange that a combination of some hardware and some software version puts you in a really bad situation.

Thanks everyone who contributed to this bug (or whatever it was :-) )
[5 May 2010 15:19] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 5:45] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:15] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:43] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 11:45] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:23] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:10] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)