Bug #18410 | InnoDB: Assertion failure in thread 1517779888 in file btr0cur.c line 3568 | ||
---|---|---|---|
Submitted: | 22 Mar 2006 2:39 | Modified: | 29 Apr 2006 11:43 |
Reporter: | K Hopkins | Status: | No Feedback |
Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical) |
Version: | 4.1.18 | OS: | Linux (Linux) |
Assigned to: | Assigned Account | CPU Architecture: | Any |
[22 Mar 2006 2:39]
K Hopkins
[22 Mar 2006 7:35]
Heikki Tuuri
Hi! What migration did you attempt? How did you copy the files to another computer? What is your Linux distro?

It looks like a portion of the data file has been written at a file address 48 kB too low. My first guess is an OS bug or a hardware failure.

Regards, Heikki
[22 Mar 2006 12:16]
K Hopkins
Hi Heikki! Thanks for the feedback.

> What migration did you attempt? How did you copy the files to another computer?

I'm migrating a dbmail 1.x database to the 2.0 format, both on InnoDB. It's done with a SQL script, which alters tables and initializes new fields. The source table, messageblks, has a length of 4169940992 and is by far the largest table in the database.

To copy the files, I shut down MySQL, tar -cz the entire mysql directory, and scp it over to an HP workstation (which is bigger and faster and has more free disk than the laptop that the db normally runs on), where I untar it into an empty /var/lib/mysql directory. The tarred file is about 3.5GB. /etc/my.cnf has been updated on the HP workstation for the new database. I then start MySQL. v4.1.18 is installed from the same rpms on both hosts. Only one instance of MySQL is running on either host.

> What is your Linux distro?

SuSE Pro Linux 9.3, with up-to-date patches.

> Looks like a portion of the data file has been written to a file address 48 kB too small.

Does that mean the database is corrupt before I start? Would mysqlcheck catch this? What can?

> My first guess is an OS bug or hardware failure.

I can run memtest over the weekend (as it does double duty as a production server too), but the HP workstation is working perfectly otherwise. Is there any benefit in running the migration several times to see if it aborts in the same place each time? I'd expect hardware failures to occur in random parts of the code.

It is sitting on reiserfs over lvm over md/raid1 over PATA disks. I've often used reiserfs/lvm/md/SuSE9.3 elsewhere without problems (but only with lightweight MySQL databases).

Are any alarm bells ringing from what I've described?

Regards, Keith
[22 Mar 2006 12:42]
Heikki Tuuri
Keith,

the interesting question is whether the file corruption happens in the file copying phase, or whether it is the original ibdata files that are corrupt.

InnoDB's ibdata files form one logical tablespace. You seem to have corruption in a 16 kB page starting at offset 8,150,302,720 bytes from the start of the tablespace, and in a few subsequent pages. Please use dd or some similar tool to copy a few hundred bytes from that offset on in the original database, so that I can check if the page number is right in the original database. You can email the bytes to heikki.tuuri@oracle.com.

No bug has ever been found in InnoDB that could explain the file corruption cases that have been reported from Linux in the past 5 years. I recall in one case, a page had been shifted 4 kB or 8 kB from the right position.

What are your LVM configuration parameters? Can a 48 kB shift somehow be connected to the parameter values? You should test the disk subsystem of your computer.

Regards, Heikki

/* The byte offsets on a file page for various variables */
#define FIL_PAGE_SPACE_OR_CHKSUM 0 /* in < MySQL-4.0.14 space id the page belongs to (== 0) but in later versions the 'new' checksum of the page */
#define FIL_PAGE_OFFSET 4 /* page offset inside space */
#define FIL_PAGE_PREV 8 /* if there is a 'natural' predecessor of the page, its offset */
#define FIL_PAGE_NEXT 12 /* if there is a 'natural' successor of the page, its offset */
#define FIL_PAGE_LSN 16 /* lsn of the end of the newest modification log record to the page */
#define FIL_PAGE_TYPE 24 /* file page type: FIL_PAGE_INDEX,..., 2 bytes */
#define FIL_PAGE_FILE_FLUSH_LSN 26 /* this is only defined for the first page in a data file: the file has been flushed to disk at least up to this lsn */
#define FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID 34 /* starting from 4.1.x this contains the space id of the page */
#define FIL_PAGE_DATA 38 /* start of the data on the page */
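To illustrate the check requested above, here is a minimal sketch in C of how the page number stored at FIL_PAGE_OFFSET could be compared with the page number implied by a byte offset. It is not InnoDB code. The helper names (read_be32, expected_page_no, page_displacement_bytes) and the SKETCH_PAGE_SIZE constant are hypothetical, and the sketch assumes 16 kB pages, a page-aligned offset, and the 4-byte field being stored most significant byte first. With 16 kB pages, offset 8,150,302,720 corresponds to page number 497455 (8,150,302,720 / 16,384). Because the offset is counted from the start of the logical tablespace, with several ibdata files the sizes of the preceding files would have to be subtracted before seeking into one particular file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 16 kB pages, as stated in the comment above. */
#define SKETCH_PAGE_SIZE 16384u
/* Taken from the offset list above: page offset inside space. */
#define FIL_PAGE_OFFSET 4

/* Read a 4-byte unsigned integer stored most significant byte first. */
static uint32_t read_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Page number implied by a byte offset from the start of the tablespace. */
static uint32_t expected_page_no(uint64_t byte_offset)
{
    return (uint32_t)(byte_offset / SKETCH_PAGE_SIZE);
}

/*
 * Signed distance, in bytes, between where the page was found and where
 * its stored page number says it belongs. Zero means the page is in the
 * right place; a negative value means it sits at too low an address.
 */
static int64_t page_displacement_bytes(const unsigned char *page,
                                       uint64_t byte_offset)
{
    int64_t stored = (int64_t)read_be32(page + FIL_PAGE_OFFSET);
    int64_t expected = (int64_t)expected_page_no(byte_offset);
    return (expected - stored) * (int64_t)SKETCH_PAGE_SIZE;
}
```

Under these assumptions, a page read at offset 8,150,302,720 whose header contains page number 497458 would give a displacement of -49152 bytes, that is, a page written 48 kB (three pages) below its proper position.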
[29 Apr 2006 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".