MySQL Bugs: #10447: Corrupted table space after 600 sec semaphore wait

Bug #10447	Corrupted table space after 600 sec semaphore wait
Submitted:	8 May 2005 13:32	Modified:	25 May 2005 19:55
Reporter:	Johann-Peter Hartmann
Status:	Can't repeat
Category:	MySQL Server: InnoDB storage engine	Severity:	S1 (Critical)
Version:	4.0.21	OS:	Linux (Redhat Enterprise ES 3 (Taroon))
Assigned to:	Heikki Tuuri	CPU Architecture:	Any

Description:
Hi all, 

We had an intentional crash and restart that resulted in a corrupted InnoDB tablespace.

First, InnoDB intentionally crashed the server because a semaphore wait had lasted more than 10 minutes: 

InnoDB: We intentionally crash the server, because it appears to be hung.
050504 10:34:15InnoDB: Assertion failure in thread 1932823472 in file
sync0arr.c line 926
InnoDB: We intentionally generate a memory trap.

The restart that followed produced a lot of InnoDB log errors: 

InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 33 965915984
InnoDB: Doing recovery: scanned up to log sequence number 33 965915648
050504 10:34:19  InnoDB: ERROR: We were only able to scan the log up to
InnoDB: 33 965915648, but a checkpoint was at 33 965915984.
InnoDB: It is possible that the database is now corrupt!
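
To put the two log sequence numbers above side by side, here is a minimal sketch of the arithmetic. It assumes that this InnoDB version prints an LSN as two unsigned 32-bit words ("high low") and that the redo log is organised in 512-byte blocks. The helper name lsn_to_int is hypothetical and is not part of MySQL.

```python
# Assumption: an LSN is printed as "high low", two unsigned 32-bit words.
# lsn_to_int is a hypothetical helper for illustration only.
def lsn_to_int(text):
    """Convert an LSN printed as 'high low' into a single integer."""
    high, low = (int(part) for part in text.split())
    return (high << 32) | low


checkpoint_lsn = lsn_to_int("33 965915984")  # from "checkpoint at ... log sequence number"
scanned_lsn = lsn_to_int("33 965915648")     # from "scanned up to log sequence number"

# A positive gap means the checkpoint lies beyond the last log record
# that recovery was able to read.
gap = checkpoint_lsn - scanned_lsn
print(gap)  # 336

# Assumption: 512-byte log blocks. A remainder of 0 would mean the scan
# stopped exactly on a block boundary.
print(scanned_lsn % 512)  # 0
```

Under these assumptions, the scan stopped on a block boundary and the checkpoint sits 336 bytes into the next block, which recovery could not read.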

The evening before, we got many InnoDB log error messages like this: 

050503 20:31:24  InnoDB: ERROR: the age of the last checkpoint is 480653972,
InnoDB: which exceeds the log group capacity 320860570.
InnoDB: If you are using big BLOB or TEXT rows, you must set the
InnoDB: combined size of log files at least 10 times bigger than the
InnoDB: largest such row.
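
As a rough illustration of how far the checkpoint age overran the log capacity in the message above, the two numbers can be pulled out and compared. This is only a sketch: the function name checkpoint_overrun is hypothetical, and the regular expressions are written against the message text quoted here, not against a documented log format.

```python
import re

# The message text is copied from the error log excerpt above.
message = (
    "050503 20:31:24  InnoDB: ERROR: the age of the last checkpoint is 480653972,\n"
    "InnoDB: which exceeds the log group capacity 320860570.\n"
)


def checkpoint_overrun(text):
    """Return (age, capacity, age / capacity) parsed from the message."""
    age = int(re.search(r"age of the last checkpoint is (\d+)", text).group(1))
    capacity = int(re.search(r"log group capacity (\d+)", text).group(1))
    return age, capacity, age / capacity


age, capacity, ratio = checkpoint_overrun(message)
print(f"overrun: {age - capacity} bytes, ratio: {ratio:.3f}")
```

Here the checkpoint age is roughly one and a half times the reported log group capacity, so the redo log was far too small for the write load at that moment.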

How to repeat:
We don't know how to repeat this error, since we are not sure what exactly caused this semaphore wait.

Hi!

The semaphore hang may be a bug associated with temp tables. I have to check if that bug has been fixed already.

The really serious failure was that InnoDB was not able to scan its log file after the crash! That kind of failure is very rare. Are you sure that your my.cnf was pointing to the right ib_logfiles? If someone had edited it in the meantime, or replaced the ib_logfiles under a running mysqld server, that might explain the error.

Are you sure that you did not have another mysqld process running concurrently, and using those same ib_logfiles? That could explain the failure.

Are you using NFS or some other exotic file system?

Please attach the complete, unedited .err log from that server.

Regards,

Heikki

Hi!

I could not find any fix for a bug like this in 4.0.22 - 4.0.24.

I tested 4.0.21 on Linux with a heavy load of temporary table creation and dropping, along with another write-heavy workload, but I could not get mysqld to hang.

The printout looks as if the DROP TABLE operation is hung somewhere. It is not waiting for any InnoDB semaphore.

My guess is an OS bug or a hardware malfunction. That would explain both the hang and the more serious problem: the log files were wiped out.

I am setting this bug report to the "Can't repeat" status. If you experience more problems, please add a comment to this bug report.

Regards,

Heikki