MySQL Bugs: #36018: Server self crashes due to semaphore wait

Bug #36018	Server self crashes due to semaphore wait > 600 seconds
Submitted:	12 Apr 2008 17:45	Modified:	23 Jun 2008 17:46
Reporter:	Gordon Shannon
Status:	Duplicate
Category:	MySQL Server: InnoDB storage engine	Severity:	S2 (Serious)
Version:	5.0.45-community-log	OS:	Linux (x86_64)
Assigned to:	Assigned Account	CPU Architecture:	Any

Description:
This has happened twice in the past month. The server crashes during heavy use, complaining that a semaphore wait lasted > 600 seconds. The server then restarts itself and appears fine afterwards.

From the log:

InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.
080412 12:33:34InnoDB: Assertion failure in thread 1147169088 in file srv0srv.c line 2093
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.0/en/forcing-recovery.html
InnoDB: about forcing recovery.
080412 12:33:34 - mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8388600
read_buffer_size=2093056
max_used_connections=17
max_connections=200
threads_connected=11
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 2055390 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
frame pointer is NULL, did you compile with
-fomit-frame-pointer? Aborting backtrace!
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0
080412 12:33:36 mysqld restarted
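
As a side note, the memory estimate printed in the log can be checked by hand. The sketch below simply evaluates the formula from the log. key_buffer_size, read_buffer_size and max_connections are the values printed above; sort_buffer_size is not printed, so the value used here is an assumption, chosen because it reproduces the logged total of 2055390 K. The function name is made up for illustration.

```python
def estimated_max_memory_kib(key_buffer_size, read_buffer_size,
                             sort_buffer_size, max_connections):
    """Evaluate key_buffer_size + (read_buffer_size + sort_buffer_size) * max_connections.

    Inputs are in bytes; the result is in K (1024 bytes), rounded down.
    """
    total_bytes = key_buffer_size + (read_buffer_size + sort_buffer_size) * max_connections
    return total_bytes // 1024


# key_buffer_size, read_buffer_size and max_connections come from the log.
# sort_buffer_size is NOT in the log: 8388600 is an assumed value that
# happens to reproduce the logged figure.
print(estimated_max_memory_kib(8388600, 2093056, 8388600, 200))  # 2055390
```

Note that this is a worst-case figure for max_connections=200; the log shows max_used_connections=17 at the time of the crash.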

How to repeat:
Not repeatable.

Gordon, please upload the complete MySQL error log so we can examine the semaphores listed in the InnoDB status outputs. Thanks.
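
For anyone who wants to pre-screen their own error log, here is a rough sketch that pulls the longest semaphore waits out of the InnoDB status output. It assumes wait entries look like `--Thread <id> has waited at <file> line <n> for <secs> seconds the semaphore:`; that line shape is an assumption and may differ between versions, so adjust the pattern to match your log. The function name is made up for illustration.

```python
import re

# Assumed shape of a semaphore wait entry in the InnoDB status output:
#   --Thread <id> has waited at <file> line <n> for <secs> seconds the semaphore:
WAIT_RE = re.compile(
    r"--Thread (\d+) has waited at (\S+) line (\d+) for ([\d.]+) seconds? the semaphore"
)


def longest_waits(log_text, top=5):
    """Return up to `top` waits as (seconds, thread_id, source_file, line), longest first."""
    waits = []
    for match in WAIT_RE.finditer(log_text):
        thread_id, source_file, line_no, seconds = match.groups()
        waits.append((float(seconds), int(thread_id), source_file, int(line_no)))
    return sorted(waits, reverse=True)[:top]
```

Calling `longest_waits(open("log.txt").read())` on a saved error log would list the threads that have waited longest and the source location each one is waiting at.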

Log file from crash

Attachment: log.txt (text/plain), 20.07 KiB.

Does not look like:

http://bugs.mysql.com/bug.php?id=29560

because there is a reader in the adaptive hash latch.

Need to study the .err log more carefully...

Have you tested with the latest released version to see whether the same behavior still occurs? Thanks in advance.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

This problem has not re-occurred since we upgraded the server to 5.1.24.

I am marking this as a duplicate of the fuzzy bundle of hang bugs: http://bugs.mysql.com/bug.php?id=20358

Heikki,

Many threads get stuck waiting for the adaptive hash latch. The thread that holds the adaptive hash latch is blocked trying to lock dict_sys->mutex. What holds a lock on dict_sys->mutex?

Mark,

Do you mean what is blocking the first thread from entering the 'dict' in http://bugs.mysql.com/file.php?id=9085?

I do not know :(. A thread should hold the dict mutex only for a short time, and definitely not try to reserve the btr_search latch during that time.

Regards,

Heikki
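
To make the wait chain in this exchange concrete, here is a small sketch that follows "thread waits for latch, latch is held by thread" links until it reaches a thread that is not waiting, a latch with no known holder, or a cycle. It is only an illustration of the reasoning, not InnoDB code. The thread names T1 and T2 and the function name are hypothetical; only the latch names come from the discussion.

```python
def follow_wait_chain(start, waits_for, held_by):
    """Follow thread -> wanted lock -> holding thread links from `start`.

    waits_for maps a thread to the lock it is waiting for.
    held_by maps a lock to the thread holding it.
    Returns (chain, verdict).
    """
    chain = [start]
    seen = {start}
    thread = start
    while thread in waits_for:
        lock = waits_for[thread]
        chain.append(lock)
        holder = held_by.get(lock)
        if holder is None:
            return chain, "unknown holder"
        chain.append(holder)
        if holder in seen:
            return chain, "cycle (deadlock)"
        seen.add(holder)
        thread = holder
    return chain, "root blocker"


# Hypothetical threads: T1 stands for the many waiters on the adaptive hash
# latch, T2 for the thread that holds it and is blocked on dict_sys->mutex.
waits_for = {"T1": "btr_search_latch", "T2": "dict_sys->mutex"}
held_by = {"btr_search_latch": "T2"}  # holder of dict_sys->mutex is not known

print(follow_wait_chain("T1", waits_for, held_by))
```

With the situation as described, the chain ends at dict_sys->mutex with an unknown holder, which is exactly the open question here. If some thread held dict_sys->mutex while waiting for the btr_search latch, the same walk would report a cycle.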

I think I saw a bug like this locally, and I think we eventually found bad memory chips on the server.

The longest-running thread is copying to a tmp table, so this looks like a duplicate of bug #32149. The first fix for the tmp table case in that bug was released in 5.1.24-rc, and the reporter here wrote above that "This problem has not re-occurred since we upgraded the server to 5.1.24".