Bug #36018 Server self crashes due to semaphore wait > 600 seconds
Submitted: 12 Apr 2008 17:45
Modified: 23 Jun 2008 17:46
Reporter: Gordon Shannon
Status: Duplicate
Category: MySQL Server: InnoDB storage engine
Severity: S2 (Serious)
Version: 5.0.45-community-log
OS: Linux (x86_64)
Assigned to: Assigned Account
CPU Architecture: Any

[12 Apr 2008 17:45] Gordon Shannon
Description:
This has happened twice in the past month. The server crashes during heavy use, complaining that a semaphore wait lasted > 600 seconds. It then restarts itself and appears fine afterwards.

From the log:

InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.
080412 12:33:34InnoDB: Assertion failure in thread 1147169088 in file srv0srv.c line 2093
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.0/en/forcing-recovery.html
InnoDB: about forcing recovery.
080412 12:33:34 - mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8388600
read_buffer_size=2093056
max_used_connections=17
max_connections=200
threads_connected=11
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 2055390 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
frame pointer is NULL, did you compile with
-fomit-frame-pointer? Aborting backtrace!
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0
080412 12:33:36  mysqld restarted

How to repeat:
Not repeatable.
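
The memory estimate printed in the log above can be checked by hand. The pasted log does not show sort_buffer_size, so the sketch below solves the printed formula for it. It assumes the K figure is the byte total divided by 1024 and truncated; the function names are illustrative and not part of MySQL.

```python
def max_memory_kb(key_buffer_size, read_buffer_size, sort_buffer_size, max_connections):
    """Evaluate the formula printed in the crash report, in K (assumed: bytes // 1024)."""
    total_bytes = key_buffer_size + (read_buffer_size + sort_buffer_size) * max_connections
    return total_bytes // 1024


def implied_sort_buffer_size(total_kb, key_buffer_size, read_buffer_size, max_connections):
    """Solve the same formula for sort_buffer_size, which the pasted log does not show."""
    per_connection = (total_kb * 1024 - key_buffer_size) / max_connections
    return per_connection - read_buffer_size


# Values copied from the crash report above.
KEY_BUFFER_SIZE = 8388600
READ_BUFFER_SIZE = 2093056
MAX_CONNECTIONS = 200
REPORTED_TOTAL_KB = 2055390

estimate = implied_sort_buffer_size(
    REPORTED_TOTAL_KB, KEY_BUFFER_SIZE, READ_BUFFER_SIZE, MAX_CONNECTIONS
)
print(f"implied sort_buffer_size is roughly {estimate:.0f} bytes")
```

Under these assumptions the implied sort_buffer_size comes out within a few bytes of 8388600, and plugging 8388600 back into the formula reproduces the reported 2055390 K.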
[12 Apr 2008 18:05] MySQL Verification Team
Gordon, please upload the complete mysql error log, so we can examine the semaphores listed in the innodb status outputs.  Thanks
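
For anyone triaging a similar log, a minimal sketch of pulling the semaphore waits out of the InnoDB status output in an error log, longest wait first. The line format in the regular expression is an assumption about how the monitor prints waits and should be adjusted to match the actual log; the sample text, thread ids, and file names are made up for illustration.

```python
import re

# Assumed format of a semaphore wait line in the InnoDB monitor output.
SEMAPHORE_WAIT = re.compile(
    r"--Thread (\d+) has waited at (\S+) line (\d+) for ([\d.]+) seconds the semaphore:"
)


def longest_semaphore_waits(log_text):
    """Return (seconds, thread_id, source_file, line_no) tuples, longest wait first."""
    waits = []
    for match in SEMAPHORE_WAIT.finditer(log_text):
        thread_id, source_file, line_no, seconds = match.groups()
        waits.append((float(seconds), int(thread_id), source_file, int(line_no)))
    return sorted(waits, reverse=True)


# Hypothetical sample input, not taken from the attached log.
SAMPLE = """\
--Thread 222 has waited at file0b.c line 20 for 12.00 seconds the semaphore:
--Thread 111 has waited at file0a.c line 10 for 601.00 seconds the semaphore:
"""

for seconds, thread_id, source_file, line_no in longest_semaphore_waits(SAMPLE):
    print(f"thread {thread_id} waited {seconds:.0f}s at {source_file}:{line_no}")
```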
[13 Apr 2008 0:33] Gordon Shannon
Log file from crash

Attachment: log.txt (text/plain), 20.07 KiB.

[21 Apr 2008 13:58] Heikki Tuuri
Does not look like:

http://bugs.mysql.com/bug.php?id=29560

because there is a reader in the adaptive hash latch.

Need to study the .err log more carefully...
[11 May 2008 0:55] MySQL Verification Team
Have you tested with the latest released version to see whether the same behavior continues? Thanks in advance.
[11 Jun 2008 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[11 Jun 2008 23:32] Gordon Shannon
This problem has not re-occurred since we upgraded the server to 5.1.24.
[23 Jun 2008 17:46] Heikki Tuuri
I am marking this as a duplicate of the fuzzy bundle of hang bugs: http://bugs.mysql.com/bug.php?id=20358
[23 Jun 2008 18:14] Mark Callaghan
Heikki,

Many threads get stuck waiting for the adaptive hash latch. The thread that holds the adaptive hash latch is blocked trying to lock dict_sys->mutex. What holds a lock on dict_sys->mutex?
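
The question above amounts to following a wait-for chain until it reaches a lock whose holder is unknown. A minimal sketch of that reasoning, with hypothetical thread labels (T1, T2, T3) standing in for the real thread ids in the log:

```python
def trace_blocking_chain(start, waiting_on, held_by):
    """Follow thread -> lock -> holder links until the holder is unknown,
    not waiting on anything, or already visited (a cycle)."""
    chain = [start]
    seen = {start}
    thread = start
    while thread in waiting_on:
        resource = waiting_on[thread]
        chain.append(resource)
        holder = held_by.get(resource)
        if holder is None:
            chain.append("<unknown holder>")
            break
        if holder in seen:
            chain.append(holder + " (cycle)")
            break
        chain.append(holder)
        seen.add(holder)
        thread = holder
    return chain


# Hypothetical labels: T1 and T2 wait for the adaptive hash latch,
# T3 holds it and is itself blocked on dict_sys->mutex.
waiting_on = {
    "T1": "adaptive hash latch",
    "T2": "adaptive hash latch",
    "T3": "dict_sys->mutex",
}
held_by = {"adaptive hash latch": "T3"}  # holder of dict_sys->mutex is not known

print(" -> ".join(trace_blocking_chain("T1", waiting_on, held_by)))
```

The chain ends at dict_sys->mutex with no known holder, which is exactly the open question in this comment.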
[14 Jul 2008 15:05] Heikki Tuuri
Mark,

you mean what is blocking the first thread from entering the 'dict' in:
http://bugs.mysql.com/file.php?id=9085
?

I do not know :(. A thread should hold the dict mutex only for a short time, and definitely not try to reserve the btr_search latch during that time.

Regards,

Heikki
[14 Jul 2008 16:02] Mark Callaghan
I think I saw a bug like this locally and I think that eventually we found bad memory chips on the server.
[31 Oct 2009 5:14] James Day
The longest running thread is copying to a tmp table, so this looks like a duplicate of bug #32149. The first fix for the tmp table case in that one was released in 5.1.24-rc, and the reporter here wrote above that "This problem has not re-occurred since we upgraded the server to 5.1.24".