| Bug #47183 | long semaphore wait terminates server | ||
|---|---|---|---|
| Submitted: | 8 Sep 3:45 | Modified: | 15 Oct 18:42 |
| Reporter: | Andrew Dalgleish | ||
| Status: | Verified | ||
| Category: | Server: InnoDB | Severity: | S2 (Serious) |
| Version: | 5.1.30 | OS: | Linux |
| Assigned to: | Marko Mäkelä | Target Version: | |
| Triage: | Triaged: D5 (Feature request) | ||
[8 Sep 3:45]
Andrew Dalgleish
[8 Sep 7:30]
Susanne Ebrecht
Verified as described
[8 Sep 11:32]
Susanne Ebrecht
Bug #45988 is a duplicate of this bug here
[8 Sep 18:16]
Michael Izioumtchenko
not verified, no repeatable test case. Regarding: Can the 240s and 600s timeouts be made configurable? feature request,not a bug. imo the numbers are quite big already. 600 seconds is a lot for a computer application. A single row insert of 900 seconds and a commit of 6000 seconds would indicate a problem, most likely an application one. Could you tell us more information about the application and the system performance in general, particularly, if the problem in question was a onetime one or if there is a general lack of performance.
[17 Sep 9:27]
Susanne Ebrecht
For what do you need a test case? We made source code analysis here. You just need to look into srv/srv0src.c line 2122 in 5.1.30. It is hard coded there.
[17 Sep 15:41]
Heikki Tuuri
We could make the timeout configurable. If 600 seconds is too short, make it 7200 seconds (= 2 hours).
[17 Sep 15:43]
Heikki Tuuri
This may be a hang. Please post the entire .err log so that we see what InnoDB is doing during this excessive 600 seconds.
[25 Nov 11:28]
Harry Jackson
We currently have the same bug on mysql 5.1.30 Our server is killed with the following message: InnoDB: Error: semaphore wait has lasted > 600 seconds InnoDB: We intentionally crash the server, because it appears to be hung. Is it just me or does anyone else find it odd that a database server is "deliberately crashed" based on a lock issue. Is the only solution for this to rollback to mysql 4.n? We are aware that there are probably several application issues that are causing this but the fact that we now know MySQL is deliberately crashed if we cock up and us only being human means that we now cannot depend on the database to stay up through thick and thin. I've seen some other threads ask that the timeout setting be made configurable. While this might provide some relief I would not regard this as a solution. The main Database process should never crash intentionally or otherwise. I know this is a high standard to set and in some cases ie connection flooding, impossible to attain, but I still believe it should be the aim of any database to remain up.
[25 Nov 11:46]
Marko Mäkelä
The 600-second watchdog should have nothing to do with user locks (InnoDB table and record locks). It is about InnoDB mutexes and rw-locks, which should never be held for excessive periods of time. For example, whenever InnoDB transfers control to the operating system (e.g., for I/O) or to MySQL (e.g., returning result to client and waiting for next command), it should release all mutexes and rw-locks that are not necessary. In CHECK TABLE, this timeout is advanced by 2 hours (7200 seconds) in order to fix Bug #2694. Removing the 600-second watchdog timeout altogether is analogous to shooting the messenger. What we need is to find out why the watchdog is kicking in. Can you obtain a core dump and post stack traces? Something like this: gdb mysqld core set height 0 set log on thr apply all backtrace full quit and attach gdb.txt.
