Bug #15468 Server hangs indefinitely occasionally without any known reason
Submitted: 4 Dec 2005 9:51 Modified: 4 Jan 2006 11:17
Reporter: Nadav Soferman
Status: No Feedback
Category: MySQL Server Severity: S1 (Critical)
Version: 5.0.16 OS: Linux (Linux)
Assigned to: CPU Architecture: Any

[4 Dec 2005 9:51] Nadav Soferman
We have a production system running on a MySQL 5.0.16 DB server (after upgrading from version 4.1) and a replicated slave DB server with the same platform and DB version.
Since the upgrade, the server hangs almost every day for no apparent reason, with no errors in the log and no hint as to the cause of the problem.

There are 2 kinds of failures:
  - The DB server doesn't respond. The TCP port answers, but no MySQL connection can be established. The server cannot be stopped. Even the OS 'kill' command doesn't work. Only 'kill -9' can stop the server.
  - The server does respond, but 'processlist' shows many statements that have been waiting to finish for a long time (hours). 
  In this case, when trying to connect and update another simple table (a table that is irrelevant to all 'stuck' queries and statements), the update hangs and never returns. 
  Only a restart of the server helps.
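
A minimal sketch of how the first failure mode (port answers, but no MySQL connection) could be told apart from a dead port. It relies on the fact that a MySQL server normally sends its handshake packet as soon as a client connects, so a TCP connection that is accepted but stays silent suggests a hung server. The function name, return labels, and timeout are hypothetical choices for illustration, not part of the original report.

```python
import socket


def probe_mysql_port(host, port=3306, timeout=5.0):
    """Classify the server state from what happens right after connecting.

    Returns "unreachable" if the TCP connection cannot be opened,
    "no_greeting" if the port accepts but sends nothing within `timeout`,
    and "greeting" if the server starts sending its handshake.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Connection refused, timed out, or host not reachable.
        return "unreachable"
    with sock:
        sock.settimeout(timeout)
        try:
            # Only the first few bytes are read; the packet is not parsed.
            data = sock.recv(4)
        except OSError:
            # Includes socket.timeout: the port answered but stayed silent.
            return "no_greeting"
        return "greeting" if data else "no_greeting"


if __name__ == "__main__":
    # Hypothetical host; replace with the DB server's address.
    print(probe_mysql_port("127.0.0.1"))
```

Run periodically (for example from cron), a "no_greeting" result would give a timestamp for when the hang began, which can then be matched against the log.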

These problems do not seem to be caused by any specific query or statement.

System description:
 - Platform of both the main production DB server (master) and the replicated DB server (slave):
 	- OS: Fedora Core release 3, Linux, 2.6.12-1.1381_FC3smp #1 SMP
 	- MySQL version: 5.0.16, 'standard', 'static'.
 	- File per table configuration.
 - DB Size:
	- ~25 tables.
	- ~20 million records in the biggest table, and far fewer in most other tables.
	- ~80GB complete DB size. ~60GB of it is in a table with raw data that is not frequently used.

 - Usage profile:
	- Constant process of loading data into the DB using 'insert' statements. 
	- Scheduled tasks (e.g. each hour) of running 'batch' queries. Such batch queries might take a few minutes (a series of such queries might take ~1 hour).
	- Interactively updating a certain number of records multiple times a day.		 

How to repeat:
I wish we knew...
[4 Dec 2005 11:17] Heikki Tuuri

During the hangs, please print:




and attach them to this bug report. Please print them during several hangs. The more info we have, the easier it is to diagnose which thread has hung and where.
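
A rough sketch of how such output could be captured to timestamped files during each hang, so that several hangs can be attached and compared. The statements requested above are not preserved in this copy of the report, so the two statements in the code are only assumed examples, and the helper names are hypothetical. The sketch also assumes the `mysql` command-line client is installed and that credentials come from an option file such as ~/.my.cnf.

```python
import subprocess
from datetime import datetime
from pathlib import Path

# Assumption: illustrative statements only, not the list from the original request.
STATEMENTS = ["SHOW FULL PROCESSLIST", "SHOW ENGINE INNODB STATUS"]


def build_command(statement, user="root", host="localhost"):
    """Build the mysql client invocation for one statement."""
    return [
        "mysql",
        f"--host={host}",
        f"--user={user}",
        "--vertical",  # one column per line, easier to read in a report
        f"--execute={statement}",
    ]


def capture(out_dir, statements=STATEMENTS, timeout=30, runner=subprocess.run):
    """Run each statement and write its output to a timestamped file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    written = []
    for index, statement in enumerate(statements):
        path = out_dir / f"{stamp}-{index}.txt"
        try:
            result = runner(build_command(statement), capture_output=True,
                            text=True, timeout=timeout)
            body = result.stdout + result.stderr
        except subprocess.TimeoutExpired:
            # During a hang the client itself may never return.
            body = f"timed out after {timeout}s\n"
        path.write_text(f"-- {statement}\n{body}")
        written.append(path)
    return written


if __name__ == "__main__":
    for path in capture("hang-reports"):
        print(path)
```

In the first failure mode described in the report, where no connection can be established, the client would be expected to time out; the file then at least records when the attempt was made.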

From which 4.1.xx version did you upgrade?


[5 Jan 2006 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".