Bug #33718 Threads hanging in SENDING DATA probably after Apache/PHP clients crashed
Submitted: 7 Jan 2008 11:05 Modified: 20 Mar 2008 12:11
Reporter: Christian Hammers (Silver Quality Contributor) (OCA)
Status: Can't repeat
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.45-linux-x86_64-icc-glibc23 OS:Linux (Debian 4.0 with Kernel 2.6.18-5-amd64 )
Assigned to: CPU Architecture:Any
Tags: qc

[7 Jan 2008 11:05] Christian Hammers
Description:
Hello

Every couple of days the number of MySQL threads suddenly increases and the queries (mostly the same SELECTs from a big web site) are no longer processed.
Executing them from /usr/bin/mysql also hangs as if the table/row is locked.

The process list does not show any locks but a strangely high number of threads in
"Sending data" state. If I restart apache2/php5 on the clients they segfault, so I assume that I have a bug there which keeps MySQL hanging while it delivers the result data via TCP.

My "wait_timeout" is set to 60s, but MySQL still does not seem to realize that the other side has "hung up" and is no longer receiving data.

Example of the process list - see the high values in the Time column, the query 
result is just one short line:

+---------+------+--------------------+-----+---------+------+---------------------------+------------------------------------------------------------------------------------------------------+
| Id      | User | Host               | db  | Command | Time | State                     | Info                                                                                                 |
+---------+------+--------------------+-----+---------+------+---------------------------+------------------------------------------------------------------------------------------------------+
| 4347573 | xxx  | 192.168.1.14:59828 | xxx | Query   | 0    | Sorting result            | SELECT articles_image_assets.ARTICLES_ID, articles_image_assets.IMAGE_ASSETS_ID, articles_image_asse |
| 4347655 | xxx  | 192.168.1.14:59864 | xxx | Query   | 6    | Sending data              | SELECT articles_image_assets.ARTICLES_ID, articles_image_assets.IMAGE_ASSETS_ID, articles_image_asse |
| 4347678 | xxx  | 192.168.1.14:59887 | xxx | Query   | 6    | Sending data              | SELECT articles_image_assets.ARTICLES_ID, articles_image_assets.IMAGE_ASSETS_ID, articles_image_asse |
| 4348018 | xxx  | 192.168.1.14:35263 | xxx | Query   | 7    | Sorting result            | SELECT articles.ID, articles.LANGUAGE, articles.LOCKING_USER, articles.LOCK_TIMESTAMP, articles.LIVE |
| 4348136 | xxx  | 192.168.1.14:35353 | xxx | Query   | 3563 | Sending data              | SELECT articles.ID, articles.LANGUAGE, articles.LOCKING_USER, articles.LOCK_TIMESTAMP, articles.LIVE |
| 4348218 | xxx  | 192.168.1.15:36830 | xxx | Query   | 3552 | Sending data              | SELECT articles.ID, articles.LANGUAGE, articles.LOCKING_USER, articles.LOCK_TIMESTAMP, articles.LIVE |
| 4349122 | xxx  | 192.168.1.14:35888 | xxx | Query   | 3470 | Sending data              | SELECT articles.ID, articles.LANGUAGE, articles.LOCKING_USER, articles.LOCK_TIMESTAMP, articles.LIVE |
| 4349408 | xxx  | 192.168.1.14:36017 | xxx | Query   | 3449 | Sending data              | SELECT articles.ID, articles.LANGUAGE, articles.LOCKING_USER, articles.LOCK_TIMESTAMP, articles.LIVE |
| 4349862 | xxx  | 192.168.1.15:54701 | xxx | Query   | 3394 | Sending data              | SELECT articles.ID, articles.LANGUAGE, articles.LOCKING_USER, articles.LOCK_TIMESTAMP, articles.LIVE |
...
Uptime: 432276  Threads: 700  Questions: 47377935  Slow queries: 34725  Opens: 4462  Flush tables: 1  Open tables: 4438  Queries per second avg: 109.601

bye,

-christian-

More detailed information is attached but must be handled as PRIVATE customer data!

How to repeat:
Maybe by building a client that crashes while reading data from a TCP socket?
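
One way to approximate such a client locally is sketched below in Python. It is not a MySQL test case: it only shows, on a plain loopback TCP connection, how a sender ends up stalled once the peer stops reading. The function name `send_until_stalled` and all parameters are hypothetical, and the sketch assumes the client stays connected without calling recv() (a process that truly dies would normally have its socket closed by the kernel).

```python
import socket


def send_until_stalled(chunk_size=1 << 16, timeout=0.5, max_bytes=1 << 30):
    """Send to a peer that never reads; return (bytes_sent, stalled).

    Hypothetical helper: models a client that stays connected but has
    stopped calling recv(), so the kernel buffers fill up and send() blocks.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free loopback port
    listener.listen(1)

    # The "hung" client: it connects and then never reads anything.
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    # Without a timeout, send() would block indefinitely once buffers are full.
    server_side.settimeout(timeout)

    chunk = b"x" * chunk_size
    sent = 0
    stalled = False
    try:
        while sent < max_bytes:  # safety cap so the loop always terminates
            sent += server_side.send(chunk)
    except socket.timeout:
        stalled = True  # the peer stopped draining its receive buffer
    finally:
        server_side.close()
        client.close()
        listener.close()
    return sent, stalled


if __name__ == "__main__":
    sent, stalled = send_until_stalled()
    print(f"bytes accepted before stall: {sent}, stalled: {stalled}")
```

The sender only notices the stall because it sets its own write timeout. Whether mysqld behaves the same way in the reported situation is exactly what this report could not establish.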

Suggested fix:
none
[7 Jan 2008 12:21] Susanne Ebrecht
We're sorry, but the bug system is not the appropriate forum for asking help on using MySQL products. Your problem is not the result of a bug.

Support on using our products is available both free in our forums at http://forums.mysql.com/ and for a reasonable fee direct from our skilled support engineers at http://www.mysql.com/support/

Thank you for your interest in MySQL.
[8 Jan 2008 9:19] Susanne Ebrecht
Christian,

unfortunately, I can't follow you here. To reproduce this, I need to know exactly what you did. Maybe a little test would help.

Please also try our newest version, MySQL 5.0.51.
[8 Jan 2008 9:42] Christian Hammers
Hello

Ok, so in short the problem is the following:

  MySQL has some threads which process a query that just returns a single row
  with only a couple of bytes but has been running for 4000s in "Sending data" state.

  The wait_timeout=60 setting usually disconnects such stalled clients but in
  certain situations it apparently does not work.

I am not able to give a test case or to reproduce this bug, it just happens
every now and then.
My assumption is that it happens because my client, an apache+php process on a different server, crashes quite often with a segfault.

MySQL should handle such faulty clients and disconnect them once they have not
sent or received data for $wait_timeout seconds.

I hope this was clearer now.

bye,

-christian-
[8 Jan 2008 14:08] Susanne Ebrecht
Christian,

as we discussed on IRC, we need more specific information to handle this.

Susanne
[8 Jan 2008 14:18] Davi Arnaut
Hi Christian,

Can you provide us with a netstat output and a gdb backtrace of all threads (when the problem is occurring)? It might help us to nail down the problem.

Thanks
[9 Feb 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[18 Feb 2008 18:01] Susanne Ebrecht
Christian,

I set this to "need feedback" again.
[19 Mar 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[19 Mar 2008 10:59] Tonci Grgin
IMHO, this still needs either feedback or test case. Susanne?
[19 Mar 2008 11:27] Susanne Ebrecht
Christian,

we have set this to "need feedback" again because we are waiting for more analysis from you.
[19 Mar 2008 20:13] Christian Hammers
Hi

We have not seen this bug on our production servers again, but that is probably because we modified the web application and the clients so that we no longer *trigger* the bug. As the MySQL server version stayed the same, the bug cannot be gone. Sadly, I currently have no time to try to reproduce this on a test system.
So you can close the report until somebody else stumbles over it and reopens it.

bye,

-christian-
[20 Mar 2008 12:11] Susanne Ebrecht
Christian,

I will set this to "Can't repeat" status. If you ever see the problem again, you can set it back to open.
[1 Feb 2009 9:29] David Yan
I'm seeing the same thing. It happens every couple of days. The load average shoots up when this occurs. Whenever this happens, it goes back to normal after I have killed all the threads that are stuck.

$ mysqld --version
mysqld  Ver 5.0.51a-3ubuntu5.4-log for debian-linux-gnu on x86_64 ((Ubuntu))
$ uname -a
Linux db1.beijing.facekoo.com 2.6.24-23-server #1 SMP Thu Nov 27 18:45:02 UTC 2008 x86_64 GNU/Linux

This has never happened on my 32bit system (with the same MySQL version).  It seems to only happen on 64bit systems.
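
The workaround mentioned above (killing the stuck threads) could be scripted roughly as follows. This is a minimal Python sketch under stated assumptions: the helper name `stuck_thread_kills` is hypothetical, the rows are assumed to be dictionaries with the `Id`, `Time` and `State` columns of SHOW PROCESSLIST, and the 60-second threshold merely mirrors the wait_timeout value quoted in this report. It only builds KILL statements; fetching the process list and executing the statements are left out.

```python
def stuck_thread_kills(rows, min_seconds=60, state="Sending data"):
    """Build KILL statements for threads stuck in `state` longer than `min_seconds`.

    `rows` is an iterable of dicts shaped like SHOW PROCESSLIST output
    (keys: Id, Time, State). Hypothetical helper, not part of MySQL.
    """
    statements = []
    for row in rows:
        if row.get("State") != state:
            continue  # only consider threads in the suspicious state
        if int(row["Time"]) <= min_seconds:
            continue  # still within a plausible run time
        statements.append(f"KILL {int(row['Id'])};")
    return statements


if __name__ == "__main__":
    # Sample rows copied from the process list in the original report.
    sample = [
        {"Id": 4347655, "Time": 6, "State": "Sending data"},
        {"Id": 4348018, "Time": 7, "State": "Sorting result"},
        {"Id": 4348136, "Time": 3563, "State": "Sending data"},
    ]
    for statement in stuck_thread_kills(sample):
        print(statement)
```

Reviewing the generated statements before running them is advisable, since a long-running but legitimate query would match the same filter.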