Bug #40968 Server hang without any error messages
Submitted: 24 Nov 2008 8:49
Modified: 28 Feb 2009 0:31
Reporter: Kenneth Lee
Status: No Feedback
Category: MySQL Server: InnoDB storage engine
Severity: S1 (Critical)
Version: 5.1.28rc, 5.1.30
OS: Linux (2.6.18-92.el5 x86_64)
Assigned to:
CPU Architecture: Any

[24 Nov 2008 8:49] Kenneth Lee
Description:
My heavily loaded server hangs 2-3 times per day without any error messages.
The server has 16 CPUs and 64 GB RAM, and runs over 10,000 queries per second on average.
About 14 slave servers replicate from it.

When the server hangs, the processlist count climbs to about 2,300-2,400, which is possible because max_connections is set to 3,000 in my configuration. After 1-2 minutes I have to kill the mysqld process and restart it, because the MySQL server no longer responds. The error log file (???.err) contains nothing about the hang, and I can't find any related system or mysqld log messages.
I have attached the last part of the *.status file written after a hang, and the same part of a normal *.status file for comparison.

This problem has persisted since 5.1.23. I upgraded to 5.1.28rc, but it is not fixed.
How can I fix this problem?

[After server hangs]
--------
FILE I/O
--------
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: waiting for i/o request (read thread)
I/O thread 3 state: waiting for i/o request (write thread)
Pending normal aio reads: 0, aio writes: 0,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
219904 OS file reads, 1267115 OS file writes, 1031995 OS fsyncs
81.56 reads/s, 16384 avg bytes/read, 161.49 writes/s, 36.37 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 4103, seg size 4105,
158455 inserts, 158455 merged recs, 52107 merges
Hash table size 70803119, node heap has 4669 buffer(s)
0.00 hash searches/s, 176.49 non-hash searches/s
---
LOG
---
Log sequence number 941 4041775005
Log flushed up to   941 4041775005
Last checkpoint at  941 4041775005
0 pending log writes, 0 pending chkp writes
990389 log i/o's done, 8.56 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 37671653496; in additional pool allocated 16756224
Dictionary memory allocated 1562264
Buffer pool size   2097152
Free buffers       1815589
Database pages     276894
Modified db pages  0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 229919, created 46975, written 424971
81.56 reads/s, 0.00 creates/s, 149.80 writes/s
Buffer pool hit rate 930 / 1000
--------------
ROW OPERATIONS
--------------
98 queries inside InnoDB, 2386 queries in queue
1 read views open inside InnoDB
Main thread process no. 21793, id 1175697728, state: waiting for server activity
Number of rows inserted 900648, updated 569445, deleted 830864, read 17961430
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT

[Normal status]
--------
FILE I/O
--------
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: waiting for i/o request (read thread)
I/O thread 3 state: waiting for i/o request (write thread)
Pending normal aio reads: 0, aio writes: 0,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
169278 OS file reads, 935853 OS file writes, 819857 OS fsyncs
17.37 reads/s, 19448 avg bytes/read, 340.42 writes/s, 316.98 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 459, free list len 3645, seg size 4105,
132205 inserts, 40460 merged recs, 8795 merges
Hash table size 70803119, node heap has 4383 buffer(s)
2405.85 hash searches/s, 21359.10 non-hash searches/s
---
LOG
---
Log sequence number 942 2353450893
Log flushed up to   942 2353450793
Last checkpoint at  942 1785103090
1 pending log writes, 0 pending chkp writes
802166 log i/o's done, 313.04 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 37632780616; in additional pool allocated 16777216
Dictionary memory allocated 1599504
Buffer pool size   2097152
Free buffers       1877783
Database pages     214986
Modified db pages  42882
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 174358, created 40628, written 232168
20.62 reads/s, 9.12 creates/s, 36.94 writes/s
Buffer pool hit rate 1000 / 1000
--------------
ROW OPERATIONS
--------------
3 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread process no. 5796, id 1181030720, state: sleeping
Number of rows inserted 728063, updated 487422, deleted 671178, read 12623023
289.23 inserts/s, 166.86 updates/s, 300.54 deletes/s, 3432.29 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
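
The two excerpts above differ most visibly in the ROW OPERATIONS section: 98 queries inside InnoDB with 2386 in queue during the hang, against 3 and 0 in the normal dump. The following is a minimal Python sketch for pulling those two counters out of a saved status file so that dumps can be compared quickly. The function name and the sample strings are illustrative only and are not part of any MySQL tool.

```python
import re

# Matches the ROW OPERATIONS line of InnoDB monitor output, e.g.
# "98 queries inside InnoDB, 2386 queries in queue"
_QUEUE_RE = re.compile(r"(\d+) queries inside InnoDB, (\d+) queries in queue")


def innodb_queue_counts(status_text):
    """Return (inside, queued) from monitor output, or None if the line is absent."""
    match = _QUEUE_RE.search(status_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Sample lines copied from the two excerpts in this report.
hang = "98 queries inside InnoDB, 2386 queries in queue"
normal = "3 queries inside InnoDB, 0 queries in queue"

print(innodb_queue_counts(hang))
print(innodb_queue_counts(normal))
```

The same function can be applied to the full text of a *.status file, since the regular expression searches anywhere in the input.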

How to repeat:
Random. It doesn't depend on server load.
[24 Nov 2008 14:31] Valeriy Kravchuk
Thank you for the problem report. Please try to repeat it with a newer version, 5.1.29. If the problem persists, please send the entire output of SHOW INNODB STATUS taken several minutes after the hang.
[8 Dec 2008 7:51] Kenneth Lee
I updated to version 5.1.30, but the hang still occurs.
Luckily, the frequency of hangs has dropped to 2-3 times per week, down from 7-10 times per week.

I have attached the output of SHOW INNODB STATUS.
[8 Dec 2008 22:49] Mikhail Izioumtchenko
This doesn't necessarily look like an InnoDB bug. Could you get the process stack
of the hanging mysqld, with pstack or gdb?
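
One possible way to collect the requested stacks with gdb is sketched below in Python. It assumes gdb is installed and that the caller has enough privileges to attach to mysqld. The helper names, the pid, and the output file name are hypothetical. Note that the process is stopped while gdb is attached.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command that attaches to pid, prints every thread's stack, and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_stacks(pid, out_path):
    """Run gdb against a live process and save its output to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(
            gdb_backtrace_command(pid),
            stdout=out,
            stderr=subprocess.STDOUT,
            check=False,  # keep whatever gdb printed even if it exits non-zero
        )


# Example with a hypothetical pid and file name:
# dump_stacks(12345, "mysqld-stacks.txt")
```

Taking two or three dumps a few seconds apart makes it easier to tell threads that are truly stuck from threads that are merely busy.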
[30 Jan 2009 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[30 Jan 2009 0:31] MySQL Verification Team
Feedback is still needed.
[4 Feb 2009 20:33] Peter Garmaz
I am also experiencing this issue while running 5.1.30 (GA) on Debian Etch. I'm running a 4-server multi-master replication setup with 1 extra machine as a slave for backup purposes. All 5 servers are running 2.6.18-6 kernels (some 32-bit, some 64-bit) and 5.1.30 (GA).

The problem has shown up only on the 4 master servers, and only intermittently.

I know my replication setup works normally, as I was running 5.0 for some time before upgrading to 5.1 about a week ago.

The only difference from Kenneth's original bug report is that we are running about 80 or so MyISAM tables and only 2 InnoDB tables.

The next time one of my servers demonstrates the problem (which is likely to be today) I shall get a pstack dump and attach it. 

p.
[1 Mar 2009 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[18 Mar 2009 9:10] Kelvin Liang
I am seeing a similar issue while running 5.1.32 (GA).

I can connect with the mysql client but can't query data from tables.
No error messages were written to the log files.
After restarting, everything works fine till the next hang.
[18 Mar 2009 15:35] Mikhail Izioumtchenko
We can't say it's an InnoDB bug until we have some evidence. It could be
bug#41163, for example.