Bug #64690 non-reproducible occasional crashes
Submitted: 19 Mar 2012 11:41 Modified: 23 Mar 2012 9:31
Reporter: A Sieferlinger Email Updates:
Status: Not a Bug Impact on me:
None 
Category: MySQL Server: General Severity: S3 (Non-critical)
Version: 5.1.61-0+squeeze1-log OS: Linux (Debian 6.0.4)
Assigned to: CPU Architecture:Any

[19 Mar 2012 11:41] A Sieferlinger
Description:
I have been experiencing an issue for several weeks now:

The mysqld crashes on a query with "mysqld got signal 11" (details in the attachment). It restarts automatically, performs InnoDB recovery, and is then running again.
Sometimes tables are also corrupted, but in most cases they could be fixed with a REPAIR statement.

When the crashing query is executed again, the server does not crash and works just fine.
As I did not find anything in common between the queries that caused a crash, I suspect a different cause.

Details about the system:
Debian MySQL package: 5.1.61-0+squeeze1-log
Debian 6.0.4
The server causing the problems is the master in a simple master slave setup. The slave has exactly the same specs.

Hardware specs:
32 GB Memory
2 quad-core CPUs with Hyper-Threading (Intel(R) Xeon(R) CPU L5520 @ 2.27GHz)
Hardware RAID 10 with BBU

How to repeat:
As mentioned above, I have not found a way to reproduce the issue; the same queries that caused a crash before work fine when I execute them manually.

Suggested fix:
None yet, mysqld should not crash.
[19 Mar 2012 11:42] A Sieferlinger
Log showing the crash and recover

Attachment: mysql.log.1 (application/octet-stream, text), 2.70 KiB.

[19 Mar 2012 14:34] Valeriy Kravchuk
Please, send your my.cnf file content.
[19 Mar 2012 14:38] A Sieferlinger
my.cnf from the affected server

Attachment: my.cnf (application/octet-stream, text), 4.75 KiB.

[19 Mar 2012 14:38] A Sieferlinger
I have added the config file to the attachments.
[19 Mar 2012 15:56] Valeriy Kravchuk
Looks like your settings are too high, both for per-thread and for some global buffers. Can you please check whether crashes ever happen after the following changes in my.cnf:

tmp_table_size 	        = 16M # was 192M
sort_buffer_size        = 1M # was 8M
read_buffer_size        = 1M # was 2M
read_rnd_buffer_size    = 1M # was 16M 
join_buffer_size        = 1M # was 8M
max_heap_table_size     = 16M # was 64M

myisam_use_mmap = 0 # was 1

query_cache_size        = 128M # was 1024M, hardly anything > 128M ever makes sense...

All the above was on top of a 25G InnoDB buffer pool, with up to 1000 concurrent connections allowed and only 32G of RAM. Your configuration was hardly reasonable or robust.

IMHO you were hit by some kind of out of memory condition. You may even want to check OS level logs for any evidence.
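The worst-case arithmetic behind this advice can be sketched roughly as follows. This is an illustrative back-of-the-envelope estimate, not an exact model of MySQL's memory use (MySQL allocates per-thread buffers lazily, and other per-connection allocations such as tmp_table_size are omitted); the figures are the "was" values quoted in the previous comment.

```python
# Rough worst-case memory estimate for the original my.cnf values quoted above.
# The four classic per-thread buffers are allocated per connection, so with
# many concurrent connections they can exhaust RAM on top of the global
# buffers. All figures in MB.

per_thread_mb = {
    "sort_buffer_size": 8,
    "read_buffer_size": 2,
    "read_rnd_buffer_size": 16,
    "join_buffer_size": 8,
}
global_mb = {
    "innodb_buffer_pool_size": 25 * 1024,  # 25G, per the comment above
    "query_cache_size": 1024,              # was 1024M
}
max_connections = 1000  # up to 1000 concurrent connections were allowed

worst_case_mb = (sum(global_mb.values())
                 + max_connections * sum(per_thread_mb.values()))
print(f"worst case: {worst_case_mb / 1024:.1f} GB vs 32 GB of RAM")
# Even this partial estimate lands far above the 32 GB of physical memory.
```

Even without every connection hitting every buffer at once, the headroom between the global buffers and physical RAM was thin enough that a burst of busy connections could push the process into an out-of-memory condition.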
[20 Mar 2012 7:09] A Sieferlinger
Thanks for the hints.
As this is a running production system, it may take some days until I can test the new settings.
I took over maintenance of this system only recently, so not many checks have been done on it; it will only be online for about two more months.
[20 Mar 2012 7:12] A Sieferlinger
Just a short note regarding the memory issue:
the system logs contained no indication that the OOM killer was triggered; the only visible thing was a segfault in libc.
[22 Mar 2012 12:51] A Sieferlinger
It seems like this was the issue. We have had no more crashes in the last few days. If crashes occur again, I will reopen this ticket.
[23 Mar 2012 9:31] Valeriy Kravchuk
For now I assume this problem was not caused by a bug, but by a wrong configuration.