Bug #39842 signal 11 when using large_pages
Submitted: 3 Oct 2008 15:16
Modified: 21 Aug 2009 9:48
Reporter: Maksim Nikulin (OCA)
Status: Duplicate
Category: MySQL Server: General
Severity: S1 (Critical)
Version: 5.0.67, 5.1+, 6.0+
OS: Linux
Assigned to:
CPU Architecture: Any
Tags: Contribution, HugePages, large_pages, myisam

[3 Oct 2008 15:16] Maksim Nikulin
Description:
081002 13:56:26 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=17179869184
read_buffer_size=402653184
max_used_connections=3
max_connections=300
threads_connected=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 135970816 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x1367840
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x44048fd0, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
(nil)
New value of fp=0x1367840 failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/using-stack-trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do 
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x139dc10 = select userid from Users_2771
thd->thread_id=6085
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
081002 13:56:26  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
081002 13:56:27  InnoDB: Started; log sequence number 0 43685

How to repeat:
1. Have a Linux kernel built with CONFIG_HUGETLB_PAGE and CONFIG_HUGETLBFS.
2. Set /proc/sys/vm/nr_hugepages, /proc/sys/kernel/shmmax and ulimit -l high enough.
3. Enable large_pages.
4. Set key_buffer=4G, run mysqld, and note the value of "cat /proc/meminfo | grep HugePages_Rsvd".
5. Set key_buffer=5G, re-run mysqld, and note HugePages_Rsvd again: it is lower than the previous time.
6. Start some SELECTs from MyISAM tables to fill the key_buffer.
7. mysqld gets signal 11 when HugePages_Rsvd becomes 0.

Suggested fix:
--- old/mysys/my_largepage.c.orig       2008-08-04 16:19:45.000000000 +0400
+++ new/mysys/my_largepage.c    2008-10-02 16:59:59.000000000 +0400
@@ -121,7 +121,7 @@
   DBUG_ENTER("my_large_malloc_int");
 
   /* Align block size to my_large_page_size */
-  size = ((size - 1) & ~(my_large_page_size - 1)) + my_large_page_size;
+  size = ((size - 1) & ~((size_t)my_large_page_size - 1)) + my_large_page_size;
   
   shmid = shmget(IPC_PRIVATE, size, SHM_HUGETLB | SHM_R | SHM_W);
   if (shmid < 0)
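
A minimal sketch of why the cast matters, not the actual MySQL code. It assumes that my_large_page_size is a 32-bit unsigned integer and that size is a 64-bit size_t, which is what the cast in the patch suggests. The function names are hypothetical, and uint32_t / uint64_t stand in for those two types so the behaviour does not depend on the build platform. It also assumes that the size passed to the allocator is roughly the configured key_buffer size.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: uint64_t stands in for a 64-bit size_t and uint32_t for
   the type of my_large_page_size; all function names are hypothetical. */

/* Rounding as in the unpatched line: the mask is built in 32 bits. */
uint64_t align_unpatched(uint64_t size, uint32_t page_size)
{
  /* ~(page_size - 1) is a 32-bit value; zero-extending it to 64 bits
     leaves bits 32..63 of the mask clear, so size loses its high bits. */
  return ((size - 1) & ~(page_size - 1)) + page_size;
}

/* Rounding as in the patched line: widen first, then build the mask. */
uint64_t align_patched(uint64_t size, uint32_t page_size)
{
  return ((size - 1) & ~((uint64_t)page_size - 1)) + page_size;
}

/* Checks for 2 MB huge pages and the 4G / 5G sizes from the steps above. */
void check_alignment(void)
{
  const uint32_t page = 2u << 20;   /* 2 MB */
  const uint64_t gb = 1ull << 30;   /* 1 GB */

  assert(align_unpatched(4 * gb, page) == 4 * gb); /* size - 1 fits in 32 bits */
  assert(align_unpatched(5 * gb, page) == 1 * gb); /* upper 4 GB dropped */
  assert(align_patched(5 * gb, page) == 5 * gb);   /* full size preserved */
}
```

Under these assumptions a 5G request is rounded to 1G before shmget() is called, which would be consistent with HugePages_Rsvd being lower for key_buffer=5G than for 4G, and presumably with the crash once the undersized segment has been used up.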
[5 Oct 2008 9:57] Maksim Nikulin
It seems approx. 900 MB of memory is enough to repeat the error: set key_buffer=4995M, then only one 2 MB HugePage is enough to start mysql.
[7 Oct 2008 12:33] Maksim Nikulin
With key_cache_block_size=16K and key_buffer=4153M my mysql uses 60 MB of memory, but the error still happens.
[16 Jan 2009 0:59] Serge Knystautas
We are hitting the exact same error.  Environment is RHEL5 x64.  We have reproduced this with both MySQL 5.1.26 and 5.1.30.

Part of our log message was as follows:

================================================
key_buffer_size=32768
read_buffer_size=2097152
max_used_connections=69
max_threads=250
threads_connected=35
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1026571 K
bytes of memory
================================================
We've halved the read_buffer_size and sort_buffer_size to get below the potential threshold. We'll see if this prevents the problem. Unfortunately this is happening intermittently in a production environment and we have not been able to reproduce it reliably.
[23 Jan 2009 9:37] Maksim Nikulin
The size of read_buffer or sort_buffer doesn't matter.
The problem is reproduced reliably each time "cat /proc/meminfo | grep HugePages_Rsvd" becomes zero, and it doesn't matter what type of load is applied to the key_buffer.
Have you tried my patch?
[17 Aug 2009 10:06] Susanne Ebrecht
Bug #43606 could be related to this bug report.
[21 Aug 2009 5:01] Susanne Ebrecht
I will set this as a duplicate of bug #43606 because a patch is given in the other bug report.
[21 Aug 2009 9:48] Maksim Nikulin
Yes, that patch is more complicated than mine; maybe someone will add it to mainstream at long last.