Bug #4572 --disable-largefile fixes setrlimit() related server crash on startup
Submitted: 16 Jul 2004 2:29
Modified: 23 Sep 2004 19:14
Reporter: Martin Mokrejs
Status: No Feedback
Category: MySQL Server
Severity: S1 (Critical)
Version: 4.0.20
OS: i686 Linux 2.4.26 & 2.6.7-bk20
Assigned to: Assigned Account
CPU Architecture: Any

[16 Jul 2004 2:29] Martin Mokrejs
Description:
ribosome local # ulimit -n 1034000
ribosome local # ulimit -n 4096000
-bash: ulimit: open files: cannot modify limit: Operation not permitted
ribosome local # sysctl -w fs.file-max=4096000
fs.file-max = 4096000
ribosome local # uname -a
Linux ribosome 2.4.26-pre3 #2 SMP Mon Mar 15 00:56:24 CET 2004 i686 Intel(R) Pentium(R) 4 CPU 2.60GHz GenuineIntel GNU/Linux
ribosome local # mysql-debug-4.0.20-pc-linux-i686/bin/mysqld
040716  1:33:24  Warning: Asked for 196608 thread stack, but got 126976
040716  1:33:24  Warning: setrlimit couldn't increase number of open files to more than 1034000 (request: 1048596)
ribosome local # 

Somehow the debug binary above doesn't crash, although it still exits. The standard binary does crash:

ribosome local # /usr/local/mysql-standard-4.0.20-pc-linux-i686-icc/bin/mysqld
040716  1:34:53  Warning: setrlimit couldn't increase number of open files to more than 1034000 (request: 1048596)
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8388608
read_buffer_size=1044480
max_used_connections=0
max_connections=10
threads_connected=0
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 28631 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x84a4d10
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Stack range sanity check OK, backtrace follows:
Stack trace seems successful - bottom reached
Please read http://www.mysql.com/doc/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do 
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x48474645  is invalid pointer
thd->thread_id=1885299813
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.
ribosome local # 

./configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share  --sysconfdir=/etc --localstatedir=/var/lib --libexecdir=/usr/sbin --sysconfdir=/etc/mysql --localstatedir=/var/lib/mysql --with-raid --with-low-memory --enable-assembler --with-charset=latin1 --enable-local-infile --with-mysqld-user=mysql --with-extra-charsets=all --enable-thread-safe-client --with-client-ldflags=-lstdc++ --with-comment='Gentoo Linux mysql-4.0.20' --with-unix-socket-path=/var/run/mysqld/mysqld.sock --with-embedded-server --with-berkeley-db=./bdb --without-readline --enable-shared  --enable-static --with-libwrap --with-innodb --with-vio --with-openssl --without-debug

I have seen this bug many times, but never managed to get more details on it. This time I was inspired by this thread: http://lists.mysql.com/mysql/93051

How to repeat:
Reproduced on two Linux hosts, running kernels 2.4.26-pre3 and 2.6.7-bk20. Tested on both hosts with glibc 2.3.3 and 2.3.4; no change. Tried mysql-4.0.18 and 4.0.20 on both.

Appending --disable-largefile to the configure line above fixes the mysqld server crash on one host. The setrlimit warning persists, and on the 2.4.26-pre3 host the server still exits. strace on that host shows:
getrlimit(RLIMIT_NOFILE, {rlim_cur=1034000, rlim_max=1034000}) = 0
setrlimit(RLIMIT_NOFILE, {rlim_cur=1048596, rlim_max=1048596}) = -1 EPERM (Operation not permitted)

On the 2.6.7-bk20 host, mysqld works when built with --disable-largefile:
getrlimit(RLIMIT_NOFILE, {rlim_cur=1034000, rlim_max=1034000}) = 0
setrlimit(RLIMIT_NOFILE, {rlim_cur=33554452, rlim_max=33554452}) = -1 EPERM (Operation not permitted)

Compare with standard-4.0.18 official binaries (I believe with largefile enabled) on the 2.6.7-bk20 host:
getrlimit(RLIMIT_NOFILE, {rlim_cur=1034000, rlim_max=1034000}) = 0
setrlimit(RLIMIT_NOFILE, {rlim_cur=33554452, rlim_max=33554452}) = -1 EPERM (Operation not permitted)
time(NULL)                              = 1089936971
write(2, "040716  2:16:11  ", 17040716  2:16:11  )       = 17
write(2, "Warning: setrlimit couldn\'t incr"..., 98Warning: setrlimit couldn't increase number of open files to more than 1034000 (request: 33554452)) = 98
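
For illustration, the getrlimit()/setrlimit() sequence in the strace output above can be sketched in C. This is not the mysqld code: the helper names clamp_to_hard_limit and raise_open_files are made up for this sketch, and capping the request at the hard limit is an assumption about one possible way to avoid the EPERM, not something the server is known to do.

```c
#include <sys/resource.h>

/* Hypothetical helper: cap a requested descriptor count at the hard limit. */
static rlim_t clamp_to_hard_limit(rlim_t wanted, rlim_t hard)
{
    return wanted > hard ? hard : wanted;
}

/* Hypothetical helper: try to raise the soft RLIMIT_NOFILE to `wanted`
 * without touching the hard limit. Returns the soft limit in effect
 * afterwards, or 0 if the current limits could not be read. */
static rlim_t raise_open_files(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return 0;

    /* Asking for more than rl.rlim_max is what the trace shows failing. */
    rlim_t target = clamp_to_hard_limit(wanted, rl.rlim_max);

    if (target > rl.rlim_cur) {
        struct rlimit new_rl = rl;
        new_rl.rlim_cur = target;          /* raise the soft limit only */
        if (setrlimit(RLIMIT_NOFILE, &new_rl) == 0)
            rl = new_rl;                   /* keep the old value on failure */
    }
    return rl.rlim_cur;
}
```

With the numbers from the trace, clamp_to_hard_limit(1048596, 1034000) yields 1034000, so the setrlimit() call would stay within the existing hard limit instead of requesting a new one.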

Suggested fix:
1. First of all, mysqld_safe should use sysctl(1) if available.

2. Someone should figure out where the number 1048596 comes from:
# /usr/sbin/mysqld || echo "died on 2.4.26-pre3 host"
040716  2:14:03  Warning: setrlimit couldn't increase number of open files to more than 1034000 (request: 1048596)
died on 2.4.26-pre3 host

/etc/my.cnf specifies a maximum of 30000 files. I suspect an overflow caused by a wrong integer size, or something similar.
[23 Aug 2004 19:14] Hartmut Holzgraefe
The value is calculated like this:

    uint wanted_files=10+(uint) max(max_connections*5,
                                    max_connections+table_cache_size*2);
    set_if_bigger(wanted_files, open_files_limit);

So I can only imagine that you have set either max_connections
or table_cache_size to a ridiculously high value?
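
As a worked check of the calculation quoted above, here is a small C sketch. The function name wanted_files_for and its parameters are made up for illustration, and set_if_bigger is assumed to raise its first argument to the second when the second is larger. With max_connections=10 (the value printed in the crash output), a table_cache_size of 524288 would reproduce the request of 1048596 exactly, and 16777216 would reproduce 33554452. Those table_cache_size values are inferred from the arithmetic only; the reporter never confirmed them. The 30000 is assumed to be the open_files_limit from /etc/my.cnf.

```c
/* Mirrors the two quoted lines; name and signature are hypothetical. */
static unsigned int wanted_files_for(unsigned int max_connections,
                                     unsigned int table_cache_size,
                                     unsigned int open_files_limit)
{
    unsigned int by_connections = max_connections * 5;
    unsigned int by_tables = max_connections + table_cache_size * 2;

    /* 10 + max(max_connections*5, max_connections + table_cache_size*2) */
    unsigned int wanted = 10 + (by_connections > by_tables ? by_connections
                                                           : by_tables);

    /* Assumed meaning of set_if_bigger(wanted_files, open_files_limit) */
    if (wanted < open_files_limit)
        wanted = open_files_limit;

    return wanted;
}

/* wanted_files_for(10, 524288, 30000)   gives 1048596  */
/* wanted_files_for(10, 16777216, 30000) gives 33554452 */
```

Both requests seen in the logs have the form 10 + 10 + 2 * (a power of two), which fits a very large table_cache_size better than an integer overflow.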
[14 Feb 2005 22:54] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".