| Bug #4872 | Can't start mysqld as root when using multiple nss sources | ||
|---|---|---|---|
| Submitted: | 3 Aug 2004 14:53 | Modified: | 25 Feb 2005 19:31 |
| Reporter: | Alexandre Boeglin | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server | Severity: | S2 (Serious) |
| Version: | 4.0.18, verified on 4.1.5 | OS: | Linux (Linux (or actually glibc?)) |
| Assigned to: | Jim Winstead | CPU Architecture: | Any |
[20 Aug 2004 12:20]
Hartmut Holzgraefe
I couldn't reproduce the problem, but my setup is slightly different. I tested 4.0.20 on SuSE 9.0 with these nsswitch settings:
passwd: files nis
shadow: files nis
group: files nis
Could you try 4.0.20 to see if the problem persists?
[20 Aug 2004 14:45]
Alexandre Boeglin
Okay, I tried with the Mandrake Cooker packages MySQL-common-4.0.20-3mdk.i586.rpm and MySQL-Max-4.0.20-3mdk.i586.rpm. I got exactly the same result. By the way, I don't think your nss setup will be taken into account if you don't have nis-specific fields in /etc/passwd and /etc/group. So, to reproduce this bug, you may have to set up a real ldap or nis server. Regards, Alex
[20 Aug 2004 17:10]
Hartmut Holzgraefe
Can you add your complete strace log so that I can compare it to mine? Btw: the geteuid32() call happens *before* setuid(...) in my strace, so it is expected to return 0.
[24 Aug 2004 14:35]
Alexandre Boeglin
strace output of the command "strace mysqld-max -u mysql"
Attachment: strace.log.bz2 (application/x-bzip, text), 4.28 KiB.
[24 Aug 2004 14:36]
Alexandre Boeglin
Okay, I attached a file containing the strace log. Regards, Alex.
[28 Sep 2004 10:04]
Hartmut Holzgraefe
Ok, I have now verified the crash on some of my local server installations. A running LDAP server is not needed to reproduce the bug: the server binaries that show the problem crash as soon as LDAP is added to the NSS configuration. I have no idea yet which difference between my installations causes only some of them to crash. In particular, I haven't been able to create a debug binary that crashes, so I can't yet single-step through the code to identify the actual cause. I'm trying out different combinations of configure options right now to create a debuggable binary that actually crashes. As soon as I have succeeded in that, it should be easy to identify and fix the cause of the crash.
[17 Oct 2004 22:42]
Hartmut Holzgraefe
I have now been able to produce a crashing debug binary using the following configure line on freshly unpacked 4.1.5 source:

CC='gcc' CFLAGS='-O1 -g' CXX='gcc' CXXFLAGS='-O1 -g' LDFLAGS='' ASFLAGS='' ./configure '--prefix=/usr/local/mysql' '--enable-assembler' '--with-extra-charsets=complex' '--enable-thread-safe-client' '--with-readline' '--enable-local-infile' --with-debug '--with-mysqld-ldflags=-all-static' '--with-client-ldflags=-all-static'

All my recent test builds crashed when configured as static binaries with the 'all-static' ldflag options, but I'm pretty sure I had dynamically linked builds crashing in the past, too (sorry, I lost the logs I took for these older builds).

The actual crash happens when set_user() in sql/mysqld.cc calls the libc function initgroups(). The parameters passed to initgroups() look perfectly valid, so the actual problem is either within glibc, libnss_ldap, libldap or (IMHO most likely) related to nss shared library handling.

gdb isn't able to create a backtrace after the crash.
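For readers unfamiliar with the code path described above, here is a minimal sketch of the usual root-to-user privilege drop in C. It is not the actual set_user() from sql/mysqld.cc; the helper name drop_privileges() and its error handling are assumptions made for illustration. It only shows where initgroups() sits in the sequence and why it touches every "group:" source listed in /etc/nsswitch.conf.

```c
#define _DEFAULT_SOURCE   /* exposes initgroups() on glibc in strict -std modes */
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not MySQL's set_user()): switch the process from
 * root to `user`. Returns 0 on success, -1 on any failure.
 */
int drop_privileges(const char *user)
{
    struct passwd *pw = getpwnam(user);   /* NSS "passwd:" lookup */
    uid_t uid;
    gid_t gid;

    if (pw == NULL) {
        fprintf(stderr, "unknown user '%s'\n", user);
        return -1;
    }

    /* Copy the ids out: getpwnam() returns a pointer to static storage. */
    uid = pw->pw_uid;
    gid = pw->pw_gid;

    /*
     * initgroups() asks every NSS "group:" source for the user's
     * supplementary groups. This is the call reported to segfault.
     */
    if (initgroups(user, gid) != 0) {
        perror("initgroups");
        return -1;
    }
    if (setgid(gid) != 0) {               /* group first, while still root */
        perror("setgid");
        return -1;
    }
    if (setuid(uid) != 0) {               /* finally give up root */
        perror("setuid");
        return -1;
    }
    return 0;
}
```

A caller running as root would invoke drop_privileges("mysql") once during startup, before opening any data files. When the process is not root, initgroups() normally fails with EPERM and the helper returns -1.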
[17 Oct 2004 22:52]
Hartmut Holzgraefe
Some Google results regarding "nss initgroups segfault":
This one supports my theory regarding static builds:
http://lists.gnu.org/archive/html/bug-parted/2001-08/msg00116.html
Zope seems to be suffering from this, too:
http://gossamer-threads.com/lists/zope/users/173947?search_string=initgroups;#173947
http://mail.zope.org/pipermail/zope-collector-monitor/2004-August/003985.html
A similar problem with uClibc; maybe the same code is in glibc?
http://www.uclibc.org/lists/uclibc/2002-July/003998.html
The glibc bugzilla didn't show any entries when searching for initgroups.
[17 Oct 2004 23:08]
Hartmut Holzgraefe
I've now tried the other nss methods available on my system and none of them crashed, so it seems to be an LDAP-only problem. As a first workaround I would suggest temporarily changing the segfault signal handler when calling initgroups(), so that we can at least bail out with a meaningful error message that recommends either dropping ldap from /etc/nsswitch.conf or starting as user 'mysql' instead of root right away. Btw: our x86_64 startup crash problems might be related to this, too. Having a real error message would help to verify that as well.
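The workaround suggested above could look roughly like the following C sketch. This is not the patch that was later pushed; the function names guarded_initgroups() and initgroups_segv_handler() and the message text are hypothetical. The idea is to install a temporary SIGSEGV handler with sigaction(), call initgroups(), and restore the previous handler afterwards.

```c
#define _DEFAULT_SOURCE   /* exposes initgroups() on glibc in strict -std modes */
#include <grp.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handler: only async-signal-safe calls (write, _exit). */
static void initgroups_segv_handler(int sig)
{
    static const char msg[] =
        "segfault inside initgroups(): remove ldap from /etc/nsswitch.conf "
        "or start the server as the target user instead of root\n";
    ssize_t written;

    (void)sig;
    written = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)written;
    _exit(1);
}

/*
 * Hypothetical wrapper: run initgroups() with a temporary SIGSEGV handler,
 * then put the previous handler back. Returns initgroups()'s result, or -1
 * if the handler could not be installed.
 */
int guarded_initgroups(const char *user, gid_t gid)
{
    struct sigaction guard, saved;
    int rc;

    memset(&guard, 0, sizeof(guard));
    guard.sa_handler = initgroups_segv_handler;
    sigemptyset(&guard.sa_mask);

    if (sigaction(SIGSEGV, &guard, &saved) != 0)
        return -1;

    rc = initgroups(user, gid);

    /* Restore whatever crash handler was active before the call. */
    sigaction(SIGSEGV, &saved, NULL);
    return rc;
}
```

The handler exits instead of returning, because returning from a SIGSEGV handler would re-execute the faulting instruction. Restoring the saved action keeps the server's normal crash reporting intact for the rest of its lifetime.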
[18 Oct 2004 10:27]
Ingo Strüwing
Some time ago I found a similar problem on Debian. It turned out that an entry of 'db' in /etc/nsswitch.conf activates an nss library that contains Sleepycat's Berkeley DB, which is also contained in MySQL Max. The two versions of Berkeley DB in the same executable interfered with each other. But I do not see an exact match in this case, as it happens only with 'ldap' and not with 'db'. It would be nice to have a stack backtrace. This might inspire an idea of what's going on.
[24 Nov 2004 17:23]
Lenz Grimmer
BUG#3037 was marked as a duplicate of this bug.
[16 Feb 2005 20:07]
Jim Winstead
This is a bug in glibc's NSS support when linked statically. It can be avoided by not linking statically, not having LDAP in nsswitch.conf, or using a newer glibc with nscd. The patch outputs a message to this effect when a segfault occurs during the call to initgroups().
[23 Feb 2005 0:51]
Jim Winstead
Fix pushed, will be in 4.1.11.
[25 Feb 2005 19:31]
Paul DuBois
Noted in 4.1.11 changelog.
[1 Dec 2005 11:47]
Bernardo Innocenti
4.0.26 still seems to be affected. Is it possible to backport the fix to the 4.0 branch?

Description:
System is an updated (2004-08-03) Mandrake 10.0.

[alex@dls alex]$ uname -a
Linux dls.nexedi.org 2.6.3-15mdk #1 Fri Jul 2 22:09:29 MDT 2004 i686 unknown unknown GNU/Linux
[alex@dls alex]$ rpm -q MySQL-Max
MySQL-Max-4.0.18-1.1.100mdk
[alex@dls alex]$ ldd /usr/sbin/mysqld-max
linux-gate.so.1 => (0xffffe000)
librt.so.1 => /lib/tls/librt.so.1 (0x4002b000)
libdl.so.2 => /lib/libdl.so.2 (0x40040000)
libssl.so.0.9.7 => /usr/lib/libssl.so.0.9.7 (0x40043000)
libcrypto.so.0.9.7 => /usr/lib/libcrypto.so.0.9.7 (0x40075000)
libz.so.1 => /lib/libz.so.1 (0x40177000)
libcrypt.so.1 => /lib/libcrypt.so.1 (0x40188000)
libnsl.so.1 => /lib/libnsl.so.1 (0x401b5000)
libpthread.so.0 => /lib/tls/libpthread.so.0 (0x401c9000)
libstdc++.so.5 => /usr/lib/libstdc++.so.5 (0x401d9000)
libm.so.6 => /lib/tls/libm.so.6 (0x40299000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x402bc000)
libc.so.6 => /lib/tls/libc.so.6 (0x402c5000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

Here is an extract from my /etc/nsswitch.conf:
passwd: files ldap
shadow: files ldap
group: files ldap

I have system users (including mysql) in /etc/passwd and normal users in a working ldap directory. A "getent passwd" gives me both system and normal users, and every other part of my system works like a charm, so I assume it's a bug in mysql.

When starting mysqld as root and watching it with strace, I can clearly see it looking for nss data, first in files, then in ldap. But then mysqld does a geteuid32(), which returns 0 instead of the uid of the mysql user, and it exits with a signal 11.

When starting mysqld as root with nss_ldap disabled, or when starting it logged in as mysql, there is no problem.

Of course, I'm available for further info or tests.

How to repeat:
Start mysqld as root:
# strace /usr/sbin/mysqld-max -u mysql
[...] (mainly opening libraries, and looking for nss infos)
getpid() = 24651
geteuid32() = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
write(2, "mysqld got signal 11;\nThis could"..., 248mysqld got signal 11;
[...] (displaying crash message, and exiting)

Suggested fix:
For the moment, I added a "sudo -u mysql" in my init script on the line that launches mysqld_safe, but the nss code needs to be fixed.