Bug #4872 | Can't start mysqld as root when using multiple nss sources | ||
---|---|---|---|
Submitted: | 3 Aug 2004 14:53 | Modified: | 25 Feb 2005 19:31 |
Reporter: | Alexandre Boeglin | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server | Severity: | S2 (Serious) |
Version: | 4.0.18, verified on 4.1.5 | OS: | Linux (or actually glibc?) |
Assigned to: | Jim Winstead | CPU Architecture: | Any |
[3 Aug 2004 14:53]
Alexandre Boeglin
[20 Aug 2004 12:20]
Hartmut Holzgraefe
I couldn't reproduce the problem, but my setup is slightly different. I tested 4.0.20 on SuSE 9.0 with these nsswitch settings:

passwd: files nis
shadow: files nis
group: files nis

Could you try 4.0.20 to see if the problem persists?
[20 Aug 2004 14:45]
Alexandre Boeglin
Okay, I tried with the Mandrake Cooker packages MySQL-common-4.0.20-3mdk.i586.rpm and MySQL-Max-4.0.20-3mdk.i586.rpm. I got exactly the same result. Btw, I don't think your nss setup will be taken into account if you don't have nis-specific fields in /etc/passwd and /etc/group. So, to reproduce this bug, you may have to set up a real ldap or nis server. Regards, Alex
[20 Aug 2004 17:10]
Hartmut Holzgraefe
Can you add your complete strace log so that I can compare it to mine? Btw: the geteuid32() call happens *before* setuid(...) in my strace, so it is expected to return 0.
[24 Aug 2004 14:35]
Alexandre Boeglin
strace output of the command "strace mysqld-max -u mysql"
Attachment: strace.log.bz2 (application/x-bzip, text), 4.28 KiB.
[24 Aug 2004 14:36]
Alexandre Boeglin
Okay, I attached a file containing the strace log. Regards, Alex.
[28 Sep 2004 10:04]
Hartmut Holzgraefe
Ok, I have now verified the crash on some of my local server installations. A running LDAP server is not needed to reproduce the bug: those server binaries that show the problem crash as soon as LDAP is added to the NSS configuration. I have no idea yet what difference between my installations causes only some of them to crash. In particular, I haven't yet been able to create a debug binary that crashes, so I can't single-step through the code to identify the actual cause. I'm trying out different combinations of configure options right now to create a debuggable binary that actually crashes. As soon as I have succeeded in that, it should be easy to identify and fix the cause of the crash.
[17 Oct 2004 22:42]
Hartmut Holzgraefe
I have now been able to produce a crashing debug binary using the following configure line on freshly unpacked 4.1.5 source:

CC='gcc' CFLAGS='-O1 -g' CXX='gcc' CXXFLAGS='-O1 -g' LDFLAGS='' ASFLAGS='' ./configure '--prefix=/usr/local/mysql' '--enable-assembler' '--with-extra-charsets=complex' '--enable-thread-safe-client' '--with-readline' '--enable-local-infile' --with-debug '--with-mysqld-ldflags=-all-static' '--with-client-ldflags=-all-static'

All my recent test builds crashed when configured as static binaries with the 'all-static' ldflag options, but I'm pretty sure I had dynamically linked binaries crashing in the past, too (sorry, I lost the logs I took for these older builds).

The actual crash happens when set_user() in sql/mysqld.cc calls the libc function initgroups(). The parameters passed to initgroups() look perfectly valid, so the actual problem is either within glibc, libnss_ldap, libldap or (IMHO most likely) related to NSS shared library handling.

gdb isn't able to create a backtrace after the crash.
[17 Oct 2004 22:52]
Hartmut Holzgraefe
Some Google results regarding "nss initgroups segfault":

This one seconds my theory regarding static builds:
http://lists.gnu.org/archive/html/bug-parted/2001-08/msg00116.html

Zope seems to be suffering from this, too:
http://gossamer-threads.com/lists/zope/users/173947?search_string=initgroups;#173947
http://mail.zope.org/pipermail/zope-collector-monitor/2004-August/003985.html

A similar problem with µlibc, maybe the same code is in glibc?
http://www.uclibc.org/lists/uclibc/2002-July/003998.html

The glibc bugzilla didn't show any entries when searching for initgroups.
[17 Oct 2004 23:08]
Hartmut Holzgraefe
I've now tried the other NSS methods available on my system and none of these crashed, so it seems to be an LDAP-only problem. As a first workaround I would suggest temporarily changing the segfault signal handler when calling initgroups(), so that we can at least bail out with a meaningful error message that recommends either dropping ldap from /etc/nsswitch.conf or starting as user 'mysql' instead of root right away. Btw: our x86_64 startup crash problems might be related to this, too. Having a real error message would help to verify that as well.
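The workaround suggested above (a temporary SIGSEGV handler around the initgroups() call, so that startup can stop with a useful message) could be sketched in C roughly as follows. This is an illustration under assumptions, not the server's code: call_guarded(), on_segv(), do_initgroups(), simulate_crash() and no_crash() are hypothetical names, and simulate_crash() uses raise(SIGSEGV) as a stand-in for the real crash. After a genuine segfault the process state cannot be trusted, so a caller should treat a return value of 1 as a reason to exit, not to continue.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes initgroups() on glibc */
#endif
#include <assert.h>
#include <grp.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

static sigjmp_buf guard_env;

/* Temporary SIGSEGV handler: jump back into call_guarded(). */
static void on_segv(int sig)
{
    (void)sig;
    siglongjmp(guard_env, 1);
}

/*
 * Hypothetical helper: run fn(arg) with a temporary SIGSEGV handler.
 * Returns 0 if fn returned normally, 1 if SIGSEGV was caught (msg is
 * written to stderr), and -1 if the handler could not be installed.
 */
int call_guarded(void (*fn)(void *), void *arg, const char *msg)
{
    struct sigaction sa, old;
    volatile int crashed = 0;

    assert(fn != NULL && msg != NULL);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(guard_env, 1) == 0) {
        fn(arg);
    } else {
        /* write() is async-signal-safe, unlike stdio. */
        if (write(STDERR_FILENO, msg, strlen(msg)) < 0) {
            /* nothing more we can do here */
        }
        crashed = 1;
    }

    sigaction(SIGSEGV, &old, NULL); /* restore the previous handler */
    return crashed;
}

/* Argument bundle and adapter so initgroups() fits the callback shape. */
struct initgroups_args {
    const char *user;
    gid_t gid;
    int result;
};

void do_initgroups(void *p)
{
    struct initgroups_args *a = p;
    a->result = initgroups(a->user, a->gid);
}

/* Stand-ins used only to demonstrate the guard. */
void simulate_crash(void *unused)
{
    (void)unused;
    raise(SIGSEGV);
}

void no_crash(void *unused)
{
    (void)unused;
}
```

A caller would fill a struct initgroups_args, pass do_initgroups to call_guarded() together with a message recommending the two remedies named above, and exit if the return value is 1.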
[18 Oct 2004 10:27]
Ingo Strüwing
Some time ago I found a similar problem on Debian. It turned out that an entry of 'db' in /etc/nsswitch.conf activates an NSS library that contains Sleepycat's Berkeley DB, which is also contained in MySQL Max. The two versions of Berkeley DB in the same executable interfered with each other. But I do not see an exact match in this case, as it happens only with 'ldap' and not with 'db'. It would be nice to have a stack backtrace. This might inspire an idea about what's going on.
[24 Nov 2004 17:23]
Lenz Grimmer
BUG#3037 was marked as a duplicate of this bug.
[16 Feb 2005 20:07]
Jim Winstead
This is a bug in glibc's NSS support when linked statically. This can be avoided by not linking statically, not having LDAP in nsswitch.conf, or using a newer glibc with nscd. The patch outputs a message to this effect when a segfault occurs during the call to initgroups().
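One of the ways out listed above is to keep LDAP out of nsswitch.conf. A small C helper along the following lines could check whether a single nsswitch.conf line lists a given source for a given database. It is only a sketch: nss_line_uses_source() is a hypothetical name, the parser handles just the common "database: source [action] source" layout, and which databases initgroups() actually consults depends on the glibc version.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if `line` is an nsswitch.conf entry for
 * database `db` (e.g. "group") that lists `source` (e.g. "ldap"), else 0.
 * Text after '#' is treated as a comment; [action] items are skipped.
 */
int nss_line_uses_source(const char *line, const char *db, const char *source)
{
    assert(line != NULL && db != NULL && source != NULL);

    size_t db_len = strlen(db);
    size_t src_len = strlen(source);
    const char *p = line;

    /* Skip leading whitespace, then require "<db>:". */
    while (*p == ' ' || *p == '\t')
        p++;
    if (strncmp(p, db, db_len) != 0 || p[db_len] != ':')
        return 0;
    p += db_len + 1;

    for (;;) {
        while (*p == ' ' || *p == '\t')
            p++;
        if (*p == '[') {
            /* Skip an action item such as [NOTFOUND=return]. */
            while (*p != '\0' && *p != ']')
                p++;
            if (*p == ']')
                p++;
            continue;
        }
        /* Length of the next source name; 0 at end of line or comment. */
        size_t n = strcspn(p, " \t\n#[");
        if (n == 0)
            return 0;
        if (n == src_len && strncmp(p, source, n) == 0)
            return 1;
        p += n;
    }
}
```

Running this over each line of /etc/nsswitch.conf for the "passwd" and "group" databases would show whether a host has the kind of configuration discussed in this report.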
[23 Feb 2005 0:51]
Jim Winstead
Fix pushed, will be in 4.1.11.
[25 Feb 2005 19:31]
Paul DuBois
Noted in 4.1.11 changelog.
[1 Dec 2005 11:47]
Bernardo Innocenti
4.0.26 still seems to be affected. Is it possible to backport the fix to the 4.0 branch?