Bug #27056 mysql_install_db fails to initialize the 'mysql' db tables
Submitted: 12 Mar 2007 17:44
Modified: 2 Apr 2007 15:05
Reporter: George Magklaras
Status: Won't fix
Category: MySQL Server: Installing
Severity: S2 (Serious)
Version: 5.0.37, 5.0.36, 5.0-bk
OS: Linux (Linux 2.6 x86_64)
Assigned to: Daniel Fischer
CPU Architecture: Any
Tags: mysql_install

[12 Mar 2007 17:44] George Magklaras
Description:
After compiling cleanly from the source distribution as user root (no errors, all binaries and libs in place) in the environment described in the ENV DETAILS section below, I run mysql_install_db --user=auser from the installation's bin dir to initialize the database. The script exits without creating any tables under /var and prints the following error:

---------------------------
Installing all prepared tables
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=258048
max_used_connections=0
max_connections=100
threads_connected=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 92783 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
frame pointer is NULL, did you compile with
-fomit-frame-pointer? Aborting backtrace!
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.
Installation of system tables failed!

Examine the logs in /mn/biotin/storage/mysql/var for more information.
.....

------------------------------------

The same thing occurs if I su to 'auser' and run the script. Increasing the verbosity level of the mysql_install_db script reveals that the process stops at this point:

biotin.uio.no# bin/mysql_install_db --verbose --user=auser
Installing all prepared tables
Preparing db table
Preparing host table
Preparing user table
Preparing func table
Preparing tables_priv table
Preparing columns_priv table
Preparing procs_priv table
Preparing help_topic table
Preparing help_category table
Preparing help_keyword table
Preparing help_relation table
Preparing time_zone_name table
Preparing time_zone table
Preparing time_zone_transition table
Preparing time_zone_transition_type table
Preparing time_zone_leap_second table
#### -> it stops here and sometimes it hangs

If I then fish for mysql processes on the same system, while the script hangs, I see:

auser    4836  0.0  0.1 32140 4832 pts/1    S+   17:27   0:00 /mn/biotin/storage/mysql/libexec/mysqld --bootstrap --skip-grant-tables --basedir=/mn/biotin/storage/mysql --datadir=/mn/biotin/storage/mysql/var --skip-innodb --skip-bdb --skip-ndbcluster --user=sabryr --max_allowed_packet=8M --net_buffer_length=16K
root      4837  0.0  0.1 32140 4832 pts/1    S+   17:27   0:00 /mn/biotin/storage/mysql/libexec/mysqld --bootstrap --skip-grant-tables --basedir=/mn/biotin/storage/mysql --datadir=/mn/biotin/storage/mysql/var --skip-innodb --skip-bdb --skip-ndbcluster --user=sabryr --max_allowed_packet=8M --net_buffer_length=16K
root      4840  0.0  0.1 32140 4832 pts/1    S+   17:27   0:00 /mn/biotin/storage/mysql/libexec/mysqld --bootstrap --skip-grant-tables --basedir=/mn/biotin/storage/mysql --datadir=/mn/biotin/storage/mysql/var --skip-innodb --skip-bdb --skip-ndbcluster --user=sabryr --max_allowed_packet=8M --net_buffer_length=16K

So the script fails to complete the grant tables.

-------------ENV DETAILS Section--------------------
>Release:       mysql-5.0.37 (Source distribution)

>C compiler:    gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3)
>C++ compiler:  gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3)
>Environment:
        <machine, os, target, libraries (multiple lines)>
System: Linux biotin.uio.no 2.6.9-42.0.10.ELsmp #1 SMP Fri Feb 16 17:13:42 EST 2007 x86_64 x86_64 x86_64 GNU/Linux
Architecture: x86_64

Some paths:  /usr/bin/perl /usr/bin/make /usr/bin/gmake /usr/bin/gcc /usr/bin/cc
GCC: Reading specs from /usr/lib/gcc/x86_64-redhat-linux/3.4.6/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=x86_64-redhat-linux
Thread model: posix
gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)
Compilation info: CC='gcc'  CFLAGS='-O3'  CXX='gcc'  CXXFLAGS='-O3 -felide-constructors -fno-exceptions -fno-rtti'  LDFLAGS='' ASFLAGS=''
LIBC:
lrwxrwxrwx  1 root root 13 Aug 21  2006 /lib/libc.so.6 -> libc-2.3.4.so
-rwxr-xr-x  1 root root 1442931 Jul  4  2006 /lib/libc-2.3.4.so
-rw-r--r--  1 root root 2418632 Jul  4  2006 /usr/lib/libc.a
-rw-r--r--  1 root root 204 Jul  4  2006 /usr/lib/libc.so
-rwxr-xr-x  1 root root 767416 Nov 28  2005 /usr/lib/libc-client.so.0
Configure command: ./configure '--prefix=/mn/biotin/storage/mysql' '--enable-assembler' '--with-mysqld-ldflags=-all-static' 'CFLAGS=-O3' 'CXXFLAGS=-O3 -felide-constructors -fno-exceptions -fno-rtti' 'CXX=gcc'

How to repeat:
1) Get a fully up-to-date RHEL 4 WS x86_64 box.

2) Download the 5.0.37 source tarball and configure it with the suggested options:
CFLAGS="-O3" CXX=gcc CXXFLAGS="-O3 -felide-constructors -fno-exceptions -fno-rtti" ./configure --prefix=/mn/panoptis/u1/georgios/mysql --enable-assembler --with-mysqld-ldflags=-all-static   #or whatever prefix

3) Run make and make install, then ldconfig, and set paths.

4) Copy my-medium.cnf to /etc/my.cnf, set the user, and uncomment the innodb statements (see the sample my.cnf below).

5) As root, run mysql_install_db --verbose --user=auser and watch the script fail to create the grant tables; it stops after preparing the time_zone_leap_second table.

------Sample my.cnf-----------
[client]
#password       = your_password
port            = 3306
socket          = /tmp/mysql.sock

# Here follows entries for some specific programs

# The MySQL server
[mysqld]
port            = 3306
socket          = /tmp/mysql.sock
skip-locking
user            = auser
key_buffer = 16M
max_allowed_packet = 1M
table_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M

# Don't listen on a TCP/IP port at all. This can be a security enhancement,
# if all processes that need to connect to mysqld run on the same host.
# All interaction with mysqld must be made via Unix sockets or named pipes.
# Note that using this option without enabling named pipes on Windows
# (via the "enable-named-pipe" option) will render mysqld useless!
#
#skip-networking

# Replication Master Server (default)
# binary logging is required for replication
log-bin=mysql-bin

# required unique id between 1 and 2^32 - 1
# defaults to 1 if master-host is not set
# but will not function as a master if omitted
server-id       = 1

# Replication Slave (comment out master section to use this)
#
# To configure this host as a replication slave, you can choose between
# two methods :
#
# 1) Use the CHANGE MASTER TO command (fully described in our manual) -
#    the syntax is:
#
#    CHANGE MASTER TO MASTER_HOST=<host>, MASTER_PORT=<port>,
#    MASTER_USER=<user>, MASTER_PASSWORD=<password> ;
#
#    where you replace <host>, <user>, <password> by quoted strings and
#    <port> by the master's port number (3306 by default).
#
#    Example:
#
#    CHANGE MASTER TO MASTER_HOST='125.564.12.1', MASTER_PORT=3306,
#    MASTER_USER='joe', MASTER_PASSWORD='secret';
#
# OR
#
# 2) Set the variables below. However, in case you choose this method, then
#    start replication for the first time (even unsuccessfully, for example
#    if you mistyped the password in master-password and the slave fails to
#    connect), the slave will create a master.info file, and any later
#    change in this file to the variables' values below will be ignored and
#    overridden by the content of the master.info file, unless you shutdown
#    the slave server, delete master.info and restart the slaver server.
#    For that reason, you may want to leave the lines below untouched
#    (commented) and instead use CHANGE MASTER TO (see above)
#
# required unique id between 2 and 2^32 - 1
# (and different from the master)
# defaults to 2 if master-host is set
# but will not function as a slave if omitted
#server-id       = 2
#
# The replication master for this slave - required
#master-host     =   <hostname>
#
# The username the slave will use for authentication when connecting
# to the master - required
#master-user     =   <username>
#
# The password the slave will authenticate with when connecting to
# the master - required
#master-password =   <password>
#
# The port the master is listening on.
# optional - defaults to 3306
#master-port     =  <port>
#
# binary logging - not required for slaves, but recommended
#log-bin=mysql-bin

# Point the following paths to different dedicated disks
#tmpdir         = /tmp/
#log-update     = /path-to-dedicated-directory/hostname

# Uncomment the following if you are using BDB tables
#bdb_cache_size = 4M
#bdb_max_lock = 10000

# Uncomment the following if you are using InnoDB tables
innodb_data_home_dir = /mn/panoptis/u1/georgios/mysql/var/
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /mn/panoptis/u1/georgios/mysql/var/
innodb_log_arch_dir = /mn/panoptis/u1/georgios/mysql/var/
# You can set .._buffer_pool_size up to 50 - 80 %
# of RAM but beware of setting memory usage too high
innodb_buffer_pool_size = 16M
innodb_additional_mem_pool_size = 2M
# Set .._log_file_size to 25 % of buffer pool size
innodb_log_file_size = 5M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash
# Remove the next comment character if you are not familiar with SQL
#safe-updates

[isamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[myisamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[mysqlhotcopy]
interactive-timeout

Suggested fix:
No specific fix suggested, but I find it interesting that the script stops while preparing the time zone data, especially since Red Hat recently patched tzdata for the DST updates. Maybe that is related, or maybe I am missing something else entirely.

[13 Mar 2007 10:00] Sveta Smirnova
Thank you for the report.

Verified as described using the latest community BK tree.

To repeat:
1. Compile with configure as described.
2. Run mysql_install_db with the --user switch. It does not need to be run as root.

Workaround: compile using our BUILD/compile-pentium-debug-max script.

[16 Mar 2007 16:49] Daniel Fischer
The "how to repeat" instructions can mostly be summarized by this:

1. build a statically linked mysqld by using configure option --with-mysqld-ldflags=-all-static
2. run mysqld --bootstrap --skip-innodb --user=a

I was able to reproduce this on RHEL4 and RHEL3 x86_64. It is not necessary to have the latest updates installed.

I was not able to reproduce this on any platform where nsswitch.conf listed no sources other than "files" for passwd data. On every platform where I was able to reproduce it, "ldap" resulted in this crash, and on some, "nis" did too.

I was unable to reproduce it at all on a different system running a much more current version of glibc. The current glibc version is 2.5; RHEL4 is using 2.3.4. I was also unable to reproduce it on architectures other than x86_64.
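
The three factors named above (architecture, glibc version, and the passwd sources in nsswitch.conf) can be read off a host with a few commands. This is a minimal sketch, assuming a POSIX shell and awk; the function names passwd_sources, has_non_files_passwd_source and report_env are made up for illustration and are not part of MySQL or glibc:

```sh
# Print the passwd sources listed in an nsswitch.conf-style file, one per line.
# $1: path to the file (for example /etc/nsswitch.conf)
passwd_sources() {
    awk '
        { sub(/#.*/, "") }                 # drop comments
        /^[ \t]*passwd:/ {
            sub(/^[ \t]*passwd:/, "")      # drop the database name
            gsub(/\[[^]]*\]/, " ")         # drop [STATUS=action] items
            n = split($0, f, /[ \t]+/)
            for (i = 1; i <= n; i++)
                if (f[i] != "") print f[i]
        }
    ' "$1"
}

# Succeed when passwd lists any source other than "files" (e.g. ldap, nis).
has_non_files_passwd_source() {
    passwd_sources "$1" | grep -qvx 'files'
}

# Print the values relevant to this report for the current host.
report_env() {
    conf="${1:-/etc/nsswitch.conf}"
    echo "arch:  $(uname -m)"
    echo "glibc: $(getconf GNU_LIBC_VERSION 2>/dev/null || echo unknown)"
    echo "passwd sources: $(passwd_sources "$conf" | tr '\n' ' ')"
}
```

Calling report_env prints the three values. has_non_files_passwd_source /etc/nsswitch.conf succeeds when passwd lists anything besides "files", which is the condition under which the crash was reproduced.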

The cause of this server crash is, with significant probability, memory corruption resulting from dynamically loading certain name service switch libraries. Even in a fully statically linked binary, glibc will dynamically load libnss* as required using dlopen. Glibc has a history of not properly supporting dlopen called from statically linked code; on other platforms, this is not supported at all. If I'm not mistaken, glibc explicitly restricts this support to libnss*. However, it is possible that glibc 2.3.4 still does not handle this case correctly, and using libc functions that rely on nss from statically linked code has other drawbacks (mostly that this code may break on hosts using a different version of glibc, i.e. one with incompatible libnss* shared objects).

In conclusion, at this point I can offer the following workarounds:
- do not use statically linked mysqld
- do not use any feature that uses nss through getpwnam() and company, i.e. do not use the --user switch when installing the database
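
The first workaround amounts to repeating the reporter's build with --with-mysqld-ldflags=-all-static removed. This is a minimal sketch, assuming it is run from the top of the unpacked source tree; rebuild_dynamic is a hypothetical helper name, and the prefix is whatever was used for the original build:

```sh
# Configure and build mysqld with the compiler settings from the report,
# but without --with-mysqld-ldflags=-all-static, so that mysqld is linked
# dynamically.
# $1: installation prefix (the report used /mn/biotin/storage/mysql)
rebuild_dynamic() {
    CFLAGS="-O3" CXX=gcc \
    CXXFLAGS="-O3 -felide-constructors -fno-exceptions -fno-rtti" \
    ./configure --prefix="$1" --enable-assembler &&
    make && make install
}
```

For example: rebuild_dynamic /mn/biotin/storage/mysql, followed by the same mysql_install_db --user=auser call as before.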

[19 Mar 2007 17:14] George Magklaras
Thanks. Indeed, the combination of dlopen (I am using NIS) and statically compiled libs reproduces the error, and when you take the static lib config off there are no problems.

Thanks for your time.

[2 Apr 2007 15:05] Daniel Fischer
Setting this bug to "won't fix" since this has to be addressed in glibc, not in MySQL.