MySQL Bugs: #79613: /usr/bin/mysql-systemd-start does not return control to command line

Bug #79613	/usr/bin/mysql-systemd-start does not return control to command line
Submitted:	12 Dec 2015 5:04	Modified:	17 Dec 2015 14:07
Reporter:	7 reeds	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Packaging	Severity:	S2 (Serious)
Version:	5.6.28-2.el7	OS:	CentOS (Centos 7)
Assigned to:	Balasubramanian Kandasamy	CPU Architecture:	Any

Description:
When using systemd to start mysql I find that the DB server is running but the start and restart scripts hang.  My investigation shows that systemd calls: 

    /etc/systemd/system/mysql.service

which seems to set a timeout (that is not firing) and calls the script:

    /usr/bin/mysql-systemd-start

Which calls mysqladmin to "ping" the server.

On my Centos 7 machine the "ping" line looks like:

    mysqladmin --no-defaults --socket="$adminsocket" --user=UNKNOWN_MYSQL_US
ER ping >/dev/null 2>&1 && break

Being a good citizen I removed all bogus users from MySQL.  There is no user named "UNKNOWN_MYSQL_USER" so mysqladmin fails.  This mysqladmin call is in an infinite loop in its script and the "timer" in the first script never seems to fire.

How to repeat:
Start or restart MySQL on a Centos 7 vm.

Suggested fix:
Good question.  I am playing with removing the "--user" arg and putting a "[mysqladmin]" group in the my.cnf file.  That is not working at this instant.

Hi,

thanks for your report!

Your findings are correct. The mysqld service pings MySQL server until a 
answer is given. The service is not marked ready before the ping signal 
success.

This way other services which depend on MySQL service wait to start
to MySQL is ready for connections.

mysqld process can be running and still refuse connections, this happens, for example, when crash recovery is needed.

You can verify in error log: /var/log/mysqld.log that everything is ok or not.
Please have a look in that file.

This ping trick is verified to work in most cases, can you provide contents of config file /etc/my.cnf and describe other special things you have done?

Note:
In MySQL 5.7 we have replaced this hack with more native systemd support and the
ping loop has been removed there.

Hi,

The my.cnf file in question looks like:

[client]
password='a password'
user=root

[mysqladmin]
password='a password'
user=root

Again, MySQL stops and "starts" just fine but the official commands to do the start action do not return control to the command line.  This impacts "yum" when it applies updates, for example.  Yum will download and install updates but during the "cleanup" phase it obviously tries to restart the DB service.  Because the script mentioned does not return control to the command line yum hangs.

AH! I just figured something out.  The mysqladmin line in the script is:

    mysqladmin --no-defaults --socket="$adminsocket" --user=UNKNOWN_MYSQL_USER ping

Using "--no-defaults" prevents mysqladmin from picking up the my.cnf file(s) so it is "defaulting" to user "root" and assuming no password -- or at least not finding a password -- and our root user has a password.

I think I will remove the "--no-defaults" and "--user" args and see what happens

... and another thing...

in the script /usr/bin/mysql-systemd-start on the "ping" line there is the variable: $adminsocket

On my centos 7 machine I can put an echo of that variable on the line above the ping and the content of that variable is:

...
echo "adminsocket=$adminsocket"
...

adminsocket=/var/lib/mysql/mysql.sock /var/lib/mysql/mysql.sock

Somehow the socket path is being doubled up in the get_options function at the top of the same script.  That might have been the issue all along.  Mysqladmin complains that there is no such argument as /var/lib/mysql/mysql.sock

... sorry, I'm being terribly chatty... 

In my version of the /usr/bin/mysql-systemd-start in the get_option function there is the line:

    ret=$(/usr/bin/my_print_defaults $section | grep '^--'${option}'=' | cut -d= -f2-)

On my machine calling get_option with the "option" of 'datadir' returns two values that are identical, a la:

$ /usr/bin/my_print_defaults mysqld datadir "/var/lib/mysql"
--datadir=/var/lib/mysql
--socket=/var/lib/mysql/mysql.sock
--symbolic-links=0
--sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
--datadir=/var/lib/mysql
--socket=/var/lib/mysql/mysql.sock
--symbolic-links=0
--sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
--explicit_defaults_for_timestamp=OFF
--port=3306

I propose a change to the "ret=" line above that ads "sort | uniq | " just before the grep.  Making get_option look like:

get_option () {
    local section=$1
    local option=$2
    local default=$3
    ret=$(/usr/bin/my_print_defaults $section | sort | uniq | grep '^--'${option}'=' | cut -d= -f2-)
    [ -z "$ret" ] && ret=$default
    echo $ret
}

And setting everything else back to the way I found it seems to work for me.

Hi again,

thanks for your nice analysis!

We had a similar issue in SysV initscript in bug #77878.

I think a similar change will be used to fix this bug too.

Thanks for the bug report.

If the configuration files contains multiple datadir lines, use the last
datadir entry in the RPM installation scripts.

Noted in 5.5.48, 5.6.29, 5.7.11, 5.8.0 changelogs.

Using systemd to start mysqld failed if configuration files contained
multiple datadir lines. Now the last datadir line is used.