MySQL Bugs: #11590: mysql.server doesn't react immediately to mysqld

Bug #11590	mysql.server doesn't react immediately to mysqld_safe failed start
Submitted:	27 Jun 2005 12:49	Modified:	22 Sep 2005 22:41
Reporter:	Victoria Reznichenko
Status:	Won't fix
Category:	MySQL Server	Severity:	S3 (Non-critical)
Version:	4.1	OS:	Linux (linux)
Assigned to:	Timothy Smith	CPU Architecture:	Any

Description:
If you start the MySQL server with the mysql.server script and mysqld_safe fails to start it, an error message is written to the error log immediately, but the mysql.server script keeps printing dots and only then reports the failure:

Starting MySQL................................... failed

How to repeat:
1. Put any dummy (invalid) option in the my.cnf file.
2. Start the MySQL server with the mysql.server script.

That's because the only way mysql.server knows that startup has failed is by timing out while waiting for the pid file to be created.
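
The timeout behavior described here can be sketched roughly as follows. This is an illustration of the pattern only, not the actual mysql.server code; the function name `wait_for_pid_file` and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not taken from mysql.server): wait up to
# TIMEOUT seconds for PID_FILE to appear, printing one dot per second.
# Usage: wait_for_pid_file PID_FILE TIMEOUT
# Returns 0 if the file appeared, 1 if the timeout expired.
wait_for_pid_file() {
    pid_file=$1
    timeout=$2
    i=0
    while [ "$i" -lt "$timeout" ]; do
        # A non-empty pid file is taken as the sign of a successful start
        if [ -s "$pid_file" ]; then
            return 0
        fi
        printf '.'
        sleep 1
        i=$((i + 1))
    done
    # No pid file within the timeout: the only failure signal available
    return 1
}
```

A loop like this cannot tell a slow start from an immediate failure, which is why the dots keep appearing until the timeout expires.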

Jim, let me explain.

The problem isn't only in the mysql.server script. There is no way to know whether mysqld started successfully or failed, because mysqld_safe always exits with status 0. So this should be fixed in mysqld_safe too; then the init script can see the error and exit.

I don't understand -- mysql.server says it failed. What else should it do? It can't notice immediately when mysqld_safe exits. It doesn't matter what mysqld_safe returns, because mysql.server can't check its return value.

Hi.  I'm setting this to "Won't fix". I agree that it's less-than-desirable behavior, but I can't find a way around it in a shell script.

The crux of the problem is that mysqld_safe must be run as a background job, like:

mysqld_safe --options &

How does the calling script (mysql.server) find out if mysqld_safe returned an error or not?  It can use wait:

mysqld_safe --options &
the_pid=$!

...

wait $the_pid
mysqld_safe_error=$?

test $mysqld_safe_error -eq 0 || echo "error..."

But wait blocks until the whole job completes, and on a successful start mysqld_safe keeps running for as long as the server is up - that's no good.

I tried several ways of getting around this, but was unable to find a combination that works properly.  Here is my current test program, in case it sparks any ideas.  It seems that there should be some combination of wait, sleep, trap and kill which can get this job done, but I am unable to find it.

#! /bin/sh

mode=$1; test $# -gt 0 && shift

pid_file=PID
test -f $pid_file && rm $pid_file

case $mode in
run-break-short )
    echo "This program is broken!"
    exit 1
    ;;

run-break-long )
    echo "This program will be broken!"
    sleep 10
    exit 1
    ;;

run-ok-short )
    echo "This program is good!"
    ps uwwp $$ > $pid_file
    exit 0
    ;;

run-ok-long )
    echo "This program will be good!"
    sleep 10
    ps uwwp $$ > $pid_file
    exit 0
    ;;

'' )
    my_pid=$$
    # printf is used because echo's handling of "\n" differs between shells
    printf 'Top PID is %s\n\n' "$$"

    wait_timed_out=0

    (
        #
        # Here is where the main program is run
        # Think of this as running mysqld...
        #
        sh $0 run-ok-long &
        main_job_pid=$!

        echo "Shell is for $main_job_pid ..."
        wait $main_job_pid
        main_error_code=$?

        echo "Main job is done, with error code $main_error_code"

        exit $main_error_code
    ) &
    wait_job_pid=$!

    # Idea for "sleep pipeline" taken from:
    # http://www.cit.gu.edu.au/~anthony/info/shell/script.hints
#   sleep 2 | (
#       kill -ALRM $wait_job_pid
#       echo "sent signal to pid $wait_job_pid"
#   ) &
#   kill_job_pid=$!

    echo "Waiting for wait job $wait_job_pid"
    wait $wait_job_pid > /dev/null 2>&1
    sub_error_code=$?

    # main_job_pid is set only inside the subshell, so it is empty here;
    # report the subshell's PID instead
    if [ $wait_timed_out -eq 0 ]; then
        echo "I waited for the main job, via $wait_job_pid," \
                "and it returned $sub_error_code"
    else
        echo "I stopped waiting for the main job, via $wait_job_pid" \
                "(I guess $sub_error_code is useless)"
    fi
    ;;

* )
    echo "Usage: $0 <see source>"
    exit 1
    ;;
esac
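
One combination that might be worth trying is to poll the background job with `kill -0` and call `wait` only once the job has already gone, so that `wait` returns its status without blocking. The sketch below is an untested idea as far as mysql.server and mysqld_safe are concerned; the names `start_and_watch` and `job_status` are hypothetical, and it assumes that a job exiting before the pid file appears always means a failed start.

```shell
#!/bin/sh
# Hypothetical helper, not part of mysql.server.
# Usage: start_and_watch PID_FILE TIMEOUT COMMAND [ARGS...]
# Returns 0 if PID_FILE appeared, 1 if COMMAND exited first (its exit
# status is left in $job_status), 2 if TIMEOUT seconds passed.
start_and_watch() {
    pid_file=$1
    timeout=$2
    shift 2

    "$@" &
    job_pid=$!
    job_status=

    i=0
    while [ "$i" -lt "$timeout" ]; do
        if [ -s "$pid_file" ]; then
            return 0
        fi
        # kill -0 sends no signal; it only tests whether the PID exists
        if ! kill -0 "$job_pid" 2>/dev/null; then
            # The job has already exited, so wait returns at once
            wait "$job_pid"
            job_status=$?
            # The job may have written the pid file just before exiting
            if [ -s "$pid_file" ]; then
                return 0
            fi
            return 1
        fi
        sleep 1
        i=$((i + 1))
    done
    return 2
}
```

Two caveats apply. The check can lag by one polling interval, and a reused PID could in principle fool `kill -0`. Also, since mysqld_safe is reported above to exit with status 0 even on failure, `$job_status` alone would not distinguish success from failure; the sketch treats the early exit itself as the failure signal.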

Regards,

Timothy