Description:
When the mysqlfailover utility is started as a daemon, it monitors replication health for a while and then stops working. However, the Linux process is still running and the PID file is still present.
Daemon started on Apr 29 at 21:41:
=====================================
mysqlfailover --verbose --master=root@10.52.201.11 --discover-slaves-login=root --failover-mode=auto --candidates=root@10.52.201.101,root@10.206.75.34 --daemon=start --log=mysqlfailover.log --pidfile=failover_daemon.pid
Stopped reporting master/slave health status on Apr 29 at 22:08:
============================================================
mysqlfailoverconsole:/home/cloud # tail -f mysqlfailover.log
2015-04-29 22:07:56 PM INFO host: alphadbmysql3.amers1.cis.trcloud, port: 3306, role: SLAVE, state: UP, gtid_mode: ON, health: OK, version: 5.6.23-log, master_log_file: mysql-bin.000001, master_log_pos: 3429706, IO_Thread: Yes, SQL_Thread: Yes, Secs_Behind: 0, Remaining_Delay: No, IO_Error_Num: 0, IO_Error: , SQL_Error_Num: 0, SQL_Error: , Trans_Behind: 0
2015-04-29 22:07:56 PM INFO host: alphadbmysql4.apac1.cis.trcloud, port: 3306, role: SLAVE, state: UP, gtid_mode: ON, health: OK, version: 5.6.23-log, master_log_file: mysql-bin.000001, master_log_pos: 3429706, IO_Thread: Yes, SQL_Thread: Yes, Secs_Behind: 0, Remaining_Delay: No, IO_Error_Num: 0, IO_Error: , SQL_Error_Num: 0, SQL_Error: , Trans_Behind: 0
2015-04-29 22:08:16 PM INFO Discovering slaves for master at 10.52.201.101:3306
2015-04-29 22:08:22 PM INFO Discovering slave at alphadbmysql2.emea1.cis.trcloud:3306
2015-04-29 22:08:22 PM INFO Discovering slave at alphadbmysql3.amers1.cis.trcloud:3306
2015-04-29 22:08:22 PM INFO Discovering slave at alphadbmysql4.apac1.cis.trcloud:3306
2015-04-29 22:08:26 PM INFO Master Information
2015-04-29 22:08:26 PM INFO Binary Log File: mysql-bin.000001, Position: 3429706, Binlog_Do_DB: N/A, Binlog_Ignore_DB: N/A
2015-04-29 22:08:26 PM INFO GTID Executed Set: 3424e4fa-dfc4-11e4-bb55-fa163e09f230:1[...]
2015-04-29 22:08:27 PM INFO Getting health for master: 10.52.201.101:3306.
The Linux process is still up and running:
===============================================
mysqlfailoverconsole:/home/cloud # ps -ef | grep fail
root 12241 1 0 Apr29 ? 00:00:05 /usr/bin/python /usr/bin/mysqlfailover --master=root@10.52.201.101 --discover-slaves-login=root auto --verbose --candidates=root@10.52.201.11,root@10.206.75.34 --force --log=mysqlfailover.log --daemon=start
root 20070 19979 0 15:36 pts/1 00:00:00 grep fail
The PID file is also still there:
===============================================
mysqlfailoverconsole:/home/cloud # cat failover_daemon.pid
12241
When I bring down the master, mysqlfailover does not detect the failure, and there is no further activity from the utility.
The master and slaves are running mysql Ver 14.14 Distrib 5.6.23, for Linux (x86_64) using EditLine wrapper.
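The hung state described above (process alive, PID file present, log no longer growing) can be detected from outside the daemon. The following is a minimal sketch, not part of mysqlfailover, and it assumes Python 3 on a POSIX system. The function names and the 120-second threshold are hypothetical; the threshold is an arbitrary margin chosen because the log above shows entries roughly every 20 seconds.

```python
import os
import time


def pid_is_alive(pid):
    """Return True if a process with this PID exists.

    Signal 0 delivers nothing; the call only checks that the PID is valid (POSIX).
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user.
        return True
    return True


def daemon_looks_hung(pidfile, logfile, max_silence=120, now=None):
    """Return True when the PID in pidfile is alive but logfile has not
    been modified for more than max_silence seconds.

    max_silence is a hypothetical threshold, not a mysqlfailover setting.
    """
    with open(pidfile) as f:
        pid = int(f.read().strip())
    if not pid_is_alive(pid):
        # A dead daemon is a different failure than the one reported here.
        return False
    now = time.time() if now is None else now
    return (now - os.path.getmtime(logfile)) > max_silence


if __name__ == "__main__":
    # File names taken from the command line in this report.
    print(daemon_looks_hung("failover_daemon.pid", "mysqlfailover.log"))
```

Run from the directory that holds the PID file and the log, this would print True for the situation shown in this report, since PID 12241 is alive while the log stopped at 22:08:27.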
How to repeat:
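Same as above: start mysqlfailover as a daemon with the command shown and let it run. In this case it stopped reporting health status after about 27 minutes.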