Bug #76905 mysqlfailover stop monitoring master/slave host for any failure
Submitted: 30 Apr 2015 15:46 Modified: 23 Jun 2015 15:20
Reporter: Ravi Subramaniyan
Status: Can't repeat
Category: MySQL Utilities Severity: S1 (Critical)
Version: Utilities mysqlfailover version 1.5.4 OS: Linux (mysqlfailoverconsole:/home/cloud # uname -a Linux mysqlfailoverconsole 2.6.32-431.5.1.el6.x86_64 #1 )
Assigned to: CPU Architecture: Any
Tags: mysqlfailover

[30 Apr 2015 15:46] Ravi Subramaniyan
Description:
I start the mysqlfailover utility as a daemon. It monitors the hosts for some time and then stops working. However, the Linux process is still active and the PID file is still present.

Daemon started on Apr 29 at 21:41:
=====================================
mysqlfailover --verbose --master=root@10.52.201.11 --discover-slaves-login=root --failover-mode=auto  --candidates=root@10.52.201.101,root@10.206.75.34 --daemon=start --log=mysqlfailover.log --pidfile=failover_daemon.pid

Stopped reporting master/slave health status on Apr 29 at 22:08:
============================================================
mysqlfailoverconsole:/home/cloud # tail -f mysqlfailover.log
2015-04-29 22:07:56 PM INFO host: alphadbmysql3.amers1.cis.trcloud, port: 3306, role: SLAVE, state: UP, gtid_mode: ON, health: OK, version: 5.6.23-log, master_log_file: mysql-bin.000001, master_log_pos: 3429706, IO_Thread: Yes, SQL_Thread: Yes, Secs_Behind: 0, Remaining_Delay: No, IO_Error_Num: 0, IO_Error: , SQL_Error_Num: 0, SQL_Error: , Trans_Behind: 0
2015-04-29 22:07:56 PM INFO host: alphadbmysql4.apac1.cis.trcloud, port: 3306, role: SLAVE, state: UP, gtid_mode: ON, health: OK, version: 5.6.23-log, master_log_file: mysql-bin.000001, master_log_pos: 3429706, IO_Thread: Yes, SQL_Thread: Yes, Secs_Behind: 0, Remaining_Delay: No, IO_Error_Num: 0, IO_Error: , SQL_Error_Num: 0, SQL_Error: , Trans_Behind: 0
2015-04-29 22:08:16 PM INFO Discovering slaves for master at 10.52.201.101:3306
2015-04-29 22:08:22 PM INFO Discovering slave at alphadbmysql2.emea1.cis.trcloud:3306
2015-04-29 22:08:22 PM INFO Discovering slave at alphadbmysql3.amers1.cis.trcloud:3306
2015-04-29 22:08:22 PM INFO Discovering slave at alphadbmysql4.apac1.cis.trcloud:3306
2015-04-29 22:08:26 PM INFO Master Information
2015-04-29 22:08:26 PM INFO Binary Log File: mysql-bin.000001, Position: 3429706, Binlog_Do_DB: N/A, Binlog_Ignore_DB: N/A
2015-04-29 22:08:26 PM INFO GTID Executed Set: 3424e4fa-dfc4-11e4-bb55-fa163e09f230:1[...]
2015-04-29 22:08:27 PM INFO Getting health for master: 10.52.201.101:3306.
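
The excerpt above ends at 22:08:27 with no further entries. As a rough sketch (not part of mysqlfailover), the time of the last entry could be pulled from the log as follows. The function name `last_log_time` is hypothetical, and the parser assumes every entry starts with a `YYYY-MM-DD HH:MM:SS` timestamp as in the excerpt; the trailing AM/PM marker is ignored because the clock is already 24-hour.

```python
from datetime import datetime

# Leading timestamp layout as seen in the log excerpt, e.g. "2015-04-29 22:08:27".
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def last_log_time(lines):
    """Return the timestamp of the last parsable log line, or None.

    Hypothetical helper for illustration only.
    """
    for line in reversed(lines):
        parts = line.split(" ", 2)  # date, time, rest of the line
        if len(parts) < 2:
            continue
        try:
            return datetime.strptime(parts[0] + " " + parts[1], LOG_TIME_FORMAT)
        except ValueError:
            continue  # not a timestamped entry, keep looking further up
    return None
```

Comparing the returned value with the current time gives the length of the silence, which should stay close to the monitoring interval while the daemon is healthy.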

The Linux process is still up and running:
===============================================

mysqlfailoverconsole:/home/cloud # ps -ef | grep fail
root     12241     1  0 Apr29 ?        00:00:05 /usr/bin/python /usr/bin/mysqlfailover --master=root@10.52.201.101 --discover-slaves-login=root auto --verbose --candidates=root@10.52.201.11,root@10.206.75.34 --force --log=mysqlfailover.log --daemon=start
root     20070 19979  0 15:36 pts/1    00:00:00 grep fail

The PID file is also still there:
===============================================
mysqlfailoverconsole:/home/cloud # cat failover_daemon.pid
12241

When I bring down the master, mysqlfailover does not detect the failure, and there is no further activity from the utility.
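
The state described here (process alive, PID file present, log silent) could be detected from outside with a small watchdog. This is a minimal sketch, not a feature of mysqlfailover: the function names, the `now` parameter and the idle threshold are all assumptions for illustration, and the PID check relies on POSIX `os.kill` semantics.

```python
import os
import time


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def daemon_looks_stalled(pid_path, log_path, max_idle_seconds, now=None):
    """Return True if the PID in pid_path is alive but log_path has not
    been modified for more than max_idle_seconds.

    Hypothetical helper; the threshold is chosen by the caller.
    """
    with open(pid_path) as handle:
        pid = int(handle.read().strip())
    if not pid_is_running(pid):
        return False  # process is gone: a different failure than a stall
    if now is None:
        now = time.time()
    idle = now - os.path.getmtime(log_path)
    return idle > max_idle_seconds
```

With the files from this report, a call such as `daemon_looks_stalled("failover_daemon.pid", "mysqlfailover.log", 300)` would be expected to return True once the log has been silent for five minutes while PID 12241 is still running.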

Master and slave are running with mysql  Ver 14.14 Distrib 5.6.23, for Linux (x86_64) using  EditLine wrapper

How to repeat:
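Same as above: start mysqlfailover as a daemon with the command shown and leave it running; monitoring stops after some time.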
[13 May 2015 14:46] Ravi Subramaniyan
Hi guys,
Any help on this issue? Thanks.

Regards,
Ravi