MySQL Bugs: #91562: using --daemon-stop to stop mysqlfailover cause slave sql

Bug #91562	using --daemon-stop to stop mysqlfailover cause slave sql_thread down
Submitted:	6 Jul 2018 7:51	Modified:	9 Jul 2018 10:28
Reporter:	Bingwei Wang	Email Updates:
Status:	Duplicate	Impact on me:	None
Category:	MySQL Utilities	Severity:	S1 (Critical)
Version:	1.6.5	OS:	CentOS (7)
Assigned to:		CPU Architecture:	Any
Tags:	mysqlfailover

Description:
start mysqlfailover using daemon.

mysqlfailover --master=fo --discover-slaves-login=fo --log=/tmp/fo.log  --rpl-user=repl:repl --force --exec-after=/root/wbw/fo_after.sh  --daemon=start

After a while stop it using:
mysqlfailover --master=fo --discover-slaves-login=fo --log=/tmp/fo.log  --rpl-user=repl:repl --force --exec-after=/root/wbw/fo_after.sh  --daemon=stop

the sql_thread of the slave shows:

...
          Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1146
                   Last_Error: Error executing row event: 'Table 'mysql.failover_console' doesn't exist'
                 Skip_Counter: 0
    ...
    ...
               Last_SQL_Errno: 1146
               Last_SQL_Error: Error executing row event: 'Table 'mysql.failover_console' doesn't exist'
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 144

I checked the relay log and find a delete operation :
BINLOG '
7eg+WxOQAAAAQQAAAIgBAAAAAGwAAAAAAAEABW15c3FsABBmYWlsb3Zlcl9jb25zb2xlAAL+/gTe
/f4eA8UEJho=
7eg+WyCQAAAAOgAAAMIBAAAAAGwAAAAAAAEAAgAC//wPADE5Mi4xNjguMjE2LjE0NAQzMzA2EyTN
fg==
'/*!*/;
### DELETE FROM `mysql`.`failover_console`
### WHERE
###   @1='192.168.216.144' /* STRING(765) meta=57085 nullable=1 is_null=0 */
###   @2='3306' /* STRING(30) meta=65054 nullable=1 is_null=0 */
# at 617
#180706 11:58:37 server id 144  end_log_pos 481 CRC32 0xc4c3a0bf 	Xid = 722
COMMIT/*!*/;

However failover_console does no longer existed on the slave!

The mysqlfailover log shows:

2018-07-06 15:11:03 PM INFO Server '192.168.216.145:3306' is using MySQL version 5.7.22-log.
2018-07-06 15:11:03 PM INFO Discovering slave at 192.168.216.147:3306
2018-07-06 15:11:03 PM INFO Found slave: 192.168.216.147:3306
2018-07-06 15:11:03 PM INFO Server '192.168.216.147:3306' is using MySQL version 5.7.22-log.
2018-07-06 15:11:03 PM INFO Unregistering instance on master.
2018-07-06 15:11:03 PM INFO Master entry in the failover_console table was deleted.

I also checked the source code in failover_daemon.py:
def cleanup(self):
         """Controlled cleanup for the daemon.
 
         It will unregister the failover_console table.
         """
         # if master is not alive, try connecting to it
         if not self.master.is_alive():
             self.master.connect()
         try:
             self.master.exec_query(_DELETE_FC_TABLE.format(self.master.host,
                                                            self.master.port))
             self._report("Master entry in the failover_console"
                          " table was deleted.", logging.INFO, False)
         except:
             pass

The cleanup function didn't turn off binary log before executing deleting operation.
I think that is the problem.

How to repeat:
Do as description.

Suggested fix:
Turn off binary log before executing the deleting statement in cleanup function. 
Actually, in the unregister_slaves function, the FC table is dropped on slaves.cleaning up master should also use 'drop if exist' statement instead of deleting.

Hello Bingwei Wang,

Thank you for the report.
This is most likely duplicate of Bug #85254, please see Bug #85254

Thanks,
Umesh