Bug #84321 mysqlrpladmin switchover fails after check of unrelated replication user
Submitted: 22 Dec 2016 20:53 Modified: 4 Jan 2017 5:56
Reporter: monty solomon
Status: Verified
Category: MySQL Utilities Severity: S1 (Critical)
Version: 1.56, 1.64 OS: Any
Assigned to: CPU Architecture: Any

[22 Dec 2016 20:53] monty solomon
Description:
mysqlrpladmin fails to complete a switchover with an incorrect error about a missing replication user.

#   Replication user exists ... FAIL
Candidate slave is missing replication user.
# Errors found. Switchover aborted.

_check_candidate_eligibility(), called from _check_switchover_prerequisites(), assumes that the last user returned is the replication user instead of explicitly checking the user specified on the command line.

get_rpl_users() in replication.py executes the following statement:

SELECT user, host, authentication_string = '' as has_password FROM mysql.user WHERE repl_slave_priv = 'Y';

The code then examines only the last row returned, even though the user in that row may be unrelated to the user given in the command.

            res = s_candidate.get_rpl_users()
            l = len(res)
            user, host, _ = res[l - 1]
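
To illustrate the effect, here is a minimal sketch of that selection logic run against made-up data. The rows and the pick_last() helper are hypothetical and only mimic the shape (user, host, has_password) of the query result; this is not the actual mysqlrpladmin code.

```python
# Hypothetical result rows in the shape (user, host, has_password).
# SUSR_Repl is the user passed via --rpl-user; zzzzz is an unrelated account
# that happens to be returned last.
rows = [
    ("SUSR_Repl", "%", 0),
    ("zzzzz", "localhost", 0),
]


def pick_last(rows):
    """Mimic the snippet above: take whichever row happens to come last."""
    user, host, _ = rows[len(rows) - 1]
    return user, host


print(pick_last(rows))  # ('zzzzz', 'localhost')
```

The unrelated zzzzz@localhost account is selected, and the user requested on the command line is never examined.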

user_host_exists() then checks that user/host pair and fails if the account's host does not match the host name of the slave.

        res = self.exec_query("SELECT host FROM mysql.user WHERE user = '%s' "
                              "AND '%s' LIKE host " % (user, host_or_ip))

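The host comparison can be reproduced outside MySQL. The sketch below uses an in-memory SQLite table as a stand-in for mysql.user, on the assumption that LIKE treats the % wildcard the same way for this purpose. The table name mysql_user, the helper user_host_matches(), and the host name slave1.example.com are all made up for illustration.

```python
import sqlite3


def user_host_matches(conn, user, host_or_ip):
    """Return True if some account row for `user` has a host pattern
    that matches host_or_ip (same shape as the query above)."""
    cur = conn.execute(
        "SELECT host FROM mysql_user WHERE user = ? AND ? LIKE host",
        (user, host_or_ip),
    )
    return cur.fetchone() is not None


# In-memory stand-in for mysql.user (assumption: SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mysql_user (user TEXT, host TEXT)")
conn.executemany(
    "INSERT INTO mysql_user VALUES (?, ?)",
    [("SUSR_Repl", "%"), ("zzzzz", "localhost")],
)

# The unrelated account is bound to localhost, so the slave's name never matches.
print(user_host_matches(conn, "zzzzz", "slave1.example.com"))      # False
# The requested replication user has a wildcard host and would have passed.
print(user_host_matches(conn, "SUSR_Repl", "slave1.example.com"))  # True
```

This matches the reported behaviour: the check fails because it is run for the wrong account, not because the replication user is missing.
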
How to repeat:
1. On the master, add a local user with REPLICATION SLAVE privileges that will be last in the table

GRANT REPLICATION SLAVE on *.* TO zzzzz@localhost IDENTIFIED BY 'password';

2. Verify that the new entry is the last one returned

SELECT user, host, authentication_string = '' as has_password FROM mysql.user WHERE repl_slave_priv = 'Y';

3. Execute the switchover command

mysqlrpladmin --verbose --demote-master --discover-slaves-login=SUSR_rpladmin:'password' --new-master=.myrpladmin.cnf[silky-skunk] --master=.myrpladmin.cnf[green-darkness] --rpl-user=SUSR_Repl:'password' --log=mysqlrpladmin.log switchover

4. Observe failure

#   Replication user exists ... FAIL
Candidate slave is missing replication user.
# Errors found. Switchover aborted.

Suggested fix:
Fix _check_candidate_eligibility() to check the specified replication user instead of some arbitrary user that exists and has REPLICATION SLAVE privileges.
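
A rough sketch of what such a check might look like, not a patch against the actual source: the helper names rpl_user_name() and find_rpl_user() are hypothetical, and the rows are made-up data in the shape (user, host, has_password) returned by get_rpl_users().

```python
def rpl_user_name(rpl_user_option):
    """Extract the user part from a 'user:password' value such as --rpl-user."""
    return rpl_user_option.split(":", 1)[0]


def find_rpl_user(rpl_users, wanted_user):
    """Return the (user, host) rows for wanted_user, ignoring other accounts."""
    return [(user, host) for user, host, _ in rpl_users if user == wanted_user]


# Hypothetical rows; zzzzz is the unrelated account from "How to repeat".
rows = [("SUSR_Repl", "%", 0), ("zzzzz", "localhost", 0)]

wanted = rpl_user_name("SUSR_Repl:password")
print(find_rpl_user(rows, wanted))  # [('SUSR_Repl', '%')]
```

Only the rows for the requested user would then be passed on to the host check, so an unrelated account with REPLICATION SLAVE privileges could no longer cause the failure.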

[4 Jan 2017 5:56] monty solomon
The code is calling the wrong get_rpl_users() method.

There are two methods with that name: one in the replication class and one in the topology class.