Bug #69857	mysqlrpladmin says "health: Slave is not connected to master."
Submitted:	27 Jul 2013 20:26	Modified:	28 Jul 2014 18:04
Reporter:	Shahriyar Rzayev	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Utilities	Severity:	S1 (Critical)
Version:	5.6.12, 5.6.17, 1.4.3	OS:	Linux (Centos 6.4, 6.5)
Assigned to:		CPU Architecture:	Any

Description:
The issue is simple.
I have a simple replication setup between my localhost and a host in VirtualBox.
Both servers run CentOS 6.4 and MySQL 5.6.12 with GTID enabled.
Output of mysqlrpladmin:
[root@localhost ~]# mysqlrpladmin --master=root:12345@localhost --slaves=remote:12345@192.168.1.4 --format=vertical health
# Checking privileges.
#
# Replication Topology Health:
*************************       1. row *************************
      host: localhost
      port: 3306
      role: MASTER
     state: UP
 gtid_mode: ON
    health: OK
*************************       2. row *************************
      host: 192.168.1.4
      port: 3306
      role: SLAVE
     state: WARN
 gtid_mode:  
    health: Slave is not connected to master.
2 rows.
# ...done.

Output of mysqlrplcheck:

[root@localhost ~]# mysqlrplcheck --master=root:12345@localhost:3306 --slave=remote:12345@192.168.1.4:3306 -vv
# master on localhost: ... connected.
# slave on 192.168.1.4: ... connected.
Test Description                                                     Status
---------------------------------------------------------------------------
Checking for binary logging on master                                [pass]
Are there binlog exceptions?                                         [pass]
Replication user exists?                                             [pass]
Checking server_id values                                            [pass]

 master id = 1
  slave id = 2

Checking server_uuid values                                          [pass]

 master uuid = 4fbde670-f4e3-11e2-ba65-2089846422ad
  slave uuid = 8917b828-f6fc-11e2-8815-080027d5492c

Is slave connected to master?                                        [WARN]
Check master information file                                        [pass]

#
# Master information file: 
#
               Master_Log_File : mysql-bin.000007
           Read_Master_Log_Pos : 191
                   Master_Host : 192.168.1.3
                   Master_User : repl
               Master_Password : 12345
                   Master_Port : 3306
                 Connect_Retry : 60
            Master_SSL_Allowed : 0
            Master_SSL_CA_File : 
            Master_SSL_CA_Path : 
               Master_SSL_Cert : 
             Master_SSL_Cipher : 
                Master_SSL_Key : 
 Master_SSL_Verify_Server_Cert : 0
                     Heartbeat : 1800
                          Bind : 
            Ignored_server_ids : 0
                          Uuid : 4fbde670-f4e3-11e2-ba65-2089846422ad
                   Retry_count : 86400
                       SSL_CRL : 
                  SSL_CRL_Path : 
         Enabled_auto_position : 1

Checking InnoDB compatibility                                        [pass]
Checking storage engines compatibility                               [pass]
Checking lower_case_table_names settings                             [pass]

  Master lower_case_table_names: 1
   Slave lower_case_table_names: 1

Checking slave delay (seconds behind master)                         [pass]
# ...done.

But replication is actually working properly and the slave is connected. Slave status:

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.3
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000007
          Read_Master_Log_Pos: 356
               Relay_Log_File: mysql-relay-bin.000008
                Relay_Log_Pos: 582
        Relay_Master_Log_File: mysql-bin.000007
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 356
              Relay_Log_Space: 1052
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: 4fbde670-f4e3-11e2-ba65-2089846422ad
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 4fbde670-f4e3-11e2-ba65-2089846422ad:4-5:9:16
            Executed_Gtid_Set: 4fbde670-f4e3-11e2-ba65-2089846422ad:1-3:6-8:10-16,
8917b828-f6fc-11e2-8815-080027d5492c:1-7
                Auto_Position: 1
1 row in set (0.00 sec)

There is also no problem applying logs on the slave; I tested it several times.
SHOW SLAVE HOSTS also lists this slave:

mysql> show slave hosts;
+-----------+------+------+-----------+--------------------------------------+
| Server_id | Host | Port | Master_id | Slave_UUID                           |
+-----------+------+------+-----------+--------------------------------------+
|         2 |      | 3306 |         1 | 8917b828-f6fc-11e2-8815-080027d5492c |
+-----------+------+------+-----------+--------------------------------------+
1 row in set (0.00 sec)

The slave is connected to the master and working without errors, but mysqlrpladmin says: Slave is not connected to master. Why?
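
The manual check made above, reading SHOW SLAVE STATUS and confirming that both replication threads are running, can be sketched in a few lines of Python. This is only an illustration of that reasoning, not how mysqlrpladmin decides its health status; `parse_vertical` and `replication_running` are hypothetical helper names, and the sample text is a shortened excerpt of the output above.

```python
def parse_vertical(text):
    """Parse vertical-format status output into a dict of field -> value."""
    fields = {}
    for line in text.splitlines():
        # Skip row separators such as "*** 1. row ***" and lines without a key.
        if line.lstrip().startswith("*") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def replication_running(fields):
    """True when both replication threads report Yes and no error is recorded."""
    return (
        fields.get("Slave_IO_Running") == "Yes"
        and fields.get("Slave_SQL_Running") == "Yes"
        and fields.get("Last_IO_Errno", "0") == "0"
        and fields.get("Last_SQL_Errno", "0") == "0"
    )


# Shortened excerpt of the SHOW SLAVE STATUS output from this report.
sample = """\
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                Last_IO_Errno: 0
               Last_SQL_Errno: 0
"""
print(replication_running(parse_vertical(sample)))
```

By this measure the slave in the report is healthy, which is why the WARN state from mysqlrpladmin is unexpected.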

How to repeat:
Nothing special. Take two fresh CentOS 6.4 installations with fresh MySQL 5.6.12 and set up GTID-based replication.

Suggested fix:
I have no idea.

What version of MySQL Utilities are you using?

Please provide output of the following:
 
mysqlrpladmin --version

Also, please describe how replication was set up. Did you use mysqlreplicate or do it by hand?

I use the latest version of the utilities, downloaded from the download page and installed via RPM:
mysql-utilities-1.3.3-1.el6.noarch.rpm

[root@localhost ~]# mysqlrpladmin --version
MySQL Utilities mysqlrpladmin version 1.3.3

I set up replication manually.
my.cnf from MASTER:

# BINARY LOGGING #

server_id                      = 1
log_bin                        = /var/lib/mysql/data/mysql-bin
log_bin_index                  = /var/lib/mysql/data/mysql-bin
expire_logs_days               = 14
sync_binlog                    = 1
binlog_format                  = row
gtid-mode                      = on
enforce-gtid-consistency       = true
master-info-repository         = TABLE
relay-log-info-repository      = TABLE
slave-parallel-workers         = 2
binlog-checksum                = CRC32
master-verify-checksum         = 1
slave-sql-verify-checksum      = 1
binlog-rows-query-log_events   = 1
log_slave_updates              = 1

my.cnf from SLAVE:

# BINARY LOGGING  and Replication for Slave setup with GTID#

server_id                      = 2
log_bin                        = /var/lib/mysql/data/mysql-bin
log_bin_index                  = /var/lib/mysql/data/mysql-bin
expire_logs_days               = 14
sync_binlog                    = 1
binlog_format                  = row
relay_log                      = /var/lib/mysql/data/mysql-relay-bin
log_slave_updates              = 1
read_only                      = 1
gtid-mode                      = on
enforce-gtid-consistency       = true
master-info-repository         = TABLE
relay-log-info-repository      = TABLE
slave-parallel-workers         = 2
binlog-checksum                = CRC32
master-verify-checksum         = 1
slave-sql-verify-checksum      = 1
binlog-rows-query-log_events   = 1
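
The two fragments above are long enough that differences are easy to miss. As a rough sketch (not part of MySQL Utilities), a few lines of Python can compare them. `parse_options` and `diff_options` are hypothetical helper names, and the sample strings are shortened excerpts of the files above. The sketch relies on the fact that MySQL treats dashes and underscores in option names as interchangeable.

```python
def parse_options(text):
    """Parse `name = value` lines; comments and blank lines are skipped."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        # Normalize dashes to underscores so gtid-mode and gtid_mode compare equal.
        options[name.strip().replace("-", "_")] = value.strip()
    return options


def diff_options(master, slave):
    """Return {name: (master_value, slave_value)} for options that differ."""
    names = sorted(set(master) | set(slave))
    return {
        n: (master.get(n), slave.get(n))
        for n in names
        if master.get(n) != slave.get(n)
    }


# Shortened excerpts of the master and slave fragments from this report.
master_cnf = "server_id = 1\ngtid-mode = on\nlog_slave_updates = 1\n"
slave_cnf = "server_id = 2\ngtid_mode = on\nlog_slave_updates = 1\nread_only = 1\n"
print(diff_options(parse_options(master_cnf), parse_options(slave_cnf)))
```

Applied to the full fragments, a comparison like this should flag only server_id, relay_log and read_only, so the configuration itself gives no obvious reason for the warning.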

Dear experts,
Is it possible to say something about this report?
Is it considered closed, or is more information needed?

Hi.
We have found a way to reproduce the problem. We discovered that
the error is displayed if the loopback+hostname entry is not present
in the /etc/hosts file. However, if you add the hostname of the
machine, both with and without the domain suffix, to the loopback
address (127.0.0.1) line in /etc/hosts, the error "Slave
is not connected to master" is no longer shown. For instance, our
hostname was centos1.localdomain, so we needed to add both centos1 and
centos1.localdomain to the 127.0.0.1 line in our /etc/hosts file.

Can you please test this on your system to see if it fixes the problem?
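
To make the suggested check concrete, here is a minimal Python sketch that reports which forms of a hostname are missing from the 127.0.0.1 line of /etc/hosts-style text. It only illustrates the workaround described above; `parse_hosts` and `missing_loopback_names` are hypothetical helper names, and the sample line is made up around the centos1.localdomain example.

```python
def parse_hosts(text):
    """Map each address in /etc/hosts-style text to the names on its lines."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        address, *names = line.split()
        table.setdefault(address, []).extend(names)
    return table


def missing_loopback_names(text, hostname):
    """Return the short and full hostname forms absent from the 127.0.0.1 line."""
    wanted = {hostname, hostname.split(".", 1)[0]}
    present = set(parse_hosts(text).get("127.0.0.1", []))
    return sorted(wanted - present)


# Hypothetical /etc/hosts content where only the short name was added.
hosts = "127.0.0.1   localhost localhost.localdomain centos1\n"
print(missing_loopback_names(hosts, "centos1.localdomain"))
```

An empty list would mean both centos1 and centos1.localdomain are already on the loopback line.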

Sure, I will set up replication with the latest MySQL 5.6.17 and test on it.
I will let you know the results.

Slave Server information:

[root@linuxsrv4 ~]# hostname
linuxsrv4
[root@linuxsrv4 ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=linuxsrv4
GATEWAY=192.168.1.1

I have added the hostname at the end of the line, as you suggested:

[root@linuxsrv4 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 linuxsrv4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6 linuxsrv4

But the result is the same:

[root@linuxsrv3 ~]# mysqlrpladmin --master=root:12345@localhost --slaves=remote:'$Slavepass45#'@192.168.1.88 --format=vertical health
# Checking privileges.
#
# Replication Topology Health:
*************************       1. row *************************
      host: localhost
      port: 3306
      role: MASTER
     state: UP
 gtid_mode: ON
    health: OK
*************************       2. row *************************
      host: 192.168.1.88
      port: 3306
      role: SLAVE
     state: WARN
 gtid_mode:  
    health: Slave is not connected to master.
2 rows.
# ...done.

Slave Status:

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.77
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000024
          Read_Master_Log_Pos: 333
               Relay_Log_File: mysql-relay-bin.000002
                Relay_Log_Pos: 456
        Relay_Master_Log_File: mysql-bin.000024
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 333
              Relay_Log_Space: 660
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: 1f29cdda-d5ac-11e3-b42e-0800274da480
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 1f29cdda-d5ac-11e3-b42e-0800274da480:9675
            Executed_Gtid_Set: 1f29cdda-d5ac-11e3-b42e-0800274da480:9675,
5a70670a-d99e-11e3-8dea-080027304b04:1-41
                Auto_Position: 0
1 row in set (0,00 sec)

Also tested with 1.4.2 RC, with the same result:

[root@linuxsrv3 ~]# mysqlrpladmin --master=root:12345@localhost --slaves=remote:'$Slavepass45#'@192.168.1.88 --format=vertical health
# Checking privileges.
#
# Replication Topology Health:
*************************       1. row *************************
      host: localhost
      port: 3306
      role: MASTER
     state: UP
 gtid_mode: ON
    health: OK
*************************       2. row *************************
      host: 192.168.1.88
      port: 3306
      role: SLAVE
     state: WARN
 gtid_mode:  
    health: Slave is not connected to master.
2 rows.
# ...done.

Hi Shahriyar, 
We would like to thank you for your help so far and, if you don't mind,
ask you to test one more solution. We think this issue is
DNS related. Can you please open a terminal on your master (where you
run the utility) and run the following command:

$ host linuxsrv4

Please check whether you see an error message stating that the linuxsrv4 host was
not found.

In order to fix this issue you must add a new entry to the /etc/hosts
file (on the master machine) mapping the IP and hostname of the master
(192.168.1.77 and linuxsrv4, according to your previous comments). At
the end, your file should look like this:

[root@linuxsrv4 ~]# cat /etc/hosts
127.0.0.1    localhost localhost.localdomain localhost4 localhost4.localdomain4 linuxsrv4
::1    localhost localhost.localdomain localhost6 localhost6.localdomain6 linuxsrv4
192.168.1.77    linuxsrv4

Afterwards, save the file, try running the utility again and
please let us know if it works.

Best Regards
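
One caveat when testing this: the `host` command queries DNS servers directly and does not read /etc/hosts, so it can keep reporting NXDOMAIN even after the file has been edited. A lookup through the system resolver does take /etc/hosts into account. The following Python sketch shows such a lookup; it is an illustration only and is not claimed to be the check mysqlrpladmin performs. `resolve` is a hypothetical helper name, and the `lookup` parameter exists only so the function can be exercised without a network.

```python
import socket


def resolve(name, lookup=socket.getaddrinfo):
    """Return the sorted addresses the resolver reports for name, or []."""
    try:
        results = lookup(name, None)
    except socket.gaierror:
        # The name could not be resolved by any configured source.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the address string.
    return sorted({entry[4][0] for entry in results})
```

Calling `resolve("linuxsrv4")` on the machine where the utility runs shows which addresses, if any, that name maps to there; an empty list means the name does not resolve on that machine.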

As you suggested.
On the slave server:

[root@linuxsrv4 ~]# host linuxsrv4
Host linuxsrv4 not found: 3(NXDOMAIN)

[root@linuxsrv4 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 linuxsrv4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6 linuxsrv4
192.168.1.88 linuxsrv4

On the master server:

[root@linuxsrv3 ~]# mysqlrpladmin --master=root:12345@localhost --slaves=remote:'$Slavepass45#'@192.168.1.88 --format=vertical health
# Checking privileges.
#
# Replication Topology Health:
*************************       1. row *************************
      host: localhost
      port: 3306
      role: MASTER
     state: UP
 gtid_mode: ON
    health: OK
*************************       2. row *************************
      host: 192.168.1.88
      port: 3306
      role: SLAVE
     state: WARN
 gtid_mode:  
    health: Slave is not connected to master.
2 rows.
# ...done.

In case it is useful, here is the output from nmap:

[root@linuxsrv4 ~]# nmap linuxsrv4

Starting Nmap 5.51 ( http://nmap.org ) at 2014-05-21 12:46 AZST
Nmap scan report for linuxsrv4 (127.0.0.1)
Host is up (0.0000040s latency).
Other addresses for linuxsrv4 (not scanned): 127.0.0.1 192.168.1.88
rDNS record for 127.0.0.1: localhost
Not shown: 996 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
111/tcp  open  rpcbind
631/tcp  open  ipp
3306/tcp open  mysql

Nmap done: 1 IP address (1 host up) scanned in 0.08 seconds

And from Telnet:

[root@linuxsrv4 ~]# telnet linuxsrv4 3306
Trying ::1...
Connected to linuxsrv4.

Also tested with the latest GA release of MySQL Utilities, 1.4.3:

[root@linuxsrv3 ~]# mysqlrpladmin --master=root:12345@localhost --slaves='remote2':'Pass@123#'@'192.168.1.88' health --format=vertical -vvv
# Checking privileges.
# Attempting to contact localhost ... Success
# Attempting to contact 192.168.1.88 ... Server is reachable
#
# Replication Topology Health:
*************************       1. row *************************
            host: localhost
            port: 3306
            role: MASTER
           state: UP
       gtid_mode: ON
          health: OK
         version: 5.6.17-log
 master_log_file: mysql-bin.000033
  master_log_pos: 191
       IO_Thread: 
      SQL_Thread: 
     Secs_Behind: 
 Remaining_Delay: 
    IO_Error_Num: 
        IO_Error: 
   SQL_Error_Num: 
       SQL_Error: 
    Trans_Behind: 
*************************       2. row *************************
            host: 192.168.1.88
            port: 3306
            role: SLAVE
           state: WARN
       gtid_mode:  
          health: Slave is not connected to master.
         version: 
 master_log_file: 
  master_log_pos: 
       IO_Thread: 
      SQL_Thread: 
     Secs_Behind: 
 Remaining_Delay: 
    IO_Error_Num: 
        IO_Error: 
   SQL_Error_Num: 
       SQL_Error: 
    Trans_Behind: 
2 rows.
# ...done.

I also tried the --discover-slaves-login option:

[root@linuxsrv3 ~]# mysqlrpladmin --master=root:12345@localhost --discover-slaves-login='remote2':'Pass@123#'  health -vvv --format=vertical
# Discovering slaves for master at localhost:3306

WARNING: There are slaves that had connection errors.
# Checking privileges.
# Attempting to contact localhost ... Success
#
# Replication Topology Health:
*************************       1. row *************************
            host: localhost
            port: 3306
            role: MASTER
           state: UP
       gtid_mode: ON
          health: OK
         version: 5.6.17-log
 master_log_file: mysql-bin.000033
  master_log_pos: 191
       IO_Thread: 
      SQL_Thread: 
     Secs_Behind: 
 Remaining_Delay: 
    IO_Error_Num: 
        IO_Error: 
   SQL_Error_Num: 
       SQL_Error: 
    Trans_Behind: 
1 row.
# ...done.

Hi Shahriyar, 
We've noticed that you are editing the /etc/hosts file on the slave machine;
however, we meant for you to edit the /etc/hosts file on the master server,
adding the master IP along with its hostname.
There is no need to edit the /etc/hosts file on the slave
as the utility is running on the master. Can you please try it and 
let us know if it works?

Thank you.

Added as you suggested on the master server:

[root@linuxsrv1 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 linuxsrv1
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6 linuxsrv1
192.168.1.77 linuxsrv1

Now it works:

[root@linuxsrv1 ~]# mysqlrpladmin --master=root:12345@localhost --slaves=remote:12345@192.168.1.88 --format=vertical health
# Checking privileges.
#
# Replication Topology Health:
*************************       1. row *************************
      host: localhost
      port: 3306
      role: MASTER
     state: UP
 gtid_mode: ON
    health: OK
*************************       2. row *************************
      host: 192.168.1.88
      port: 3306
      role: SLAVE
     state: UP
 gtid_mode: ON
    health: OK
2 rows.
# ...done.

This was fixed by a combination of fixes for other bugs. Cannot repeat with release-1.4.4.