MySQL Bugs: #89954: InnoDB cluster status inconsistent with group replication

Bug #89954	InnoDB cluster status inconsistent with group replication
Submitted:	8 Mar 2018 6:01	Modified:	13 Apr 2018 3:09
Reporter:	Tony Wen	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Group Replication	Severity:	S2 (Serious)
Version:	mysql-community-server-minimal-5.7.20-1.	OS:	Oracle Linux (3.10.0-693.11.6.el7.x86_64)
Assigned to:		CPU Architecture:	x86

Description:
Here I have a 3-node InnoDB cluster. After turn off the primary node, cluster didn't failover although group replication defected the unreachable node and removed it from group. After that, my cluster turned to out of service. Would you help to advise how this may happened? What should I do with this? Thanks in advance!

This is group members:
mysql-sql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST           | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| group_replication_applier | 2a18ea38-2289-11e8-b237-3e2563ac2dea | t-debug-mysql-1.mysql |        3306 | ONLINE       |
| group_replication_applier | d433ad31-2288-11e8-aed7-1e677f2e236d | t-debug-mysql-2.mysql |        3306 | ONLINE       |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
2 rows in set (0.00 sec)

This is cluster status:
mysql-js> dba.getCluster().status();
{
    "clusterName": "testcluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "t-debug-mysql-0.mysql:3306",
        "status": "OK_NO_TOLERANCE",
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active",
        "topology": {
            "t-debug-mysql-0.mysql:3306": {
                "address": "t-debug-mysql-0.mysql:3306",
                "mode": "R/W",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "t-debug-mysql-1.mysql:3306": {
                "address": "t-debug-mysql-1.mysql:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "(MISSING)"
            },
            "t-debug-mysql-2.mysql:3306": {
                "address": "t-debug-mysql-2.mysql:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            }
        }
    }
}

t-debug-mysql-0 is the primary node that I turned off. t-debug-mysql-1 is an active node but shown as "MISSING". 

This is part of mysql log shows t-debug-mysql-0 has been removed:
2018-03-08T05:15:13.003964Z 144 [Note] Start binlog_dump to master_thread_id(144) slave_server(100), pos(, 4)
2018-03-08T05:15:13.054319Z 0 [Note] Plugin group_replication reported: 'The member with address t-debug-mysql-0.mysql:3306 was declared online within the replication group'
2018-03-08T05:15:31.228126Z 158 [Note] Aborted connection 158 to db: 'unconnected' user: 'root' host: '192.168.71.11' (Got an error reading communication packets)
2018-03-08T05:16:13.004465Z 144 [Note] Aborted connection 144 to db: 'unconnected' user: 'root' host: '192.168.181.9' (failed on flush_net())
2018-03-08T05:17:47.412650Z 0 [Warning] Plugin group_replication reported: 'Member with address t-debug-mysql-0.mysql:3306 has become unreachable.'
2018-03-08T05:17:47.498743Z 0 [Note] Plugin group_replication reported: 'getstart group_id 2e0493c8'
2018-03-08T05:17:48.413407Z 0 [Warning] Plugin group_replication reported: 'Members removed from the group: t-debug-mysql-0.mysql:3306'
2018-03-08T05:17:48.413492Z 0 [Note] Plugin group_replication reported: 'Group membership changed to t-debug-mysql-1.mysql:3306, t-debug-mysql-2.mysql:3306 on view 15204833248407632:20.'

How to repeat:
Steps I did:
1. Create 3-member group replication on 3 nodes.
2. Create InnoDB cluster by adopting the existent group replication.
3. Turn off the primary node.

This is just got once.

Hi,
Thanks for your report. Few questions
 - Was you able to reproduce this second time?
 - If you was not able to reproduce (from what I read in the report), are you trying to reproduce and everything behaved properly or you didn't try to reproduce?
 - when you say "turn off master", how exactly you do this? Do you shutdown mysqld or you kill -9 mysqld, or you unplug the network cable, or you unplug the power cable, or this is a vm that you pause/stop/suspend or ? Can you please be very specific on how you "turn it off"
 - Do you have a full set of logs of the incident as doing what you stated in that order does not reproduce the issue on my setup?

kind regards
Bogdan

Hi, 
Thanks for your quick reply.  For the questions:
 - Was you able to reproduce this second time?
I didn't reproduced exactly the same failure. But, I did see another similar situation(by similar I mean failover failed and cluster status turned inconsistent with replication_group_member table). 

Cluster status shown from a slave node,t-baga1sh8sgc000ab24p0-mysql-2:
{
    "clusterName": "testcluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "t-baga1sh8sgc000ab24p0-mysql-1.mysql:3306",
        "status": "OK_NO_TOLERANCE",
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active",
        "topology": {
            "t-baga1sh8sgc000ab24p0-mysql-0.mysql:3306": {
                "address": "t-baga1sh8sgc000ab24p0-mysql-0.mysql:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "(MISSING)"
            },
            "t-baga1sh8sgc000ab24p0-mysql-1.mysql:3306": {
                "address": "t-baga1sh8sgc000ab24p0-mysql-1.mysql:3306",
                "mode": "R/W",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "t-baga1sh8sgc000ab24p0-mysql-2.mysql:3306": {
                "address": "t-baga1sh8sgc000ab24p0-mysql-2.mysql:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            }
        }
    }
}

And group member show on the slave node:
+---------------------------+--------------------------------------+--------------------------------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST                          | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------------------------------+-------------+--------------+
| group_replication_applier | 1cc2a6c7-2279-11e8-a5ed-3a5e57d2113e | t-baga1sh8sgc000ab24p0-mysql-0.mysql |        3306 | ONLINE       |
| group_replication_applier | 24d63ef0-2279-11e8-a713-a26a8ec5003c | t-baga1sh8sgc000ab24p0-mysql-2.mysql |        3306 | ONLINE       |
+---------------------------+--------------------------------------+--------------------------------------+-------------+--------------+

Group info shown on the primary node,baga1sh8sgc000ab24p0-mysql-1.
+---------------------------+--------------------------------------+--------------------------------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST                          | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------------------------------+-------------+--------------+
| group_replication_applier | 156673b3-2279-11e8-a6fb-3283baa53926 | t-baga1sh8sgc000ab24p0-mysql-1.mysql |        3306 | OFFLINE      |
+---------------------------+--------------------------------------+--------------------------------------+-------------+--------------+
//note: since it "think" it's not in a cluster, cluster info cannot given by it.

 - If you was not able to reproduce (from what I read in the report), are you trying to reproduce and everything behaved properly or you didn't try to reproduce?
I didn't recreate it exactly the same. 

 - when you say "turn off master", how exactly you do this? Do you shutdown mysqld or you kill -9 mysqld, or you unplug the network cable, or you unplug the power cable, or this is a vm that you pause/stop/suspend or ? Can you please be very specific on how you "turn it off"
All the three nodes are deployed as 3 pods in a 3-node kubernetes environment. I turned the master off by killing the master pod. For your question, I don't know how exactly the shut down process is performed when the pod is killed. Can you give some introduction about the sequences in different cases(shutdown mysqld, kill -9 mysqld, unplug cable, etc..)? What kind of log or message may help to identify what exactly happened? Thanks!

 - Do you have a full set of logs of the incident as doing what you stated in that order does not reproduce the issue on my setup?
I'll upload the mysql log files of the 3 nodes(t-debug-mysql-0,t-debug-mysql-1,t-debug-mysql-2).

mysql logs of the

Attachment: t-debug.failover.log.tar (application/x-tar, text), 330.00 KiB.

mysql logs of second similar failure

Attachment: t-baga1sh8sgc000ab24p0.log.tar (application/x-tar, text), 1.42 MiB.

A latest status update.  We can 100% recreate this issue now. 

In our kuerbernets(k8s) cluster environment, we found when we make mysql primary node unreachable by stopping k8s service on the node where mysql primary node is running, the issue can be recreated continuously. To my understanding, stopping of k8s service didn't trigger mysql cluster failover. I don't know why. Would you help to check on this?

This is the cluster info:
{
    "clusterName": "testcluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "tenant1-mysql-0.mysql:3306",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "tenant1-mysql-0.mysql:3306": {
                "address": "tenant1-mysql-0.mysql:3306",
                "mode": "R/W",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "tenant1-mysql-1.mysql:3306": {
                "address": "tenant1-mysql-1.mysql:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "tenant1-mysql-2.mysql:3306": {
                "address": "tenant1-mysql-2.mysql:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            }
        }
    }
}

mysql node info:
tenant1-mysql-0                  192.168.21.53   
tenant1-mysql-1                  192.168.181.22  
tenant1-mysql-2                  192.168.71.21   

Log of the 2 slaves, tenant1-mysql-1 and tenant1-mysql-2, will be uploaded. 

For tenant1-mysql-0, it may rejoin the cluster after restart. But, since we're trying to figure out why the failover didn't happen, I think it's better to keep it stopped for now. So, I cannot provide log of it yet.  If it's required, I can start it and upload the log later. Please let know.

log files of 2 mysql slave nodes

Attachment: tenant1.log.tar (application/x-tar, text), 58.00 KiB.

Hi,

I believe this is enough information for now. 
I need to check what mysqld sees when you kill the pod so I can try to recreate the problem locally without kubernetes system.

all best
Bogdan

Hi,

Is there any news about this issue?  Please let me know if there any other information you want me to provide. Thanks!

Tony

Hi,

Question about recovery. In current situation(cluster is stuck to failover), how can I manually force it to failover/switchover? Asking this because I'm thinking I may have to get the cluster back again (to provide DB service) if the origin primary node gone forever.  Will remove the missed primary node( like described in this doc https://dev.mysql.com/doc/refman/5.7/en/mysql-innodb-cluster-working-with-cluster.html#man...) work for me? 

Thanks!
Tony Wen

Hi,

I finally managed to reproduce the problem, so issue is verified now. We'll see what the GR team will have to say about this but I think that should work for you, yes.

all best
Bogdan

Hi, Bogdan

Thanks for the confirmation. Please let me know when you get GR team's reply.

About the recovery, removing missed node/instance does not work. It told me I cannot do write operation on the InnoDB cluster on a read only instance. In this case, I only have the 2 slave nodes which are read only.  I've tried to change it by dba.configureLocalInstance('root@localhost:3306',{clearReadOnly: true}). But it told me clearReadOnly is not valid. 

Beside the upper operation, I also tried mysqlrpladmin. No luck either. The command I used: mysqlrpladmin --slaves=root:passw0rd@tenant1-mysql-1.mysql:3306,root:passw0rd@tenant1-mysql-2.mysql:3306 failover.  The slaves are both the ones shown in cluster status(). Not sure why it always replied "No candidate found for failover.".

So far, I still don't know how to make the cluster recovery. Please help to advice. Thanks!

Posted by developer:
 
Hi, could you try with MySQL Shell 8.0.4 or later? It appears this issue has been fixed.

Hi, Alfredo,

You mentioned "MySQL Shell 8.0.4". Do you mean using the latest version of mysql shell can make the cluster recovery from current stuck state? Asking this because I don't think the shell can help to trigger mysql failover. Please correct me if there is any misunderstanding. Thanks!

Tony

I can say that with mysql shell 8.0.4 rc and mysql server 8.0.4 rc I can't reproduce this issue... now it was not easy to reproduce in the first place so can't say with 100% it is solved but imo it is.

bogdan

Hi

I've tried to install 8.0.4 in my docker file. But, mysql failed to be started. Would you please help to take a look? Thanks!

mysql error log:
2018-03-20T13:56:12.254220Z 0 [Warning] [MY-011068] The syntax 'expire-logs-days' is deprecated and will be removed in a future release. Please use binlog_expire_logs_seconds instead.
2018-03-20T13:56:12.256880Z 0 [System] [MY-010116] /usr/sbin/mysqld (mysqld 8.0.4-rc-log) starting as process 188 ...
2018-03-20T13:56:13.614692Z 0 [Warning] [MY-010015] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
2018-03-20T13:56:13.616758Z 0 [Warning] [MY-010069] Failed to set up SSL because of the following SSL library error: SSL_CTX_set_default_verify_paths failed
2018-03-20T13:56:13.619342Z 0 [Warning] [MY-010441] Failed to open optimizer cost constant tables
2018-03-20T13:56:13.619518Z 0 [ERROR] [MY-001146] Table 'mysql.component' doesn't exist
2018-03-20T13:56:13.619550Z 0 [Warning] [MY-003543] The mysql.component table is missing or has an incorrect definition.
2018-03-20T13:56:13.619647Z 0 [ERROR] [MY-011071] unknown variable 'partition=OFF'
2018-03-20T13:56:13.619658Z 0 [Warning] [MY-010952] The privilege system failed to initialize correctly. If you have upgraded your server, make sure you're executing mysql_upgrade to correct the issue.
2018-03-20T13:56:13.619779Z 0 [ERROR] [MY-010119] Aborting
2018-03-20T13:56:13.722496Z 2 [ERROR] [MY-011254] Plugin group_replication reported: 'Unable to start Group Replication. Replication applier infrastructure is not initialized since the server was started with --initialize or --initialize-insecure.'
2018-03-20T13:56:15.134050Z 0 [System] [MY-010910] /usr/sbin/mysqld: Shutdown complete.

DB data directory in container:
bash-4.2# ls -l /var/lib/mysql
total 145472
-rw-r----- 1 mysql mysql       56 Mar 20 07:56 auto.cnf
-rw-r----- 1 mysql mysql     2508 Mar 20 13:56 ib_buffer_pool
-rw-r----- 1 mysql mysql 33554432 Mar 20 13:56 ib_logfile0
-rw-r----- 1 mysql mysql 33554432 Mar 20 07:56 ib_logfile1
-rw-r----- 1 mysql mysql 33554432 Mar 20 07:56 ib_logfile2
-rw-r----- 1 mysql mysql 12582912 Mar 20 13:56 ibdata1
drwxr-x--- 2 mysql mysql        6 Mar 20 07:56 mysql
-rw-r----- 1 mysql mysql 14680064 Mar 20 13:56 mysql.ibd
-rw-r----- 1 mysql mysql     9327 Mar 20 13:56 mysqld.err
drwxr-x--- 2 mysql mysql     4096 Mar 20 07:56 performance_schema
-rw------- 1 mysql mysql     1676 Mar 20 07:56 private_key.pem
-rw-r--r-- 1 mysql mysql      452 Mar 20 07:56 public_key.pem
-rw-r----- 1 mysql mysql      170 Mar 20 07:56 t1-mysql-0-bin.000001
-rw-r----- 1 mysql mysql      170 Mar 20 07:57 t1-mysql-0-bin.000002
-rw-r----- 1 mysql mysql      170 Mar 20 13:43 t1-mysql-0-bin.000003
-rw-r----- 1 mysql mysql      170 Mar 20 13:45 t1-mysql-0-bin.000004
-rw-r----- 1 mysql mysql      170 Mar 20 13:49 t1-mysql-0-bin.000005
-rw-r----- 1 mysql mysql      170 Mar 20 13:56 t1-mysql-0-bin.000006
-rw-r----- 1 mysql mysql      144 Mar 20 13:56 t1-mysql-0-bin.index
-rw-r----- 1 mysql mysql 10485760 Mar 20 13:56 undo_001
-rw-r----- 1 mysql mysql 10485760 Mar 20 13:56 undo_002

my config will be upload in next comment. Thanks!

Tony

mysql config of my cluster instance

Attachment: my.cnf (application/octet-stream, text), 5.19 KiB.

Posted by developer:
 
Hi Tony,

1- Your error with MySQL and docker seem to be a MySQL installation issue. Please ensure you're installing MySQL correctly.

2- What do you mean with "stuck in recovery state"?

3- The original issue you reported was basically a display bug in the shell, where in some cases, the status of an instance that was taken down was swapped with the status of another instance in the cluster in output printed by the status() command. The correct and actual status of your cluster instances are the ones reported by your SELECT queries. 

4- What do you mean with "cluster didn't failover"? The cluster appears to have failed over correctly from the information you provided, despite the incorrect display from the shell. Also, mysqlrpladmin is not meant for InnoDB cluster, it is specific to traditional asynchronous replication.

Hi, Alfredo

Please see my answer inline. Thanks!

1- Your error with MySQL and docker seem to be a MySQL installation issue. Please ensure you're installing MySQL correctly.

I've just installed and started mysql-8.0.4-rc now.  But, when I try to connect with mysqlsh(version 8.0.4-rc), it always shows this error: 
Error during auto-completion cache update: The user specified as a definer ('mysql.infoschema'@'localhost') does not exist.  
Then, when I tried to create cluster with command:dba.createCluster("testcluster", {adoptFromGR: true}). It always return the upper error.

So, now I cannot create cluster with my group replication. Do you know why this happened? Anything I may set wrong? Please let me know if you want me provide any other information. 

2- What do you mean with "stuck in recovery state"?

Sorry, I didn't find "stuck in recovery state". If you mean "stuck to failover", by that I mean the failover did not happen as expected. In my understanding, if the failover happened, it would allow me to access DB even one node is unreachable. But, it's not. So, I think failover didn't happen.

3- The original issue you reported was basically a display bug in the shell, where in some cases, the status of an instance that was taken down was swapped with the status of another instance in the cluster in output printed by the status() command. The correct and actual status of your cluster instances are the ones reported by your SELECT queries. 

Sorry, I think this may not be my case. If it's only a display bug, the DB service should be accessable but it's not. 

4- What do you mean with "cluster didn't failover"? The cluster appears to have failed over correctly from the information you provided, despite the incorrect display from the shell. Also, mysqlrpladmin is not meant for InnoDB cluster, it is specific to traditional asynchronous replication.

Please see my reply of question 2.

Thanks!
Tony

Hi, Bogdan

I'm still struggling to setup cluster with 8.0.4. If it works, will related fix be backport to 5.7? 8.0 is for development. My manager concerns about using a development release. 

Thanks!
Tony

Hi, Alfredo and Bogdan

Would you please help to take a look at my questions? Thanks a lot!

Tony

Hi,  Alfredo and Bogdan

I’ve tried 8.0.4-rc. Killing kubelet service still cannot make cluster failover.  But, this time, use mysqlsh to get cluster status(by command dba.getCluster().getCluster() can give error now.  

FYI, this is the group replication members state. It still shown primary is online.
mysqlx: [Warning] Using a password on the command line interface can be insecure.
Creating a session to 'root@t1-mysql-0:3306'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 2230
Server version: 8.0.4-rc-log MySQL Community Server (GPL)
No default schema selected; type \use <schema> to set one.
+---------------------------+--------------------------------------+------------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST      | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+------------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 74c73a80-2f52-11e8-8ac1-0eda4fb6fa98 | t1-mysql-0.mysql |        3306 | ONLINE       | SECONDARY   | 8.0.4          |
| group_replication_applier | 8177d577-2f52-11e8-8d35-5e1b37405329 | t1-mysql-1.mysql |        3306 | ONLINE       | PRIMARY     | 8.0.4          |
| group_replication_applier | 9164c150-2f52-11e8-9120-5e3c05d5f239 | t1-mysql-2.mysql |        3306 | ONLINE       | SECONDARY   | 8.0.4          |
+---------------------------+--------------------------------------+------------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

This is error I got when I tried to get cluster status:
mysqlx: [Warning] Using a password on the command line interface can be insecure.
Creating a session to 'root@t1-mysql-0:3306'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 2275
Server version: 8.0.4-rc-log MySQL Community Server (GPL)
No default schema selected; type \use <schema> to set one.
Dba.getCluster: Unable to find a primary member in the cluster: Unknown MySQL server host 't1-mysql-1.mysql' (0) (RuntimeError)

Best Regards!
Tony

Hi,  Alfredo and Bogdan

Any news? Would you help to reply my questions? Thanks!

Tony

OK, I was able to reproduce the issue with Tony's help, using his test kubernetes setup.

The test scenario is as follows:

- 3 kubernetes pods (that use docker under the hood), running mysqld
- they can all reach each other by hostname
- there's a GR group with all members, all members are ONLINE
- everything works fine

Now, if one of the pods (ie. the docker container) is stopped, then mysqld will be taken down with it. It is not shutdown gracefully. The container will be completely gone and the hostname associated to it will no longer be reachable or even resolvable. Trying to connect to that member from one of the remaining containers gives:

    MySQL Error 2005: No such host is known 't1-mysql-0.mysql'

But querying the PFS.replication_group_members table from any of the remaining members shows all members still ONLINE, even after several minutes later. It looks like GR never notices the missing instance, even if it's not supposed to be getting heartbeats or anything from it. 

If mysqld is SHUTDOWN cleanly, then the issue doesn't occur.

Actually, after digging a little further, it appears that the instance is still running and reachable by IP, despite hostname being unresolvable. 

Tony, please confirm and close the bug if you agree it's not a GR or InnoDB cluster issue.

Yes, the primary node still can be accessed through its IP. So this seems not an issue of mysql then. 
I'm going to close this bug. Thanks a lot for the help!

Tony

Closed as not a bug.