Bug #112193 All nodes elected as PRIMARY
Submitted: 28 Aug 2023 3:07
Modified: 10 Oct 2023 11:02
Reporter: zetang zeng (OCA)
Status: Can't repeat
Category: MySQL Cluster: plugin
Severity: S3 (Non-critical)
Version: 5.7
OS: Any
Assigned to:
CPU Architecture: Any

[28 Aug 2023 3:07] zetang zeng
Description:
We deployed a three-node cluster. After some failures (including some network failures), all nodes became primary (master), which is not the behavior we expected.

192.168.0.15 | CHANGED | rc=0 >>
{
    "clusterName": "myCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.0.15:3406", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE_PARTIAL", 
        "statusText": "Cluster is NOT tolerant to any failures. 2 members are not active.", 
        "topology": {
            "192.168.0.124:3406": {
                "address": "192.168.0.124:3406", 
                "instanceErrors": [
                    "ERROR: split-brain! Instance is not part of the majority group, but has state ONLINE", 
                    "WARNING: Instance is NOT a PRIMARY but super_read_only option is OFF."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "ONLINE", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "5.7.39"
            }, 
            "192.168.0.15:3406": {
                "address": "192.168.0.15:3406", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "5.7.39"
            }, 
            "192.168.0.16:3406": {
                "address": "192.168.0.16:3406", 
                "instanceErrors": [
                    "ERROR: split-brain! Instance is not part of the majority group, but has state ONLINE", 
                    "WARNING: Instance is NOT a PRIMARY but super_read_only option is OFF."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "ONLINE", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "5.7.39"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "192.168.0.15:3406"
}
WARNING: Using a password on the command line interface can be insecure.
192.168.0.16 | CHANGED | rc=0 >>
{
    "clusterName": "myCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.0.16:3406", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE_PARTIAL", 
        "statusText": "Cluster is NOT tolerant to any failures. 2 members are not active.", 
        "topology": {
            "192.168.0.124:3406": {
                "address": "192.168.0.124:3406", 
                "instanceErrors": [
                    "ERROR: split-brain! Instance is not part of the majority group, but has state ONLINE", 
                    "WARNING: Instance is NOT a PRIMARY but super_read_only option is OFF."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "ONLINE", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "5.7.39"
            }, 
            "192.168.0.15:3406": {
                "address": "192.168.0.15:3406", 
                "instanceErrors": [
                    "ERROR: split-brain! Instance is not part of the majority group, but has state ONLINE", 
                    "WARNING: Instance is NOT a PRIMARY but super_read_only option is OFF."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "ONLINE", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "5.7.39"
            }, 
            "192.168.0.16:3406": {
                "address": "192.168.0.16:3406", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "5.7.39"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "192.168.0.16:3406"
}
WARNING: Using a password on the command line interface can be insecure.
192.168.0.124 | CHANGED | rc=0 >>
{
    "clusterName": "myCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.0.124:3406", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE_PARTIAL", 
        "statusText": "Cluster is NOT tolerant to any failures. 2 members are not active.", 
        "topology": {
            "192.168.0.124:3406": {
                "address": "192.168.0.124:3406", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "5.7.39"
            }, 
            "192.168.0.15:3406": {
                "address": "192.168.0.15:3406", 
                "instanceErrors": [
                    "ERROR: split-brain! Instance is not part of the majority group, but has state ONLINE", 
                    "WARNING: Instance is NOT a PRIMARY but super_read_only option is OFF."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "ONLINE", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "5.7.39"
            }, 
            "192.168.0.16:3406": {
                "address": "192.168.0.16:3406", 
                "instanceErrors": [
                    "ERROR: split-brain! Instance is not part of the majority group, but has state ONLINE", 
                    "WARNING: Instance is NOT a PRIMARY but super_read_only option is OFF."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "ONLINE", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "5.7.39"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "192.168.0.124:3406"
}
WARNING: Using a password on the command line interface can be insecure.
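
The disagreement in the three status outputs above can be checked mechanically. The following is a minimal Python sketch, not part of the original report: it assumes each node's status JSON has already been captured as a string, and the helper names `claimed_primaries` and `is_split_brain` are hypothetical. The sample data is trimmed to the one field the check reads.

```python
import json


def claimed_primaries(status_by_host):
    """Map each reporting host to the primary named in its own status JSON."""
    return {
        host: json.loads(raw)["defaultReplicaSet"]["primary"]
        for host, raw in status_by_host.items()
    }


def is_split_brain(status_by_host):
    """Return True when the reporting hosts disagree on who the primary is."""
    return len(set(claimed_primaries(status_by_host).values())) > 1


# Trimmed stand-ins for the three outputs above (only the field used here).
sample = {
    "192.168.0.15": '{"defaultReplicaSet": {"primary": "192.168.0.15:3406"}}',
    "192.168.0.16": '{"defaultReplicaSet": {"primary": "192.168.0.16:3406"}}',
    "192.168.0.124": '{"defaultReplicaSet": {"primary": "192.168.0.124:3406"}}',
}

print(claimed_primaries(sample))
print(is_split_brain(sample))  # True: each host names itself as primary
```

In a healthy single-primary cluster every host would be expected to report the same `primary` value, so the check would return False.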

How to repeat:
Not clear yet.
[28 Aug 2023 3:07] zetang zeng
Some logs:

192.168.0.15 | CHANGED | rc=0 >>
2023-08-23T15:54:31.504885Z 807548 [Note] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2023-08-23T15:54:31.504898Z 807553 [Note] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log '/data00/mysql/log3406/mysql-relay-bin-group_replication_applier.000023' position: 449633898
2023-08-23T15:54:31.506321Z 0 [Note] Plugin group_replication reported: 'XCom protocol version: 3'
2023-08-23T15:54:31.506434Z 0 [Note] Plugin group_replication reported: 'XCom initialized and ready to accept incoming connections on port 34061'
2023-08-23T15:54:32.516061Z 807556 [Note] Plugin group_replication reported: 'Only one server alive. Declaring this server as online within the replication group'
2023-08-23T15:54:32.516102Z 0 [Note] Plugin group_replication reported: 'Group membership changed to 192.168.0.15:3406 on view 16928060725158441:1.'
2023-08-23T15:54:32.516405Z 0 [Note] Plugin group_replication reported: 'This server was declared online within the replication group'
2023-08-23T15:54:32.516443Z 0 [Note] Plugin group_replication reported: 'A new primary with address 192.168.0.15:3406 was elected, enabling conflict detection until the new primary applies all relay logs.'
2023-08-23T15:54:32.516469Z 807558 [Note] Plugin group_replication reported: 'This server is working as primary member.'
2023-08-23T15:55:02.744008Z 807550 [Note] Plugin group_replication reported: 'Primary had applied all relay logs, disabled conflict detection'
192.168.0.16 | CHANGED | rc=0 >>
2023-08-23T15:54:31.538297Z 10405735 [Note] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2023-08-23T15:54:31.538319Z 10405740 [Note] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log '/data00/mysql/log3406/mysql-relay-bin-group_replication_applier.000002' position: 9838
2023-08-23T15:54:31.540407Z 0 [Note] Plugin group_replication reported: 'XCom protocol version: 3'
2023-08-23T15:54:31.540672Z 0 [Note] Plugin group_replication reported: 'XCom initialized and ready to accept incoming connections on port 34061'
2023-08-23T15:54:32.552962Z 10405743 [Note] Plugin group_replication reported: 'Only one server alive. Declaring this server as online within the replication group'
2023-08-23T15:54:32.553022Z 0 [Note] Plugin group_replication reported: 'Group membership changed to 192.168.0.16:3406 on view 16928060725526791:1.'
2023-08-23T15:54:32.553371Z 0 [Note] Plugin group_replication reported: 'This server was declared online within the replication group'
2023-08-23T15:54:32.553407Z 0 [Note] Plugin group_replication reported: 'A new primary with address 192.168.0.16:3406 was elected, enabling conflict detection until the new primary applies all relay logs.'
2023-08-23T15:54:32.553433Z 10405745 [Note] Plugin group_replication reported: 'This server is working as primary member.'
2023-08-23T15:55:02.800824Z 10405737 [Note] Plugin group_replication reported: 'Primary had applied all relay logs, disabled conflict detection'
192.168.0.124 | CHANGED | rc=0 >>
2023-08-23T15:54:31.527961Z 807190 [Note] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2023-08-23T15:54:31.527976Z 807195 [Note] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log '/data00/mysql/log3406/mysql-relay-bin-group_replication_applier.000029' position: 791690528
2023-08-23T15:54:31.529363Z 0 [Note] Plugin group_replication reported: 'XCom protocol version: 3'
2023-08-23T15:54:31.529671Z 0 [Note] Plugin group_replication reported: 'XCom initialized and ready to accept incoming connections on port 34061'
2023-08-23T15:54:32.539871Z 807198 [Note] Plugin group_replication reported: 'Only one server alive. Declaring this server as online within the replication group'
2023-08-23T15:54:32.539964Z 0 [Note] Plugin group_replication reported: 'Group membership changed to 192.168.0.124:3406 on view 16928060725396621:1.'
2023-08-23T15:54:32.540213Z 0 [Note] Plugin group_replication reported: 'This server was declared online within the replication group'
2023-08-23T15:54:32.540244Z 0 [Note] Plugin group_replication reported: 'A new primary with address 192.168.0.124:3406 was elected, enabling conflict detection until the new primary applies all relay logs.'
2023-08-23T15:54:32.540269Z 807200 [Note] Plugin group_replication reported: 'This server is working as primary member.'
2023-08-23T15:55:03.063001Z 807192 [Note] Plugin group_replication reported: 'Primary had applied all relay logs, disabled conflict detection'
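
The "Group membership changed" lines above can be compared across nodes. Below is a minimal Python sketch, not part of the original report, that extracts the member list and view identifier from such a line. The regular expression and the helper name `parse_view_change` are assumptions based only on the log lines shown here, and splitting the member list on commas is an assumption about how multi-member views are printed.

```python
import re

# Pattern inferred from the log lines in this report, not from plugin source.
VIEW_RE = re.compile(
    r"Group membership changed to (?P<members>.+?) "
    r"on view (?P<prefix>\d+):(?P<seq>\d+)\."
)


def parse_view_change(line):
    """Return (members, view_prefix, view_sequence), or None if no match."""
    match = VIEW_RE.search(line)
    if match is None:
        return None
    members = [part.strip() for part in match.group("members").split(",")]
    return members, match.group("prefix"), int(match.group("seq"))


# The three membership lines quoted from the logs above.
lines = [
    "2023-08-23T15:54:32.516102Z 0 [Note] Plugin group_replication reported: "
    "'Group membership changed to 192.168.0.15:3406 on view 16928060725158441:1.'",
    "2023-08-23T15:54:32.553022Z 0 [Note] Plugin group_replication reported: "
    "'Group membership changed to 192.168.0.16:3406 on view 16928060725526791:1.'",
    "2023-08-23T15:54:32.539964Z 0 [Note] Plugin group_replication reported: "
    "'Group membership changed to 192.168.0.124:3406 on view 16928060725396621:1.'",
]

views = [parse_view_change(line) for line in lines]
for members, prefix, seq in views:
    print(members, prefix, seq)

# Count how many distinct view prefixes the three nodes reported.
print(len({prefix for _, prefix, _ in views}))
```

Each node reports a single-member view with sequence 1 and a different view prefix. Together with the "Only one server alive" messages, this is consistent with each node having formed its own one-member group rather than rejoining a shared one, though the logs alone do not show why that happened.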
[28 Aug 2023 3:10] zetang zeng
192.168.0.15 | CHANGED | rc=0 >>
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 4e35597a-351e-11ee-8498-00163e5c03bd
 MEMBER_HOST: 192.168.0.15
 MEMBER_PORT: 3406
MEMBER_STATE: ONLINE

192.168.0.16 | CHANGED | rc=0 >>
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 4e29a63c-351e-11ee-bccf-00163e62338d
 MEMBER_HOST: 192.168.0.16
 MEMBER_PORT: 3406
MEMBER_STATE: ONLINE

192.168.0.124 | CHANGED | rc=0 >>
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 4e5144a3-351e-11ee-b663-00163e7d1e0e
 MEMBER_HOST: 192.168.0.124
 MEMBER_PORT: 3406
MEMBER_STATE: ONLINE
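
Each node above returns exactly one row, listing only itself. A minimal Python sketch for counting the rows in such `\G`-style output follows. It is not part of the original report: the helper name `parse_vertical_rows` is hypothetical, and the report does not show the query that was run (the column names suggest `performance_schema.replication_group_members`, but that is an assumption).

```python
def parse_vertical_rows(text):
    """Parse mysql vertical (\\G) output into a list of column dicts."""
    rows = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("***") and "row" in stripped:
            # A row separator such as "*** 1. row ***" starts a new record.
            current = {}
            rows.append(current)
        elif current is not None and ":" in stripped:
            # Split on the first colon only, so values may contain colons.
            key, _, value = stripped.partition(":")
            current[key.strip()] = value.strip()
    return rows


# Output captured from 192.168.0.15, as quoted above.
sample_output = """\
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 4e35597a-351e-11ee-8498-00163e5c03bd
 MEMBER_HOST: 192.168.0.15
 MEMBER_PORT: 3406
MEMBER_STATE: ONLINE
"""

rows = parse_vertical_rows(sample_output)
print(len(rows), [row["MEMBER_HOST"] for row in rows])
```

For a three-node group that all members had joined, three rows would be expected on every node, so a row count of one per node is a quick signal that each node sees a group containing only itself.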
[10 Oct 2023 11:02] MySQL Verification Team
Hello zetang zeng,

Thank you for the report and feedback.
I tried multiple times but could not reproduce the issue. Do you have a test case that reproduces it reliably and consistently?
Also, I suggest upgrading to 8.0.34, as MySQL 5.7 will reach EOL this year (Oct 2023).

If you can provide more information, feel free to add it to this bug and change the status back to 'Open'.  Thank you.

Sincerely,
Umesh