| Bug #109615 | Timeout on wait for view after joining group | | |
|---|---|---|---|
| Submitted: | 13 Jan 2023 7:14 | Modified: | 17 Jan 2023 18:22 |
| Reporter: | zetang zeng | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server: Group Replication | Severity: | S3 (Non-critical) |
| Version: | 5.7.40 | OS: | Any |
| Assigned to: | MySQL Verification Team | CPU Architecture: | Any |
[13 Jan 2023 13:44]
MySQL Verification Team
Hi, Can you please let us know whether you have set up all your servers to be fully ACID-compliant? Please read our Manual on how to configure your OS and InnoDB in order to be 100% ACID-compliant. Also, repeat your experiment after the full ACID setup is made, including the strictest InnoDB log flushing, full OS flushing, fsync/sync, complete disabling of the OS, filesystem, and disk caches, etc. You are welcome to contact us again after you have configured ACID compliance as described above.
[14 Jan 2023 4:36]
zetang zeng
Is there a link to the 'Manual on how to configure your OS and InnoDB in order to be 100% ACID compliant'? And doesn't this issue only have to do with Group Replication? Why check the OS/InnoDB configuration?
[17 Jan 2023 18:22]
MySQL Verification Team
Hi, This has nothing to do with ACID; apologies for the misinformation. In any case, this behavior is not a bug: when you reboot all servers at once, you cause a full cluster crash, and there is no automatic recovery from that. You need to manually recover the system from such a scenario. Thank you for using MySQL
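For illustration, a minimal sketch of one manual recovery path (bootstrapping the group from a single member, per the Group Replication documentation), run from MySQL Shell; the account and host:port below are taken from the report's topology and are assumptions, not part of the original thread:

```js
// Connect to the member with the most complete transaction set
// (compare GTID_EXECUTED across members first).
shell.connect('root@192.168.3.158:3406')

// Bootstrap a new group from this member only.
session.runSql("SET GLOBAL group_replication_bootstrap_group = ON")
session.runSql("START GROUP_REPLICATION")
session.runSql("SET GLOBAL group_replication_bootstrap_group = OFF")

// On each remaining member, a plain START GROUP_REPLICATION rejoins the group.
```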
[17 Jan 2023 19:05]
MySQL Verification Team
Additional info:
- There must be one server capable of bootstrapping the group. See the documentation on "configuring instances", where this is explained in detail (https://dev.mysql.com/doc/refman/5.7/en/group-replication-configuring-instances.html).
- Since you are using MySQL Shell, you can use rebootClusterFromCompleteOutage() (https://dev.mysql.com/doc/mysql-shell/8.0/en/reboot-outage.html); see the sketch below.

Thank you for using MySQL
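A minimal MySQL Shell sketch of the suggested recovery path; the cluster name myCluster and the connection target come from the status output in the report, and the root account is an assumption:

```js
// Connect to any instance that was previously part of the cluster.
shell.connect('root@192.168.3.158:3406')

// Rebuild the cluster from its stored metadata; the connected
// instance is used as the seed for the restored group.
var cluster = dba.rebootClusterFromCompleteOutage('myCluster')

// Confirm all members came back ONLINE.
cluster.status()
```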

Description:
Deploy a three-node cluster and check the cluster status: all is fine. Then restart all three nodes at the same time; every node reports log entries like the following and cannot recover automatically.

```
2023-01-11T03:41:27.393871Z 0 [ERROR] Plugin group_replication reported: '[GCS] Error on opening a connection to 10.0.0.253:34061 on local port: 34061.'
2023-01-11T03:41:27.393982Z 0 [ERROR] Plugin group_replication reported: '[GCS] Error on opening a connection to 10.0.0.254:34061 on local port: 34061.'
2023-01-11T03:41:27.393989Z 0 [ERROR] Plugin group_replication reported: '[GCS] Error connecting to all peers. Member join failed. Local port: 34061'
2023-01-11T03:41:27.394130Z 0 [Warning] Plugin group_replication reported: 'read failed'
2023-01-11T03:41:27.399035Z 0 [ERROR] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 34061'
2023-01-11T03:41:43.808426Z 105 [Note] Got an error reading communication packets
2023-01-11T03:42:13.809204Z 286 [Note] Got an error reading communication packets
2023-01-11T03:42:27.371795Z 2 [ERROR] Plugin group_replication reported: 'Timeout on wait for view after joining group'
2023-01-11T03:42:27.371820Z 2 [Note] Plugin group_replication reported: 'Requesting to leave the group despite of not being a member'
2023-01-11T03:42:27.371830Z 2 [ERROR] Plugin group_replication reported: '[GCS] The member is leaving a group without being on one.'
```

How to repeat:
- Deploy a three-node cluster on ip1, ip2, ip3
- Check the status:

```json
{
    "clusterName": "myCluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "192.168.3.158:3406",
        "ssl": "REQUIRED",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "192.168.3.156:3406": {
                "address": "192.168.3.156:3406",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE",
                "version": "5.7.39"
            },
            "192.168.3.157:3406": {
                "address": "192.168.3.157:3406",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE",
                "version": "5.7.39"
            },
            "192.168.3.158:3406": {
                "address": "192.168.3.158:3406",
                "memberRole": "PRIMARY",
                "mode": "R/W",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE",
                "version": "5.7.39"
            }
        },
        "topologyMode": "Single-Primary"
    },
    "groupInformationSourceMember": "192.168.3.158:3406"
}
```

- Open three terminals and connect to ip1, ip2, ip3 separately
- Paste "systemctl restart mysqld" into all three terminals so that all MySQL instances restart at the same time
- All nodes restart, but the cluster cannot recover automatically

Suggested fix:
The cluster should recover automatically.
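As an aside, the stuck state after the simultaneous restart can be confirmed from any node before attempting recovery; a sketch from MySQL Shell, where the connection target is taken from the status output above and the account is an assumption:

```js
// After the full-cluster restart, each member should report itself
// OFFLINE (or stuck in RECOVERING) rather than ONLINE.
shell.connect('root@192.168.3.156:3406')
session.runSql(
  "SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE " +
  "FROM performance_schema.replication_group_members")
```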