Bug #92981 | Potential connection leak in group replication | ||
---|---|---|---|
Submitted: | 29 Oct 2018 4:33 | Modified: | 14 Oct 2019 11:15 |
Reporter: | Tony Wen | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Group Replication | Severity: | S2 (Serious) |
Version: | 5.7.22 | OS: | Oracle Linux |
Assigned to: | MySQL Verification Team | CPU Architecture: | Any |
[29 Oct 2018 4:33]
Tony Wen
[29 Oct 2018 4:45]
Tony Wen
mysql error logs
Attachment: mysql-bug-92981-log.zip (application/zip, text), 504.65 KiB.
[29 Oct 2018 17:19]
MySQL Verification Team
Hi Tony,

I have seen a number of reports of network-related issues with MySQL on Kubernetes and with MySQL running on Azure, but nothing we can reproduce with regular servers or regular VMs and our binaries. Were you able to reproduce the problem on a locally run Kubernetes (using minikube, for example), or only on a third-party VM system (Google Cloud, for example)? If so, can you prepare a system that reproduces your problem and that I can deploy locally, so I can test what's going on? With the simple setup I made, I can't reproduce this.

Thanks
Bogdan
[30 Oct 2018 1:03]
Tony Wen
Hi Bogdan,

I didn't experience this issue in my local K8S environment. The issue was seen in a K8S cluster created with OCI OKE. I will still try to reproduce it, though. Do you know in which situation the primary node gives this notification: "Old incarnation found while trying to add node ..."? I want to simulate a similar situation and see if the issue can be reproduced.

Thanks!
Tony
[14 Oct 2019 11:15]
MySQL Verification Team
Hi,

You might want to look at related bug #97207.

This is actually a change from 5.7.22 and it's documented:

[quote]
It is possible for a member to go offline for a short time, then attempt to rejoin the group again before the failure detection mechanism has detected its failure, and before the group has been reconfigured to remove the member. In this situation, the rejoining member forgets its previous state, but if other members send it messages that are intended for its pre-crash state, this can cause issues including possible data inconsistency. If a member in this situation participates in XCom's consensus protocol, it could potentially cause XCom to deliver different values for the same consensus round, by making a different decision before and after failure.

To counter this possibility, from MySQL 5.7.22, servers are given a unique identifier when they join a group. This enables Group Replication to be aware of the situation where a new incarnation of the same server (with the same address but a new identifier) is trying to join the group while its old incarnation is still listed as a member. The new incarnation is blocked from joining the group until the old incarnation can be removed by a reconfiguration. If Group Replication is stopped and restarted on the server, the member becomes a new incarnation and cannot rejoin until the suspicion times out.
[/quote]

More here: https://dev.mysql.com/doc/refman/5.7/en/group-replication-group-membership.html

8.0 has an option https://dev.mysql.com/doc/refman/8.0/en/group-replication-options.html#sysvar_group_replic... to control this; there is no such thing in 5.7 for now.
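
The incarnation check described in the quoted documentation can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Group Replication or XCom code: the names `Incarnation`, `Group`, `try_join` and `expel`, and the address `mysql-0:33061`, are hypothetical and chosen for illustration. It shows a rejoining server with the same address but a new identifier being refused until the old incarnation has been removed from the membership.

```python
import itertools
from dataclasses import dataclass

# Hypothetical source of unique identifiers; the real server generates
# its identifier differently. This is only a stand-in for the idea.
_uid_counter = itertools.count(1)


@dataclass(frozen=True)
class Incarnation:
    """One 'life' of a server: same address, fresh identifier per join."""
    address: str
    uid: int


def new_incarnation(address: str) -> Incarnation:
    """Create a new incarnation for the given address."""
    return Incarnation(address, next(_uid_counter))


class Group:
    """Toy membership view keyed by server address."""

    def __init__(self) -> None:
        self.members: dict[str, Incarnation] = {}

    def try_join(self, inc: Incarnation) -> bool:
        """Admit the incarnation unless an older one is still listed."""
        old = self.members.get(inc.address)
        if old is not None and old.uid != inc.uid:
            # Same address, different identifier: old incarnation found,
            # so the join is blocked until the group is reconfigured.
            return False
        self.members[inc.address] = inc
        return True

    def expel(self, address: str) -> None:
        """Model the reconfiguration that removes a suspected member."""
        self.members.pop(address, None)


if __name__ == "__main__":
    group = Group()
    first = new_incarnation("mysql-0:33061")
    print(group.try_join(first))    # first join is admitted

    second = new_incarnation("mysql-0:33061")
    print(group.try_join(second))   # blocked while the old incarnation is listed

    group.expel("mysql-0:33061")    # old incarnation removed by reconfiguration
    print(group.try_join(second))   # the new incarnation can now join
```

In this model the rejoin only succeeds after `expel` runs, which mirrors the documented behavior that a restarted member cannot rejoin until the suspicion on its old incarnation times out and the group removes it.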