Bug #97207 MySQL Group Replication Old incarnation
Submitted: 14 Oct 2019 1:18 Modified: 14 Oct 2019 10:20
Reporter: cui jacky
Status: Verified
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.7.26, 5.7.27 OS:Any
Assigned to: CPU Architecture:Any
Tags: MySQL Group Replication Old incarnation
Triage: Needs Triage: D5 (Feature request)

[14 Oct 2019 1:18] cui jacky
Description:
A MySQL HA environment runs a three-member Group Replication group.
One member went offline for a short time and then attempted to rejoin the group. The rejoin did not succeed.

1) Details from mysql_error.log:

2019-10-10T21:06:54.560209+08:00 0 [Note] Plugin group_replication reported: 'Old incarnation found while trying to add node 21.106.97.239:33061 15707128145198210.'
2019-10-10T21:08:04.462954+08:00 0 [Note] Plugin group_replication reported: 'Old incarnation found while trying to add node 21.106.97.239:33061 15707128844370170.'
2019-10-10T21:09:14.443355+08:00 0 [Note] Plugin group_replication reported: 'Old incarnation found while trying to add node 21.106.97.239:33061 15707129544133390.'
2019-10-10T21:10:24.333606+08:00 0 [Note] Plugin group_replication reported: 'Old incarnation found while trying to add node 21.106.97.239:33061 15707130243001550.'
2019-10-10T21:11:34.675202+08:00 0 [Note] Plugin group_replication reported: 'Old incarnation found while trying to add node 21.106.97.239:33061 15707130946302240.'
2019-10-10T21:12:44.577184+08:00 0 [Note] Plugin group_replication reported: 'Old incarnation found while trying to add node 21.106.97.239:33061 15707131645508930.'
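
The notes above repeat at a regular interval. As a rough aid for measuring that interval, here is a minimal Python sketch that extracts the timestamp, node address, and identifier from such lines and computes the gap between consecutive attempts. The function names and the regular expression are assumptions written for this log excerpt only; they are not part of MySQL or any MySQL tool.

```python
import re
from datetime import datetime

# Pattern written against the log excerpt in this report (an assumption,
# not an official log format specification).
OLD_INCARNATION = re.compile(
    r"^(?P<ts>\S+) \d+ \[Note\] Plugin group_replication reported: "
    r"'Old incarnation found while trying to add node "
    r"(?P<addr>\S+) (?P<uid>\d+)\.'"
)


def parse_old_incarnation(lines):
    """Return (timestamp, address, identifier) for each matching log line."""
    events = []
    for line in lines:
        m = OLD_INCARNATION.match(line.strip())
        if m:
            events.append((datetime.fromisoformat(m["ts"]), m["addr"], m["uid"]))
    return events


def retry_intervals(events):
    """Seconds elapsed between consecutive rejected join attempts."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]


# First two lines of the excerpt above.
sample = [
    "2019-10-10T21:06:54.560209+08:00 0 [Note] Plugin group_replication reported: "
    "'Old incarnation found while trying to add node 21.106.97.239:33061 15707128145198210.'",
    "2019-10-10T21:08:04.462954+08:00 0 [Note] Plugin group_replication reported: "
    "'Old incarnation found while trying to add node 21.106.97.239:33061 15707128844370170.'",
]

events = parse_old_incarnation(sample)
for ts, addr, uid in events:
    print(ts.isoformat(), addr, uid)
print(retry_intervals(events))
```

On the two sample lines the attempts are roughly 70 seconds apart, and each attempt carries a different identifier for the same address.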

2) pstack output for the XCom notification thread:

Thread 37 (Thread 0x2b6a1733f700 (LWP 19245)):
#0  0x00002b6873be0995 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002b6a14784e43 in Gcs_xcom_engine::process (this=0x2b6a441084a0) at /export/home/pb2/build/sb_0-33648028-1555164244.06/mysql-5.7.26/rapid/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_notification.cc:210
#2  0x00002b6a14784ec9 in process_notification_thread (ptr_object=<optimized out>) at /export/home/pb2/build/sb_0-33648028-1555164244.06/mysql-5.7.26/rapid/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_notification.cc:159
#3  0x00002b6873bdce25 in start_thread () from /lib64/libpthread.so.0

4) The MySQL 5.7 Reference Manual states:
 From MySQL 5.7.22, servers are given a unique identifier when they join a group. This enables Group Replication to be aware of the situation where a new incarnation of the same server (with the same address but a new identifier) is trying to join the group while its old incarnation is still listed as a member. The new incarnation is blocked from joining the group until the old incarnation can be removed by a reconfiguration. If Group Replication is stopped and restarted on the server, the member becomes a new incarnation and cannot rejoin until the suspicion times out.
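
To make the rule quoted from the manual concrete, here is a toy Python model of it. This is only an illustration of the documented behavior, not how XCom or the Group Replication plugin is implemented; the class `Group` and its methods are hypothetical names.

```python
class Group:
    """Toy membership view: maps member address -> incarnation identifier."""

    def __init__(self):
        self.members = {}

    def try_join(self, address, uid):
        """Admit the member unless an old incarnation of it is still listed."""
        current = self.members.get(address)
        if current is not None and current != uid:
            # Same address, different identifier: the old incarnation is
            # still a member, so the new one is blocked.
            return False
        self.members[address] = uid
        return True

    def remove(self, address):
        """Model the reconfiguration that drops a member from the group."""
        self.members.pop(address, None)


group = Group()
group.try_join("21.106.97.239:33061", "15707128145198210")

# The server restarts Group Replication and comes back with a new identifier.
blocked = group.try_join("21.106.97.239:33061", "15707128844370170")

# Once the old incarnation has been removed, the join goes through.
group.remove("21.106.97.239:33061")
admitted = group.try_join("21.106.97.239:33061", "15707128844370170")
print(blocked, admitted)
```

In this model the rejoin can only succeed after the old entry is removed, which matches the manual's statement that the new incarnation is blocked until a reconfiguration removes the old one.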

5) The system variable "group_replication_member_expel_timeout" appears to fix this problem:
 [Available in 8.0] https://dev.mysql.com/doc/refman/8.0/en/group-replication-options.html#sysvar_group_replic...
 [Not available in 5.7] https://dev.mysql.com/doc/refman/5.7/en/group-replication-options.html

How to repeat:
The environment:
CentOS 7.4
MySQL version 5.7.26
Members: three
One master

Scenario:
  A network problem causes a member to leave the group.

Suggested fix:
Add the 8.0 Group Replication parameter "group_replication_member_expel_timeout" to 5.7,
or
handle this scenario automatically.
[14 Oct 2019 10:20] Umesh Shastry
Hello cui jacky,

Thank you for the report.
Observed this while testing on a 5.7.27 build.
This behavior is well documented, and the node did join the group during my tests; the only effect was that the note you mentioned was added to the primary's error log. I therefore assume you are requesting a backport of the "group_replication_member_expel_timeout" option to 5.7.

regards,
Umesh