Bug #86621 GCS report Warning log in a dead loop
Submitted: 8 Jun 2017 11:39 Modified: 15 Nov 2017 18:03
Reporter: Jackin Chan Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Group Replication Severity:S3 (Non-critical)
Version:5.7.18 OS:Linux (CentOS 6.6 x86_64)
Assigned to: CPU Architecture:Any

[8 Jun 2017 11:39] Jackin Chan
Description:
GCS warning log will grow up forever and make mysql error log very huge over a period of time !

Warning log on 10.202.44.227 (one node belong the mgr which run on single primary node and this node is a slave node):

2017-06-01T07:19:34.246980+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.246996+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247028+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247043+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247074+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247090+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247121+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247136+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247167+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247193+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247227+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247243+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247274+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247290+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247320+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247336+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247367+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247382+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247413+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247428+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247459+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247475+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247505+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247522+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247553+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247569+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247599+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247615+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247646+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247661+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247692+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-01T07:19:34.247708+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-01T07:19:34.247739+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'

....more and more

How to repeat:
I don't know how to repeat and I don't know what had cause this.

gr view(replication_group_members) on 10.202.44.227:

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST   | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| group_replication_applier | 6c41a65b-4115-11e7-802e-d4ae52ab91b3 | 10.202.44.237 |       24801 | UNREACHABLE  |
| group_replication_applier | d207a9ad-4119-11e7-b3dd-90e2bac5d924 | 10.202.44.227 |       24801 | ONLINE       |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
2 rows in set (0.00 sec)

When I run "stop group_replication" on  10.202.44.227:24801, the warning log disappear...

2017-06-08T10:15:53.752731+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-08T10:15:53.752747+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-08T10:15:53.752775+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-08T10:15:53.752790+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-08T10:15:53.752842+08:00 0 [Note] Plugin group_replication reported: 'state 4330 action xa_terminate'
2017-06-08T10:15:53.753751+08:00 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-06-08T10:15:53.753771+08:00 0 [Note] Plugin group_replication reported: 'state 4257 action xa_exit'
2017-06-08T10:15:53.753824+08:00 0 [Note] Plugin group_replication reported: 'Exiting xcom thread'
2017-06-08T10:15:53.753838+08:00 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-06-08T10:15:53.753985+08:00 0 [Warning] Plugin group_replication reported: 'read failed'
2017-06-08T10:15:59.297678+08:00 114 [Note] Plugin group_replication reported: 'auto_increment_increment is reset to 1'
2017-06-08T10:15:59.297723+08:00 114 [Note] Plugin group_replication reported: 'auto_increment_offset is reset to 1'
2017-06-08T10:15:59.298245+08:00 64 [Warning] Slave SQL for channel 'group_replication_applier': Coordinator thread of multi-threaded slave is being stopped in the middle of assigning a group of events; deferring to exit until the group completion ... , Error_code: 0
2017-06-08T10:15:59.298311+08:00 64 [Note] Slave SQL thread for channel 'group_replication_applier' exiting, replication stopped in log 'FIRST' at position 894
2017-06-08T10:15:59.300937+08:00 61 [Note] Plugin group_replication reported: 'The group replication applier thread was killed'

It seems that the GCS run in a deadloop !
[19 Jun 2017 3:01] Jackin Chan
today we meet this situation again:

> ls -lha mysql.err
-rw-r----- 1 mysql mysql 8.2G Jun 19 10:40 mysql.err

size of error log is 8.2G

> tail -1000f mysql.err

2017-06-19T10:40:00.609398+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-19T10:40:00.609414+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'
2017-06-19T10:40:00.609424+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 0.0.0.0 refused. Address is not in the IP whitelist.'
2017-06-19T10:40:00.609440+08:00 0 [Warning] Plugin group_replication reported: '[GCS] Unable to handle socket descriptor, therefore refusing connection.'

....more and more
[28 Jun 2017 18:49] MySQL Verification Team
Hi,

Tanks for bug report, it is verified as stated. The only solution at the moment is to use logrotate to rotate the log and keep it small. 

Thanks
Bogdan Kecman
[29 Jun 2017 10:47] Tiago Jorge
Posted by developer:
 
At a glance, we could remove half of the log and maintain only "[GCS] Unable to handle socket descriptor, therefore refusing connection."

Nevertheless, we need a reproducible test case to check if the problem is deeper.
[29 Jun 2017 10:56] Tiago Jorge
In order to help on a deeper analysis on this issue, could you please provide more details on your scenario and what are you trying to accomplish? (physical setup, number of nodes, configurations, test procedure, etc)
[3 Jul 2017 2:51] Jackin Chan
I'm sorry I don't know how to repeat this

The group has 3 node, and one of the node's GR config :

#*******************************  Group Replication related settings **********************
transaction_write_set_extraction=XXHASH64
loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
loose-group_replication_start_on_boot=off
loose-group_replication_local_address= "10.202.44.227:24901"
loose-group_replication_group_seeds= "10.202.44.237:24901,10.202.44.225:24901,10.202.44.227:24901"
loose-group_replication_bootstrap_group=off
loose-group_replication_single_primary_mode=true
loose-group_replication_enforce_update_everywhere_checks= false

report_host=10.202.44.227
super_read_only=1
[31 Aug 2017 8:59] Tiago Jorge
Posted by developer:
 
As stated in my previous comment and only by code inspection, we can cut the log by half, but we need a reproducible test case in order to check if the issue has been corrected.

Did the issue happen again?