Bug #84728 Not Possible To Avoid MySQL From Starting When GR Cannot Start
Submitted: 31 Jan 2017 8:22
Modified: 11 Jul 2017 11:08
Reporter: Kenny Gryp
Status: Closed
Category: MySQL Server: Group Replication
Severity: S2 (Serious)
Version: 5.7.17
OS: Any
Assigned to:
CPU Architecture: Any

[31 Jan 2017 8:22] Kenny Gryp
Description:

When a node (in non-bootstrap mode) cannot join an existing cluster, it eventually times out and gives the following error:

2017-01-27T23:16:14.195665Z 0 [ERROR] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 24901'
2017-01-27T23:17:14.103608Z 3 [ERROR] Plugin group_replication reported: 'Timeout on wait for view after joining group'

After the error/timeout, mysql just starts as a single node. 

How to repeat:
Start mysql with some invalid group_replication_group_seeds IP addresses. GR will time out and mysql will just start.

Suggested fix:
I want to be able to change the timeout. Can I somehow?
Actually, I don't want MySQL to start at all, so that I avoid split-brain situations.

I recommend starting GR with group_replication_start_on_boot=ON to try to make GR behave like a CP system (CAP theorem). Hence this request.

If MySQL just starts, I would not call GR partition tolerant, as it's actually just going to accept reads and writes and create a split brain.
[31 Jan 2017 8:24] Kenny Gryp
Also, during my tests, nothing was reported in the error log of the node that lost quorum.
[31 Jan 2017 8:24] Kenny Gryp
.
[31 Jan 2017 11:35] Umesh Shastry
Hello Kenny Gryp,

Thank you for the report.

Thanks,
Umesh
[31 Jan 2017 16:21] Nuno Carvalho
Posted by developer:
 
Hi Kenny,

Thank you for your suggestion, we will look into it.

Best regards,
Nuno Carvalho
[3 Feb 2017 14:17] Pedro Gomes
Hi Kenny,

Is 
https://dev.mysql.com/doc/refman/5.7/en/server-plugin-loading.html#server-plugin-activatin...
a possible solution for this? 

In theory the use of the option

 --plugin_name=FORCE

is what you want, no?
[3 Feb 2017 15:33] Kenny Gryp
Hi Pedro,

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

server_id=3
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
relay-log=gr-3-relay-bin
binlog_checksum=NONE
log_slave_updates=ON
log_bin=binlog
binlog_format=ROW
group_replication=FORCE
transaction_write_set_extraction=XXHASH64
group_replication_group_name="da7aba5e-dead-da7a-ba55-da7aba5e57ab"
group_replication_start_on_boot=on
#super_read_only=1
group_replication_local_address= "gr-3:24901"
group_replication_group_seeds= "gr-1:24901,gr-2:24901,gr-3:24901"
group_replication_bootstrap_group= off

2017-02-03T15:29:18.689714Z 0 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.7.17-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)
2017-02-03T15:29:18.805708Z 3 [Note] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2017-02-03T15:29:18.806061Z 3 [Note] Plugin group_replication reported: '[GCS] Added automatically IP ranges 10.0.2.15/24,127.0.0.1/8,192.168.56.4/24 to the whitelist'
2017-02-03T15:29:18.807743Z 3 [Note] Plugin group_replication reported: '[GCS] Translated 'gr-3' to 192.168.56.4'
2017-02-03T15:29:18.807783Z 3 [ERROR] Plugin group_replication reported: '[GCS] There is no local IP address matching the one configured for the local node (gr-3:24901).'
2017-02-03T15:29:18.807798Z 3 [ERROR] Plugin group_replication reported: 'Unable to initialize the group communication engine'
2017-02-03T15:29:18.807803Z 3 [Note] Plugin group_replication reported: 'Requesting to leave the group despite of not being a member'
2017-02-03T15:29:18.807805Z 3 [ERROR] Plugin group_replication reported: 'Error calling group communication interfaces while trying to leave the group'

Group replication failed to start, but the plugin is still loaded:

SHOW PLUGINS;
+----------------------------+----------+--------------------+----------------------+---------+
| Name                       | Status   | Type               | Library              | License |
+----------------------------+----------+--------------------+----------------------+---------+
| binlog                     | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| mysql_native_password      | ACTIVE   | AUTHENTICATION     | NULL                 | GPL     |
| sha256_password            | ACTIVE   | AUTHENTICATION     | NULL                 | GPL     |
| CSV                        | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| MEMORY                     | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| InnoDB                     | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| INNODB_TRX                 | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_LOCKS               | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_LOCK_WAITS          | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_CMP                 | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_CMP_RESET           | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_CMPMEM              | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_CMPMEM_RESET        | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_CMP_PER_INDEX       | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_CMP_PER_INDEX_RESET | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_BUFFER_PAGE         | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_BUFFER_PAGE_LRU     | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_BUFFER_POOL_STATS   | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_TEMP_TABLE_INFO     | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_METRICS             | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_FT_DEFAULT_STOPWORD | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_FT_DELETED          | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_FT_BEING_DELETED    | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_FT_CONFIG           | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_FT_INDEX_CACHE      | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_FT_INDEX_TABLE      | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_SYS_TABLES          | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_SYS_TABLESTATS      | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_SYS_INDEXES         | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_SYS_COLUMNS         | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_SYS_FIELDS          | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_SYS_FOREIGN         | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_SYS_FOREIGN_COLS    | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_SYS_TABLESPACES     | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_SYS_DATAFILES       | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_SYS_VIRTUAL         | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| MyISAM                     | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| MRG_MYISAM                 | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| PERFORMANCE_SCHEMA         | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| ARCHIVE                    | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| BLACKHOLE                  | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| FEDERATED                  | DISABLED | STORAGE ENGINE     | NULL                 | GPL     |
| partition                  | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| ngram                      | ACTIVE   | FTPARSER           | NULL                 | GPL     |
| group_replication          | ACTIVE   | GROUP REPLICATION  | group_replication.so | GPL     |
| validate_password          | ACTIVE   | VALIDATE PASSWORD  | validate_password.so | GPL     |
+----------------------------+----------+--------------------+----------------------+---------+
46 rows in set (0.00 sec)

so unfortunately `group_replication=FORCE` does not work, no beer for you yet :(
[25 May 2017 14:40] Nuno Carvalho
Hi Kenny,

For several reasons we cannot prevent MySQL from starting when Group
Replication fails to start. However, in a future release it will be
possible to start Group Replication on server start (and through a SQL
command) with super_read_only=1.
Group Replication will only reset super_read_only after a
successful start. That is, if an error happens, such as a network
failure when joining the group, writes stay disabled by
super_read_only=1.

So to protect against your scenario, you will need to add
  super_read_only=1
to your servers' configuration files.

Best regards,
Nuno Carvalho
[21 Jun 2017 0:11] Kenny Gryp
As additional information on Nuno's comment: the bug related to starting MySQL GR with `super_read_only=1` is https://bugs.mysql.com/bug.php?id=84733.
That bug mentions that this is fixed in 8.0.2.
[21 Jun 2017 11:30] Nuno Carvalho
Hi Kenny,

Both apply to 5.7.

Best regards,
Nuno Carvalho
[11 Jul 2017 11:08] Erlend Dahl
[29 Jun 2017 9:48] David Moss (DMOSS)

Thanks for your feedback, this is fixed in upcoming versions and the
following was added to the 5.7.19 and 8.0.2 change logs:
In the event that a member failed to join a group the member was not stopping
and continued to accept transactions. To avoid this set your members to have
super_read_only=1 in the my.cfg file. Group Replication now checks for this
setting upon successful start up and sets super_read_only=0. This ensures
that members which do not successfully join a group cannot accept
transactions.
----
Additionally, the following pages were updated to mention the new behavior:

https://dev.mysql.com/doc/refman/5.7/en/group-replication-adding-instances.html
https://dev.mysql.com/doc/refman/8.0/en/group-replication-adding-instances.html
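
Putting the resolution above into an option file, a minimal sketch might look like the fragment below. It assumes a server at 5.7.19 / 8.0.2 or later (the releases named in the change log entry) and shows only the two settings this thread discusses. The remaining Group Replication options would stay as in the configuration posted earlier in this report.

```ini
[mysqld]
# Start the server read-only, so that a member which fails to join the
# group cannot accept writes.
super_read_only=1

# Let Group Replication start together with the server. Per the change log
# entry above, the plugin sets super_read_only=0 only after a successful start.
group_replication_start_on_boot=ON
```

With this in place, a failed join should leave the server running but read-only. Running `SELECT @@global.super_read_only;` on the member is one way to confirm which state it ended up in.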