Description:
In a 5-node cluster where every node has group_replication_start_on_boot=ON, if all nodes shut down and restart at the same time, executing `STOP GROUP_REPLICATION` on each node blocks for a long time.
When group_replication_start_on_boot is persisted, the server invokes `plugin_group_replication_start` internally on restart, and that call holds the lv.plugin_running_mutex lock. This is why `STOP GROUP_REPLICATION` blocks.
How to repeat:
1. First, initialize 5 MySQL servers and start them with mysqld_safe:
./bin/mysqld --defaults-file=my13000.cnf --user=mysql --initialize-insecure
./bin/mysqld --defaults-file=my13001.cnf --user=mysql --initialize-insecure
./bin/mysqld --defaults-file=my13002.cnf --user=mysql --initialize-insecure
./bin/mysqld --defaults-file=my13003.cnf --user=mysql --initialize-insecure
./bin/mysqld --defaults-file=my13004.cnf --user=mysql --initialize-insecure
./bin/mysqld_safe --defaults-file=my13000.cnf --user=mysql &
./bin/mysqld_safe --defaults-file=my13001.cnf --user=mysql &
./bin/mysqld_safe --defaults-file=my13002.cnf --user=mysql &
./bin/mysqld_safe --defaults-file=my13003.cnf --user=mysql &
./bin/mysqld_safe --defaults-file=my13004.cnf --user=mysql &
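The ten commands in step 1 differ only in the port number, so they can be generated by a loop. This is a minimal sketch that only prints the commands; the function name `print_cluster_commands` is made up for illustration, and the config file names are the ones used above.

```shell
#!/bin/sh
# Print (do not run) the initialize and start commands for each instance.
# Config file names follow the report: my13000.cnf .. my13004.cnf.
print_cluster_commands() {
  # First initialize every data directory...
  for port in 13000 13001 13002 13003 13004; do
    echo "./bin/mysqld --defaults-file=my${port}.cnf --user=mysql --initialize-insecure"
  done
  # ...then start every instance under mysqld_safe in the background.
  for port in 13000 13001 13002 13003 13004; do
    echo "./bin/mysqld_safe --defaults-file=my${port}.cnf --user=mysql &"
  done
}

print_cluster_commands
```

Review the printed commands before running them from the MySQL installation directory.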
2. Build the 5 nodes into a Group Replication cluster, then on each node run `set persist group_replication_start_on_boot=on`.
Connect to each of 13000-13004:
mysql> install plugin group_replication soname 'group_replication.so';
Query OK, 0 rows affected (0.01 sec)
mysql> CHANGE MASTER TO MASTER_USER="root", MASTER_PASSWORD="" FOR CHANNEL "group_replication_recovery";
Query OK, 0 rows affected, 1 warning (0.03 sec)
mysql> reset master;
Query OK, 0 rows affected (0.05 sec)
// only in 13000
mysql> SET GLOBAL group_replication_bootstrap_group=ON;
Query OK, 0 rows affected (0.00 sec)
mysql> start group_replication;
Query OK, 0 rows affected (33.75 sec)
mysql> set persist group_replication_start_on_boot=on;
Query OK, 0 rows affected (0.00 sec)
3. Group Replication now works normally:
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 04828891-12aa-11eb-9a50-c8f7507e5048 | *** | 13004 | ONLINE | SECONDARY | 8.0.21 |
| group_replication_applier | a898ca67-c6fb-11ea-9567-c8f7507e5048 | *** | 13000 | ONLINE | PRIMARY | 8.0.21 |
| group_replication_applier | ad672129-c6fb-11ea-a1ad-c8f7507e5048 | *** | 13001 | ONLINE | SECONDARY | 8.0.21 |
| group_replication_applier | b2da5323-c6fb-11ea-9186-c8f7507e5048 | *** | 13002 | ONLINE | SECONDARY | 8.0.21 |
| group_replication_applier | ffa85f73-12a9-11eb-8f52-c8f7507e5048 | *** | 13003 | ONLINE | SECONDARY | 8.0.21 |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+-------------+----------------+
5 rows in set (0.00 sec)
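The health check in step 3 can also be scripted, for example to confirm the group is fully formed before killing the servers. This is a rough sketch, not part of the original reproduction. It assumes the instances accept TCP connections on 127.0.0.1 as root with an empty password (consistent with `--initialize-insecure`, but not stated in the report). The `MYSQL` variable and the `online_members` function are hypothetical names.

```shell
#!/bin/sh
# Count ONLINE group members as seen from one node.
# Assumptions (not from the report): TCP on 127.0.0.1, root with empty password.
MYSQL="${MYSQL:-mysql}"   # client binary; can be overridden

online_members() {
  # -N drops the column header, -B gives plain batch output
  "$MYSQL" -uroot -h127.0.0.1 -P"$1" -N -B -e \
    "SELECT COUNT(*) FROM performance_schema.replication_group_members WHERE MEMBER_STATE = 'ONLINE'"
}

# Usage: sh online_members.sh 13000
if [ "$#" -ge 1 ]; then
  online_members "$1"
fi
```

With the group in the state shown in the table above, the count reported by any node should be 5.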
4. Kill all 5 nodes; mysqld_safe restarts them:
# kill -9 22791 23203 23794 25090 25452
2020-10-20T08:02:04.712271Z mysqld_safe Number of processes running now: 0
2020-10-20T08:02:04.716119Z mysqld_safe mysqld restarted
2020-10-20T08:02:04.723246Z mysqld_safe Number of processes running now: 0
2020-10-20T08:02:04.725689Z mysqld_safe mysqld restarted
2020-10-20T08:02:04.728540Z mysqld_safe Number of processes running now: 0
2020-10-20T08:02:04.730970Z mysqld_safe mysqld restarted
2020-10-20T08:02:04.742896Z mysqld_safe Number of processes running now: 0
2020-10-20T08:02:04.745009Z mysqld_safe mysqld restarted
2020-10-20T08:02:04.762546Z mysqld_safe Number of processes running now: 0
2020-10-20T08:02:04.765004Z mysqld_safe mysqld restarted
5. Connect to 13000; `stop group_replication` blocks for a long time:
mysql> stop group_replication;
Query OK, 0 rows affected (26 min 22.48 sec)
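Since the description says the statement blocks on every node, it may help to time it on all five at once instead of only on 13000. The following is a sketch under the same assumptions as the manual steps, plus some that are not in the report: TCP connections to 127.0.0.1 as root with an empty password. The `time_stop_gr` function and the `MYSQL` variable are illustrative names only.

```shell
#!/bin/sh
# Time STOP GROUP_REPLICATION on each given port and print one line per port.
# Assumptions (not from the report): TCP on 127.0.0.1, root with empty password.
MYSQL="${MYSQL:-mysql}"   # client binary; can be overridden

time_stop_gr() {
  port="$1"
  start=$(date +%s)
  "$MYSQL" -uroot -h127.0.0.1 -P"$port" -e 'STOP GROUP_REPLICATION' >/dev/null 2>&1
  rc=$?
  end=$(date +%s)
  echo "port=$port exit=$rc elapsed=$((end - start))s"
}

# Usage: sh time_stop_gr.sh 13000 13001 13002 13003 13004
# Each port is handled in the background so the waits overlap.
for port in "$@"; do
  time_stop_gr "$port" &
done
wait
```

Run it right after the restart in step 4; the elapsed value for each port shows how long that node's statement was blocked.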