Description:
Scenario: When a server is started with plugin-load='group_replication.so' and no other Group Replication options, an ASYNC channel cannot be started, and cannot be stopped if it was started together with the server. This happens because group_replication_start_on_boot defaults to ON. The slave state shown is "Waiting for the next event in relay log".
Observations here:-
1) After server start, create an ASYNC channel and execute 'start slave'. It hangs.
2) After server start (if an ASYNC channel is already created), execute 'stop slave'. It hangs.
3) While the commands are hung, try to kill the connections shown in the processlist. That occasionally hangs too.
4) While the commands are hung, execute 'SHUTDOWN'. This closes the socket, but the mysqld server process still shows as running.
5) Even 'SET group_replication_start_on_boot=OFF', executed after server start, does not help.
So, in short, we need to run 'kill -9 <processid>' to stop the mysqld server.
====================
mysql> show slave status\G
.....
.....
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
.....
Slave_SQL_Running_State: Waiting for the next event in relay log
.....
Channel_Name: ch1
.....
mysql> show processlist;
+----+-----------------+-----------+------+---------+------+-----------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------------+-----------+------+---------+------+-----------------------------------------+------------------+
| 4 | system user | | NULL | Query | 90 | Waiting for the next event in relay log | NULL |
| 5 | system user | | NULL | Connect | 90 | Waiting for master update | NULL |
| 6 | event_scheduler | localhost | NULL | Daemon | 90 | Waiting on empty queue | NULL |
| 10 | root | localhost | NULL | Query | 0 | starting | show processlist |
+----+-----------------+-----------+------+---------+------+-----------------------------------------+------------------+
4 rows in set (0.00 sec)
mysql> stop slave; ## HANG HERE
====================
How to repeat:
Steps to repro:-
================
$ mkdir -p mysql-test/var/mysqld.1/data mysql-test/var/log mysql-test/var/tmp/mysqld.1
$ $PWD/bin/mysqld --no-defaults --datadir=$PWD/mysql-test/var/mysqld.1/data --basedir=$PWD --log-error=$PWD/mysql-test/var/log/mysqld.1.err --initialize-insecure --core-file 2>&1 &
$ ./bin/mysqld --defaults-file=./naren_scripts/test_empty.cnf --basedir=$PWD --datadir=$PWD/mysql-test/var/mysqld.1/data --socket=/tmp/mysqld.1.sock --report-host=localhost --log-error=$PWD/mysql-test/var/log/mysqld.1.err --server-id=1 --core-file 2>&1 &
Where,
$ cat ./naren_scripts/test_empty.cnf
[mysqld]
report-host= 127.0.0.1
report-user= root
log-error-verbosity= 3
plugin-dir='/mysql-8.0/plugin_output_directory/'
plugin-load='group_replication.so'
# Server1
log-bin= server1
relay-log= server1-relay-log
server-id= 1
port= 14000
mysqlx-port= 33060
mysqlx-socket= /tmp/mysqlx.1.sock
On server:-
$ ./bin/mysql -uroot -S/tmp/mysqld.1.sock
mysql> select * from performance_schema.replication_group_members;
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | | | NULL | OFFLINE | | |
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+
1 row in set (0.04 sec)
mysql> change master to master_host='localhost', master_user='root', master_port=14001 for channel 'ch1';
mysql> start slave; ## This will hang.
Suggested fix:
Workaround:-
If only plugin-load='group_replication.so' is set, start the server with "--loose-group_replication_start_on_boot=OFF".
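The same workaround can presumably be placed in the option file instead of on the command line. A minimal sketch, assuming the test_empty.cnf shown above is reused and only the last line is new (the remaining settings from that file are omitted here); this was not verified against the hang itself:

```ini
[mysqld]
plugin-dir='/mysql-8.0/plugin_output_directory/'
plugin-load='group_replication.so'
# Assumption: equivalent to passing --loose-group_replication_start_on_boot=OFF
# on the command line. The loose- prefix avoids a startup error if the plugin
# is not loaded.
loose-group_replication_start_on_boot=OFF
```

With this in place, 'start slave' and 'stop slave' on channel 'ch1' would be expected to return normally, as they do with the command-line form of the workaround.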