Description:
Hello!
On a group replication cluster, if I enable event_scheduler on every node, then create a recurring event, then kill the primary node, when one of the remaining secondaries takes over, the event no longer runs there as it's not ENABLED but rather SLAVESIDE_DISABLED:
dbdeployer deploy replication --topology=group /opt/mysql/8.0.29 --port=28758 --single-primary
[COR-SER] [marcos.albe@tp-support03 group_sp_msb_8_0_29]$ ./check_nodes
# Node 1 # select * from performance_schema.replication_group_members
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 00044659-1111-1111-1111-111111111111 | 127.0.0.1 | 44659 | ONLINE | PRIMARY | 8.0.29 | XCom |
| group_replication_applier | 00044660-2222-2222-2222-222222222222 | 127.0.0.1 | 44660 | ONLINE | SECONDARY | 8.0.29 | XCom |
| group_replication_applier | 00044661-3333-3333-3333-333333333333 | 127.0.0.1 | 44661 | ONLINE | SECONDARY | 8.0.29 | XCom |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
[COR-SER] [marcos.albe@tp-support03 group_sp_msb_8_0_29]$ ./use_all -e "SET GLOBAL event_scheduler=ON; SELECT @@global.event_scheduler;"
# server: 1
@@global.event_scheduler
ON
# server: 2
@@global.event_scheduler
ON
# server: 3
@@global.event_scheduler
ON
~$ cat /tmp/event.sql
USE test;
DROP TABLE totals;
DROP EVENT e_totals;
CREATE TABLE test.totals (v TIMESTAMP(6) NOT NULL PRIMARY KEY, exec_host char(64) NOT NULL);
CREATE EVENT e_totals ON SCHEDULE every 5 second DO INSERT INTO test.totals VALUES (NOW(), @@server_uuid);
SET GLOBAL event_scheduler=ON;
~$ cat /tmp/check.sql
SELECT * FROM test.totals;
SELECT EVENT_NAME,STATUS FROM information_schema.events\G
SELECT MEMBER_ID, MEMBER_STATE, MEMBER_ROLE FROM performance_schema.replication_group_members\G
On the original primary:
[COR-SER] [marcos.albe@tp-support03 group_sp_msb_8_0_29]$ ./n1 < /tmp/check.sql
v exec_host
2022-09-07 21:37:49.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:37:54.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:37:59.000000 00044659-1111-1111-1111-111111111111
*************************** 1. row ***************************
EVENT_NAME: e_totals
STATUS: ENABLED
*************************** 1. row ***************************
MEMBER_ID: 00044659-1111-1111-1111-111111111111
MEMBER_STATE: ONLINE
MEMBER_ROLE: PRIMARY
*************************** 2. row ***************************
MEMBER_ID: 00044660-2222-2222-2222-222222222222
MEMBER_STATE: ONLINE
MEMBER_ROLE: SECONDARY
*************************** 3. row ***************************
MEMBER_ID: 00044661-3333-3333-3333-333333333333
MEMBER_STATE: ONLINE
MEMBER_ROLE: SECONDARY
Before failover, on secondary:
[COR-SER] [marcos.albe@tp-support03 group_sp_msb_8_0_29]$ ./n2 < /tmp/check.sql
v exec_host
2022-09-07 21:37:49.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:37:54.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:37:59.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:38:04.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:38:09.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:38:14.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:38:19.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:38:24.000000 00044659-1111-1111-1111-111111111111
*************************** 1. row ***************************
EVENT_NAME: e_totals
STATUS: SLAVESIDE_DISABLED
*************************** 1. row ***************************
MEMBER_ID: 00044659-1111-1111-1111-111111111111
MEMBER_STATE: ONLINE
MEMBER_ROLE: PRIMARY
*************************** 2. row ***************************
MEMBER_ID: 00044660-2222-2222-2222-222222222222
MEMBER_STATE: ONLINE
MEMBER_ROLE: SECONDARY
*************************** 3. row ***************************
MEMBER_ID: 00044661-3333-3333-3333-333333333333
MEMBER_STATE: ONLINE
MEMBER_ROLE: SECONDARY
Kill original primary
[COR-SER] [marcos.albe@tp-support03 group_sp_msb_8_0_29]$ cat node1/data/*pid
2653599
[COR-SER] [marcos.albe@tp-support03 group_sp_msb_8_0_29]$ kill -9 2653599
Check again on n2 (new primary):
[COR-SER] [marcos.albe@tp-support03 group_sp_msb_8_0_29]$ ./n2 < /tmp/check.sql
v exec_host
2022-09-07 21:38:54.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:38:59.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:39:04.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:39:09.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:39:14.000000 00044659-1111-1111-1111-111111111111
*************************** 1. row ***************************
EVENT_NAME: e_totals
STATUS: SLAVESIDE_DISABLED
*************************** 1. row ***************************
MEMBER_ID: 00044660-2222-2222-2222-222222222222
MEMBER_STATE: ONLINE
MEMBER_ROLE: PRIMARY
*************************** 2. row ***************************
MEMBER_ID: 00044661-3333-3333-3333-333333333333
MEMBER_STATE: ONLINE
MEMBER_ROLE: SECONDARY
Wait a bit and check again to make sure nothing changed:
[COR-SER] [marcos.albe@tp-support03 group_sp_msb_8_0_29]$ sleep 10;
[COR-SER] [marcos.albe@tp-support03 group_sp_msb_8_0_29]$ ./n2 < /tmp/check.sql
v exec_host
2022-09-07 21:38:54.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:38:59.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:39:04.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:39:09.000000 00044659-1111-1111-1111-111111111111
2022-09-07 21:39:14.000000 00044659-1111-1111-1111-111111111111
*************************** 1. row ***************************
EVENT_NAME: e_totals
STATUS: SLAVESIDE_DISABLED
*************************** 1. row ***************************
MEMBER_ID: 00044660-2222-2222-2222-222222222222
MEMBER_STATE: ONLINE
MEMBER_ROLE: PRIMARY
*************************** 2. row ***************************
MEMBER_ID: 00044661-3333-3333-3333-333333333333
MEMBER_STATE: ONLINE
MEMBER_ROLE: SECONDARY
You can see they are still disabled.
How to repeat:
See above for step by step
Suggested fix:
The natural thing would for the event's status to match with member's role, so that if member is the PRIMARY the event would be ENABLED and if member transitions to secondary the events would be DISABLED.
Perhaps having an "auto_status" boolean indicating whether the event's status should be managed by GR would add extra coolness points :D