Bug #98863 No refresh of metadata when primary changes and gr notifications are used
Submitted: 6 Mar 2020 16:22 Modified: 28 May 2020 21:39
Reporter: Florian Apolloner
Status: Closed
Category: MySQL Router Severity: S2 (Serious)
Version: 8.0.19 OS: CentOS (CentOS 8.1)
Assigned to: CPU Architecture: x86
Tags: group_replication, notifications, router

[6 Mar 2020 16:22] Florian Apolloner
Description:
I have configured an InnoDB Cluster in single-primary mode (also version 8.0.19). When I force a change of the primary (R/W node) through mysqlsh via "cluster.setPrimaryInstance('user@new_rw_node:3306')", mysqlrouter fails to pick up this change via group notifications.

mysqlrouter is configured with ttl=60 and use_gr_notifications=1. A commented log file is available at https://gist.github.com/apollo13/43cbc99aa443226e46fbb7b72d82125e

To verify the bug I have a simple program running that connects to the router every two seconds and requests an R/W connection.

In lines 4-15 you can see that the current master is centos2. Around line 17 I am switching the primary via mysqlsh.

Lines 19-29 show that the cluster notifies the router about MEMBER_ROLE_CHANGE (type=3). Imo this should trigger a metadata refresh but it does not because the view id did not change ( https://github.com/mysql/mysql-server/blob/ea7d2e2d16ac03afdd9cb72a972a95981107bf51/router... ).

The new primary is not queried until line 97, after the metadata refresh triggered by the TTL timeout.

I am not really sure whether this is a bug in mysqlrouter or in mysqld.

How to repeat:
 - Install an InnoDB Cluster.
 - Configure one mysqlrouter with --conf-use-gr-notifications and a log level of debug, then start it (or simply change an existing config to ttl=60 and use_gr_notifications=1 and restart the router).
 - Tail the mysqlrouter log file.
 - Execute a query every two seconds; in my case I used the following (I wanted to see how SUPER_READ_ONLY changes):

watch 'echo "SELECT @@global.SUPER_READ_ONLY;" | mysql -h 127.0.0.1 -u replication_admin -P 6446'

 - Force a primary change via dba.getCluster().setPrimaryInstance('user@new_rw_node:3306') in mysqlsh

 - Wait for the router to react…
[11 Mar 2020 10:53] MySQL Verification Team
Hello Florian,

Thank you for the report and steps.
Verified as described with 3 node cluster (8.0.19) and MySQL router version 8.0.19.

regards,
Umesh
[28 May 2020 21:39] Philip Olson
Posted by developer:
 
Fixed as of the upcoming MySQL Router 8.0.21 release, and here's the proposed changelog entry from the documentation team:

Router assumed that each new GR change notified by X Protocol notifications
has a new view id, but that is not always the case; for example, for
changes like switching the primary or change of the role. The view id is
no longer used for notification debouncing.
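
For readers following along, the behaviour described in the changelog entry can be illustrated with a minimal Python sketch. This is not the Router's actual code: the class names, the `should_refresh` method, and the view id strings are hypothetical, and the event list is made up for illustration. It only models the idea that debouncing on the view id drops a role-change notification that arrives under an unchanged view id.

```python
from dataclasses import dataclass


@dataclass
class Notification:
    kind: str     # notification type, e.g. "MEMBER_ROLE_CHANGE"
    view_id: str  # group view id carried with the notification (hypothetical values)


class ViewIdDebouncer:
    """Hypothetical model of the old behaviour: refresh only on a new view id."""

    def __init__(self):
        self.last_view_id = None

    def should_refresh(self, notification):
        if notification.view_id == self.last_view_id:
            # Same view id as last time: treated as a duplicate and ignored.
            return False
        self.last_view_id = notification.view_id
        return True


class NoViewIdDebouncer:
    """Hypothetical model of the fixed behaviour: the view id is not consulted."""

    def should_refresh(self, notification):
        return True


# Illustrative sequence: a membership change, then a primary switch that
# arrives with the same view id (as observed in this report).
events = [
    Notification("MEMBERSHIP_VIEW_CHANGE", "view-1"),
    Notification("MEMBER_ROLE_CHANGE", "view-1"),
]

old = ViewIdDebouncer()
new = NoViewIdDebouncer()
old_decisions = [old.should_refresh(e) for e in events]
new_decisions = [new.should_refresh(e) for e in events]

print("view-id debouncing:", old_decisions)
print("without view-id debouncing:", new_decisions)
```

In this model the role change is skipped by the view-id-based check, so the new primary would only be discovered at the next TTL-driven refresh, which matches the delay seen in the attached log.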

Thank you for the bug report.