| Bug #108426 | Adding instance to a new replica cluster under load results in errors | ||
|---|---|---|---|
| Submitted: | 8 Sep 2022 12:36 | Modified: | 6 Dec 2022 11:06 |
| Reporter: | Jay Janssen | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | Shell AdminAPI InnoDB Cluster / ReplicaSet | Severity: | S3 (Non-critical) |
| Version: | 8.0.30 | OS: | Any |
| Assigned to: | | CPU Architecture: | Any |
[8 Sep 2022 12:37]
Jay Janssen
Output log of the script setting up the new replica cluster
Attachment: replica-cluster-setup.log (application/octet-stream, text), 6.39 KiB.
[9 Sep 2022 10:30]
MySQL Verification Team
Hi, I think this is not a bug. You have to wait for the first node you added to finish being created before you add a new one (and you could add it once the first one finished), but I will check with the GR dev team whether there is something else we might report instead of that error. Thanks for your interest in MySQL.
[9 Sep 2022 11:15]
Jay Janssen
> I think this is not a bug. You had to wait for the first node you added to finish creating before you added a new one (and you could after first one finished)

The create_replica_cluster command completed successfully before I called add_instance, so I feel like I did wait. I am also setting the 'timeout' option on the create_replica_cluster command, which, according to your documentation, is: "timeout: maximum number of seconds to wait for the instance to sync up with the PRIMARY Cluster. Default is 0 and it means no timeout." Given that I am setting that to a high value, isn't it reasonable to expect that once the create_replica_cluster command returns, the new replica cluster is synced up with the primary?
[9 Sep 2022 11:28]
MySQL Verification Team
Hi, I discussed this with GR dev team, things should not behave like this. There might be (or was, might be already fixed) a bug in AdminAPI. I'll try to reproduce this and move forward with it. Thanks
[9 Sep 2022 15:27]
Alfredo Kojima
The log you uploaded has some truncated output for cluster.status() at the end, would you happen to still have the full output in your terminal history?
[9 Sep 2022 18:34]
Alfredo Kojima
I was able to reproduce by using a fresh cluster handle from dba.getCluster() called after the replica cluster is created and before adding the secondaries.
[6 Dec 2022 11:06]
Edward Gilmore
Posted by developer: Added the following note to the MySQL Shell 8.0.32 release notes: Attempting to add an instance to a newly-created ReplicaCluster, if the Primary Cluster was under high load, failed with several errors related to super_read_only. This issue was caused by an out-of-date topology view, leading to the newly created ReplicaCluster being considered a standalone cluster. As of this release, createReplicaCluster() synchronizes the metadata update transactions, thereby ensuring it has the correct topology view.

Description:
I have a clusterset where the primary cluster is under reasonably heavy write load, and I am trying to add a new replica cluster to the clusterset. I have 3 nodes for the new cluster. I am running 'create_replica_cluster' on the first node, which works fine:

```
create_opts = {
    "recoveryMethod": "clone",
    "interactive": False,
    "timeout": 172800,  # wait for the new instance to catch up
}
seed_clusterset.create_replica_cluster(args.standalone, args.name, create_opts)
```

This proceeds normally:

```
Creating InnoDB Cluster 'jaytest-staging-002-usw2' on '10.170.254.87:3306'...

Adding Seed Instance...
Cluster successfully created. Use Cluster.add_instance() to add MySQL instances.
At least 3 instances are needed for the cluster to be able to withstand up to
one server failure.

* Configuring ClusterSet managed replication channel...
** Changing replication source of 10.170.254.87:3306 to 10.162.254.200:3306

* Waiting for instance '10.170.254.87:3306' to synchronize with PRIMARY Cluster...
** Transactions replicated ############################################################ 100%

* Updating topology

Replica Cluster 'jaytest-staging-002-usw2' successfully created on ClusterSet 'jaytest-staging-002'.
```

I then immediately try to add another instance to this new cluster using the normal cluster.add_instance, but I get this:

```
Adding instance to the cluster...

ERROR: Unable to enable clone on the instance '10.170.254.87:3306': MySQL Error 1290 (HY000): 10.170.254.87:3306: The MySQL server is running with the --super-read-only option so it cannot execute this statement
ERROR: Unable to create the Group Replication recovery account: 10.170.254.87:3306: The MySQL server is running with the --super-read-only option so it cannot execute this statement
```

When I try to add the node again later, it seems to work fine. I suspect the issue might have been that the new cluster was behind in replication from the primary cluster. As I stated before, I have write load on the primary cluster and I haven't seen this issue without the load. As you can see, I added the 'timeout' option to the 'create_replica_cluster' call with the intent that it would not return until the new cluster was caught up in replication, but perhaps that doesn't work like I expect.

How to repeat:
1) Set up cluster 1 under sysbench load
2) Set up a clusterset
3) Set up a new replica cluster with a seed instance
4) Add an additional node as soon as the replica cluster is up
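
The report notes that retrying add_instance a little later succeeded. For Shell versions before the 8.0.32 fix, that observation could be scripted as a retry loop. The following is only a minimal sketch, not an official workaround: `is_super_read_only_error` and `retry_while_read_only` are hypothetical helper names, and it assumes the failed call surfaces as a Python exception whose message contains the super-read-only text shown in the errors above.

```python
import time


def is_super_read_only_error(exc):
    """Return True if the exception text mentions super-read-only.

    Assumption: the failure reaches the script as an exception whose
    message includes the server error text quoted in the report.
    """
    text = str(exc).lower()
    return "super-read-only" in text or "super_read_only" in text


def retry_while_read_only(action, attempts=5, delay=30.0, sleep=time.sleep):
    """Call action() until it succeeds or the attempts run out.

    Only failures that look like super_read_only errors are retried;
    any other exception, and the final failed attempt, is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            if not is_super_read_only_error(exc) or attempt == attempts:
                raise
            sleep(delay)  # give the replica cluster time to catch up
```

In a MySQL Shell Python session this might be used as `retry_while_read_only(lambda: cluster.add_instance(target, opts))`, where `cluster`, `target`, and `opts` stand for whatever handle, address, and options the setup script already has. The `sleep` parameter is injectable only so the helper can be exercised without real waiting.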