MySQL Bugs: #109834: Metadata is wrong after failure to invoke set_primary

Bug #109834	Metadata is wrong after failure to invoke set_primary_instance
Submitted:	30 Jan 2023 6:54	Modified:	28 Feb 2023 17:13
Reporter:	Ken Lau (OCA)	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	Shell AdminAPI InnoDB Cluster / ReplicaSet	Severity:	S3 (Non-critical)
Version:		OS:	Linux
Assigned to:	MySQL Verification Team	CPU Architecture:	Any

Description:
We invoke 'dba.get_replica_set().set_primary_instance()' to switch primary instance and get this error log: 

---
*.*.*.199:13050 will be promoted to PRIMARY of 'set_2041992056301'.
The current PRIMARY is *.*.*.204:1****.

* Connecting to replicaset instances
** Connecting to *.*.*.204:1****
** Connecting to *.*.*.199:1****
** Connecting to *.*.*.205:1****
** Connecting to *.*.*.204:1****
** Connecting to *.*.*.199:1****
** Connecting to *.*.*.205:1****

* Performing validation checks
** Checking async replication topology...
** Checking transaction state of the instance...

* Synchronizing transaction backlog at *.*.*.199:6301

* Updating metadata

* Acquiring locks in replicaset instances
** Pre-synchronizing SECONDARIES
** Acquiring global lock at PRIMARY
** Acquiring global lock at SECONDARIES
[31mERROR: [0m*.*.*.199:6301: GTID sync failed: MYSQLSH 51157: Timeout waiting for replica to synchronize
[31mERROR: [0mAn error occurred while preparing replicaset instances for a PRIMARY switch: Timeout waiting for replica to synchronize
[36mNOTE: [0mReverting metadata changes
[31mERROR: [0m*.*.*.204:6301: Error updating cluster metadata: MySQL Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
[33mWARNING: [0mFailed to revert metadata changes on the PRIMARY: Metadata cannot be updated: MySQL Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
Traceback (most recent call last):
  File "<string>", line 1, in <module>
SystemError: MYSQLSH (51157): ReplicaSet.set_primary_instance: Timeout waiting for replica to synchronize

 err:exit status 1
---

We then invoke 'dba.get_replica_set().status({"extended":2})' and get information below: 

---
{
	"metadataVersion": "2.0.0",
	"replicaSet": {
		"name": "set_2041992056301",
		"primary": "*.*.*.199:****",
		"status": "UNAVAILABLE",
		"statusText": "PRIMARY instance is not available, but there is at least one SECONDARY that could be force-promoted.",
		"topology": {
			"*.*.*.199:****": {
				"address": "*.*.*.199:6301",
				"fenceSysVars": [],
				"fenced": false,
				"instanceErrors": ["ERROR: Instance is a PRIMARY but is replicating from another instance."],
				"instanceRole": "PRIMARY",
				"mode": "R/W",
				"replication": {
					"applierQueuedTransactionSet": "832a2224-72ee-11ed-8368-fa163ee19b6f:143091-143954",
					"applierQueuedTransactionSetSize": 864,
					"applierState": "ON",
					"applierStatus": "APPLYING",
					"applierThreadState": "",
					"applierWorkerThreads": 16,
					"coordinatorState": "ON",
					"options": {
						"connectRetry": 60,
						"delay": 0,
						"heartbeatPeriod": 30,
						"retryCount": 86400
					},
					"receiverStatus": "ON",
					"receiverThreadState": "",
					"receiverTimeSinceLastMessage": "00:00:00.201497",
					"replicationLag": "00:00:53.105207",
					"source": "*.*.*.204:****"
				},
				"serverUuid": "8dcef8b2-72ee-11ed-baaa-fa163e4f622d",
				"status": "ERROR"
			},
			"*.*.*.204:****": {
				"address": "*.*.*.204:6301",
				"fenceSysVars": ["read_only", "super_read_only"],
				"fenced": true,
				"instanceErrors": ["ERROR: 1 errant transaction(s) detected. Topology changes will not be possible until the instance is removed from the replicaset to have the inconsistency repaired.", "ERROR: Replication source channel is not configured, should be *.*.*.199:****."],
				"instanceRole": "SECONDARY",
				"mode": "R/O",
				"replication": {
					"applierStatus": null,
					"expectedSource": "*.*.*.199:****",
					"receiverStatus": null
				},
				"serverUuid": "832a2224-72ee-11ed-8368-fa163ee19b6f",
				"status": "ERROR",
				"transactionSetConsistencyStatus": "INCONSISTENT",
				"transactionSetConsistencyStatusText": "There are 864 transactions that were executed in this instance that did not originate from the PRIMARY.",
				"transactionSetErrantGtidSet": "832a2224-72ee-11ed-8368-fa163ee19b6f:143091-143954"
			},
			"*.*.*.205:****": {
				"address": "*.*.*.205:6301",
				"fenceSysVars": ["read_only", "super_read_only"],
				"fenced": true,
				"instanceErrors": ["ERROR: 1 errant transaction(s) detected. Topology changes will not be possible until the instance is removed from the replicaset to have the inconsistency repaired.", "ERROR: Replication source misconfigured. Expected *.*.*.199:**** but is *.*.*.204:6301."],
				"instanceRole": "SECONDARY",
				"mode": "R/O",
				"replication": {
					"applierQueuedTransactionSet": "",
					"applierQueuedTransactionSetSize": 0,
					"applierState": "ON",
					"applierStatus": "APPLIED_ALL",
					"applierThreadState": "",
					"applierWorkerThreads": 16,
					"coordinatorState": "ON",
					"expectedSource": "*.*.*.199:****",
					"options": {
						"connectRetry": 60,
						"delay": 0,
						"heartbeatPeriod": 30,
						"retryCount": 86400
					},
					"receiverStatus": "ON",
					"receiverThreadState": "",
					"receiverTimeSinceLastMessage": "00:00:00.211088",
					"replicationLag": null,
					"source": "*.*.*.204:****"
				},
				"serverUuid": "82496ed7-72ee-11ed-813c-fa163e7ab701",
				"status": "ERROR",
				"transactionSetConsistencyStatus": "INCONSISTENT",
				"transactionSetConsistencyStatusText": "There are 864 transactions that were executed in this instance that did not originate from the PRIMARY.",
				"transactionSetErrantGtidSet": "832a2224-72ee-11ed-8368-fa163ee19b6f:143091-143954"
			}
		},
		"type": "ASYNC"
	}
}
---

It looks the metadata incorrectly shows *.*.*.199 is the primary while *.*.*.204 is the actual one.

The message：
---
[36mNOTE: [0mReverting metadata changes
[31mERROR: [0m*.*.*.204:6301: Error updating cluster metadata: MySQL Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
---
indicates that the switch procedure was failed to revert metadata changes. Moreover it failed after "** Acquiring global lock at SECONDARIES" was printed.

We then view the code and find out the reason is that it hasn't reverted super_read_only to 0 before reverting metadata.

---
void Global_locks::sync_and_lock_all(
    const std::list<std::shared_ptr<Instance>> &instances,
    const std::string &master_uuid, uint32_t gtid_sync_timeout, bool dry_run) {
  ...
  // Set SRO so that any attempts to do new writes at the PRIMARY error out
  // instead of blocking.
  if (!dry_run) m_master->execute("SET global super_read_only=1");

  if (!dry_run) m_master->execute("FLUSH BINARY LOGS");

  // 3 - synchronize and lock all slaves (in parallel)
  // Sync slaves to master once again and acquire lock
  console->print_info("** Acquiring global lock at SECONDARIES");

  // BUG#30344595: switching primary fails under client load
  // After locking the master and setting it to read-only, we must get the
  // updated (final) GTID_EXECUTED set, since new transactions might have been
  // committed during the pre-sync step, otherwise some trx might fail to be
  // applied on the secondaries (lost).
  master_gtid_set = mysqlshdk::mysql::get_executed_gtid_set(*m_master);

  for (const auto &inst : instances) {
    if (m_master->get_uuid() != inst->get_uuid()) {
      try {
        wait_pending_master_transactions(master_gtid_set, inst.get(),
                                         gtid_sync_timeout);
      } catch (const shcore::Exception &e) {
        console->print_error(shcore::str_format("%s: GTID sync failed: %s",
                                                inst->descr().c_str(),
                                                e.format().c_str()));
        throw;
      }
      ...
    }
  }
  ...
}  
---

How to repeat:
It would happens when payload is high.

Suggested fix:
Revert super_read_only to 0 before reverting metadata.

Hi,

Can you share the full config and mysql config files as I'm having issues reproducing even with high load. Is this the "original" setup, or it was upgraded from something else?

Hi, 
   We use the original setup. In fact, the issue happens with very low probability. We could not reproduce it yet. But the log shows the read_only flag actually prevent the metadata reverting correctly.

Hi,

I cannot reproduce this at all. No matter how high the load is or what I do to the system. 

I will set the bug to "Can't repeat" status but please, if you manage to reproduce this again, update the bug with new findings.

Thanks