Bug #120460 Online Cleanup of Obsolete GTID UUIDs
Submitted: 12 May 19:55
Reporter: Nuno Carvalho
Status: Open
Category: MySQL Server: Replication
Severity: S4 (Feature request)
Version: 10.x
OS: Any
Assigned to:
CPU Architecture: Any

[12 May 19:55] Nuno Carvalho
Description:
Long-running MySQL deployments accumulate Global Transaction Identifier (GTID) state from every server identity that has ever participated in the topology. When servers are frequently replaced, decommissioned, restored, cloned, or moved across environments, their UUIDs can remain permanently represented in `gtid_executed`, `gtid_purged`, and received GTID state even after the corresponding servers and binary logs are no longer operationally relevant. Over time these sets can become large, noisy, and difficult to reason about. This creates a persistent maintenance burden: administrators must keep carrying obsolete GTID history through failovers, backups, restores, topology checks, replica provisioning, and external automation, even though the old identities no longer represent active replication sources.
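
To make the accumulation concrete, here is a minimal Python sketch that parses the textual form of a GTID set (`uuid:interval[:interval...]`, comma-separated) and lists the UUIDs that do not belong to any currently known server. The helper names `parse_gtid_set` and `obsolete_uuids` and the sample UUIDs are illustrative assumptions, not part of MySQL, and the parser assumes untagged GTIDs only.

```python
import re

# Textual form of a UUID: 8-4-4-4-12 hexadecimal digits.
UUID_RE = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$")


def parse_gtid_set(text):
    """Map each source UUID to a list of (first, last) inclusive intervals.

    Illustrative helper; assumes untagged GTIDs such as 'uuid:1-5:7'.
    """
    result = {}
    # Long sets are often printed with embedded newlines; drop all whitespace.
    compact = "".join(text.split())
    for entry in compact.split(","):
        if not entry:
            continue
        uuid, *intervals = entry.split(":")
        uuid = uuid.lower()
        if not UUID_RE.match(uuid):
            raise ValueError(f"not a UUID: {uuid!r}")
        spans = result.setdefault(uuid, [])
        for interval in intervals:
            first, _, last = interval.partition("-")
            # A single transaction number 'n' is the interval (n, n).
            spans.append((int(first), int(last or first)))
    return result


def obsolete_uuids(gtid_executed, active_uuids):
    """Return the UUIDs present in the set but absent from active_uuids."""
    active = {u.lower() for u in active_uuids}
    return sorted(u for u in parse_gtid_set(gtid_executed) if u not in active)


# Hypothetical value of gtid_executed on a server whose first source was retired.
executed = """3e11fa47-71ca-11e1-9e33-c80aa9429562:1-5:11-18,
8a94f357-aab4-11df-86ab-c80aa9429444:1-100"""
print(obsolete_uuids(executed, ["8a94f357-aab4-11df-86ab-c80aa9429444"]))
```

In a fleet with frequent server rotation, the list returned by `obsolete_uuids` only grows, which is the burden described above.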

This feature should provide a safe, persistent, online mechanism for removing obsolete UUIDs from GTID state without breaking replication semantics or tooling that depends on GTID comparisons. The operation should be explicit and protected by privilege checks, should reject removal when the UUID is still in use, and should provide a way to verify across the topology that the operation can succeed before it is executed. Because GTID sets are used by replication, backup, restore, failover, session-consistency, and errant-transaction checks, deletion cannot be a single instantaneous disappearance. The design should preserve correctness through a staged removal process that first records the UUID as being removed, allows that state to replicate and be consumed by dependent components, and only later completes the removal when it is safe.
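
The topology-wide verification could look roughly like the following Python sketch. It is only an illustration of the idea: the `ServerView` structure, the `removal_blockers` function, and the definition of "in use" (the UUID is a server's own identity, or some server still replicates from it) are assumptions made for the example, not a proposed server interface.

```python
from dataclasses import dataclass, field


@dataclass
class ServerView:
    """Hypothetical snapshot of one server in the topology."""
    name: str
    server_uuid: str
    # UUIDs of the sources this server currently replicates from.
    source_uuids: set = field(default_factory=set)


def removal_blockers(target_uuid, topology):
    """Return the reasons why target_uuid must not be removed yet.

    An empty list means the pre-check passed on every server.
    """
    target = target_uuid.lower()
    blockers = []
    for server in topology:
        if server.server_uuid.lower() == target:
            blockers.append(f"{server.name}: UUID is this server's own identity")
        if target in {u.lower() for u in server.source_uuids}:
            blockers.append(f"{server.name}: still replicating from this UUID")
    return blockers


# Hypothetical two-node topology.
PRIMARY = "3e11fa47-71ca-11e1-9e33-c80aa9429562"
REPLICA = "8a94f357-aab4-11df-86ab-c80aa9429444"
RETIRED = "0c5b1a2e-aab4-11df-86ab-c80aa9429555"
topology = [
    ServerView("primary", PRIMARY),
    ServerView("replica", REPLICA, {PRIMARY}),
]
for candidate in (RETIRED, PRIMARY):
    print(candidate, removal_blockers(candidate, topology))
```

Returning the full list of blockers, rather than stopping at the first one, lets an administrator see every server that would reject the removal before anything is executed.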

Without this feature, operators must either tolerate unbounded growth in GTID metadata or perform disruptive manual procedures to rebuild or reset server state. Both choices are costly. Large `gtid_executed` sets increase cognitive load during incident response, make diagnostics harder, complicate automation that compares GTID sets, and can expose edge cases in tools that were not designed for indefinitely growing historical identity lists. The burden is especially high in environments with frequent server rotation or large fleets, where obsolete UUIDs are produced continuously and where operational correctness depends on predictable GTID behavior across many nodes.

The intended result is a controlled lifecycle for retired UUIDs. Administrators should be able to mark a UUID for deletion, observe that deletion state, wait until it is safely known throughout the topology and related operational systems, and then finish the removal. During the entire process, servers should remain online, replication should continue, backups and point-in-time recovery should remain valid, and GTID-based tooling should either continue to work or have well-defined semantics for UUIDs that are being removed. This shifts GTID maintenance from ad hoc operational cleanup to a supported, auditable server feature.
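
The lifecycle can be pictured as a small state machine. The Python sketch below is a model of the intended behavior only: the state names, the `RetiredUuid` class, and the notion of members "acknowledging" the removal are assumptions chosen for illustration, not the proposed implementation.

```python
from enum import Enum


class UuidState(Enum):
    ACTIVE = "active"        # UUID is part of the normal GTID state
    REMOVING = "removing"    # marked for deletion, still visible everywhere
    REMOVED = "removed"      # removal completed


class RetiredUuid:
    """Hypothetical tracker for one UUID going through staged removal."""

    def __init__(self, uuid, members):
        self.uuid = uuid
        self.state = UuidState.ACTIVE
        # Servers and tools that have not yet observed the REMOVING state.
        self.pending = set(members)

    def mark_for_removal(self):
        if self.state is not UuidState.ACTIVE:
            raise RuntimeError(f"cannot mark a UUID in state {self.state.value}")
        self.state = UuidState.REMOVING

    def acknowledge(self, member):
        """Record that a member has seen the REMOVING state."""
        if self.state is not UuidState.REMOVING:
            raise RuntimeError("nothing to acknowledge")
        self.pending.discard(member)

    def finish(self):
        """Complete the removal, but only once every member has acknowledged."""
        if self.state is not UuidState.REMOVING:
            raise RuntimeError("UUID was not marked for removal")
        if self.pending:
            raise RuntimeError(f"still waiting for: {sorted(self.pending)}")
        self.state = UuidState.REMOVED
```

The point of the intermediate `REMOVING` state is that `finish` refuses to run while any member is still pending, so the UUID never disappears from one server before the others and their tooling have had a chance to observe the change.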

How to repeat:
Please see description.
[27 May 15:43] Nuno Carvalho
Presentation given at the Oracle MySQL Contributor Summit

Attachment: Controlled_Deletion_of_Obsolete_UUIDs_from_GTID_State.pdf (application/pdf, text), 231.18 KiB.