Bug #116247 In the multi-primary architecture, a rule violation for SMR occurs
Submitted: 27 Sep 2024 6:31 Modified: 27 Sep 2024 8:32
Reporter: Bin Wang (OCA)
Status: Verified
Category:MySQL Server: Group Replication Severity:S1 (Critical)
Version:all versions OS:Any
Assigned to: CPU Architecture:Any
Tags: State Machine Replication

[27 Sep 2024 6:31] Bin Wang
Description:
In the Group Replication multi-primary architecture, a state machine replication (SMR) rule is violated: CT_CERTIFICATION_MESSAGE messages are not placed into the applier queue, so they are not processed in a uniform order relative to the queued transactions on every node. Below is the function handle_certifier_message, which calls handle_certifier_data directly and therefore processes CT_CERTIFICATION_MESSAGE messages prematurely.

void Plugin_gcs_events_handler::handle_certifier_message(
    const Gcs_message &message) const {
  if (this->applier_module == nullptr) {
    LogPluginErr(ERROR_LEVEL,
                 ER_GRP_RPL_MISSING_GRP_RPL_APPLIER); /* purecov: inspected */
    return;                                           /* purecov: inspected */
  }
  Certifier_interface *certifier =
      this->applier_module->get_certification_handler()->get_certifier();
  const unsigned char *payload_data = nullptr;
  size_t payload_size = 0;
  Plugin_gcs_message::get_first_payload_item_raw_data(
      message.get_message_data().get_payload(), &payload_data, &payload_size);
  if (certifier->handle_certifier_data(payload_data,
                                       static_cast<ulong>(payload_size),
                                       message.get_origin())) {
    LogPluginErr(
        ERROR_LEVEL,
        ER_GRP_RPL_CERTIFIER_MSSG_PROCESS_ERROR); /* purecov: inspected */
  }
}

Handling CT_CERTIFICATION_MESSAGE messages prematurely can cause the certification database, which each node's optimistic concurrency control (OCC) relies on, to differ between nodes, and this can eventually result in data inconsistencies. The problem may not be easy to detect, but it is relatively straightforward to reproduce under specific conditions.

How to repeat:
To reproduce: in a Group Replication multi-primary deployment, distribute the write load evenly across all MySQL nodes using a load balancer (such as LVS). Given sufficient write conflicts, it is possible to reproduce inconsistencies in the final state of state machine replication.

Suggested fix:
Based on extensive testing, placing certification messages into the applier queue for unified processing can eliminate the aforementioned data inconsistency problem.
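
To illustrate why the position of the certification message in the processing order matters, here is a minimal toy sketch. It is not the MySQL code: Event, Node, run, the version-based certification rule, and the purge rule are all hypothetical simplifications introduced only for this example. It models one node whose applier keeps up with delivery and one whose applier lags, both receiving the same totally ordered stream.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical event type for illustration; not a Group Replication class.
struct Event {
  enum Kind { TRANSACTION, CERT_MESSAGE } kind;
  std::string key;  // row written (TRANSACTION only)
  int snapshot;     // version the transaction read (TRANSACTION only)
  int stable;       // purge entries with version <= stable (CERT_MESSAGE only)
};

// Toy node: a certification database plus an applier queue.
class Node {
 public:
  Node(bool applier_keeps_up, bool queue_cert_messages)
      : keeps_up_(applier_keeps_up), queue_cert_(queue_cert_messages) {}

  // Called once per event, in the same total order on every node.
  void deliver(const Event &e) {
    if (e.kind == Event::CERT_MESSAGE && !queue_cert_) {
      apply(e);  // bypasses the queue: takes effect ahead of queued work
      return;
    }
    queue_.push_back(e);
    if (keeps_up_) drain();
  }

  // Process everything still waiting in the applier queue, in order.
  void drain() {
    while (!queue_.empty()) {
      apply(queue_.front());
      queue_.pop_front();
    }
  }

  // One entry per transaction: true = committed, false = aborted.
  const std::vector<bool> &outcomes() const { return outcomes_; }

 private:
  void apply(const Event &e) {
    if (e.kind == Event::CERT_MESSAGE) {
      // Toy garbage collection of the certification database.
      for (auto it = certdb_.begin(); it != certdb_.end();) {
        if (it->second <= e.stable) it = certdb_.erase(it);
        else ++it;
      }
      return;
    }
    // Toy certification: abort if the row changed after the snapshot.
    auto it = certdb_.find(e.key);
    bool commit = (it == certdb_.end()) || (it->second <= e.snapshot);
    if (commit) certdb_[e.key] = ++seq_;
    outcomes_.push_back(commit);
  }

  bool keeps_up_;
  bool queue_cert_;
  int seq_ = 0;
  std::map<std::string, int> certdb_;
  std::deque<Event> queue_;
  std::vector<bool> outcomes_;
};

// Feed the same stream (T1, certification message, T2) to one toy node.
std::vector<bool> run(bool applier_keeps_up, bool queue_cert_messages) {
  Node node(applier_keeps_up, queue_cert_messages);
  const std::vector<Event> stream = {
      {Event::TRANSACTION, "x", 0, 0},
      {Event::CERT_MESSAGE, "", 0, 1},
      {Event::TRANSACTION, "x", 0, 0},
  };
  for (const Event &e : stream) node.deliver(e);
  node.drain();  // the lagging applier eventually catches up
  return node.outcomes();
}
```

In this toy model, run(true, false) and run(false, false) return different commit/abort decisions for the second transaction, because the lagging node applies the certification message before the first transaction. With run(true, true) and run(false, true), where the message goes through the queue, both nodes reach the same decisions regardless of applier lag.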
[27 Sep 2024 6:34] Bin Wang
The key to state-machine replication is that all copies start from the same initial state, transition through the same states, and produce the same outputs. Any deviation from this rule is non-compliant and difficult to detect, often only revealing problems in corner cases. All nodes must execute in the same sequence, with identical transactions and underlying data.