Bug #89195 | Total ordering of transactions is not respected in Group Replication. | ||
---|---|---|---|
Submitted: | 11 Jan 2018 19:15 | Modified: | 25 Jul 2019 14:45 |
Reporter: | Jean-François Gagné | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Group Replication | Severity: | S2 (Serious) |
Version: | 5.7.20, 8.0.3 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
[11 Jan 2018 19:15]
Jean-François Gagné
[15 Jan 2018 12:40]
MySQL Verification Team
Hello Jean, Thank you for the report and detailed steps. Thanks, Umesh
[15 Jan 2018 12:40]
MySQL Verification Team
Taken from Bug #89194
Attachment: 89194_5.7.20.results (application/octet-stream, text), 23.38 KiB.
[15 Jan 2018 18:12]
Nuno Carvalho
Posted by developer:

Hi Jean-François,

Thank you for your detailed analysis of Group Replication.

Group Replication ensures that all servers receive and certify the same set of transactions in the same order. From that point on, in multi-primary mode, the application of transactions may deviate from the certification order if and only if doing so does not break consistency[1]. A local transaction commit may be released as soon as the transaction is certified, whereas remote transactions still need to be applied. This may lead to transactions being *externalized* in a slightly different order.

In single-primary mode, there is a small chance that concurrent and non-contending local transactions are committed and externalized on the primary in a different order than that set by PAXOS. This is not problematic, since such execution histories are still consistent and valid.

Secondaries will commit in the same order, given that they observe the total order defined by PAXOS because they run with slave_preserve_commit_order set. Although the reordering on the primary does not break consistency, it may lead to a slightly different, but valid, externalization order for a set of concurrent transactions committing together on the primary and eventually applied to the secondaries.

We will update the documentation with these low-level details. Thanks for your interest in this subject.

[1] Unless there is a bug, and you did find one: BUG#89194: "Wrong certification lead to data inconsistency and GR breakage", which is a duplicate of BUG#86078: "Bad Write Set tracking with UNIQUE KEY on a DELETE followed by an INSERT". In your example, certification fails to detect a conflict, and that breaks consistency.

Best regards,
Nuno Carvalho
[25 Jul 2019 14:45]
Margaret Fisher
Posted by developer:

Thanks for raising this! Sorry it didn't get handled earlier. I've added the following explanation to https://dev.mysql.com/doc/refman/5.7/en/group-replication-summary.html instead of the sentence you quoted about applying the transactions in the same order:

For applying and externalizing the certified transactions, Group Replication permits servers to deviate from the agreed order of the transactions if this does not break consistency and validity. Group Replication is an eventual consistency system, meaning that as soon as the incoming traffic slows down or stops, all group members have the same data content. While traffic is flowing, transactions can be externalized in a slightly different order, or externalized on some members before the others.

For example, in multi-primary mode, a local transaction might be externalized immediately following certification, although a remote transaction that is earlier in the global order has not yet been applied. This is permitted when the certification process has established that there is no conflict between the transactions.

In single-primary mode, on the primary server, there is a small chance that concurrent, non-conflicting local transactions might be committed and externalized in a different order from the global order agreed by Group Replication. On the secondaries, which do not accept writes from clients, transactions are always committed and externalized in the agreed order.
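
Editorial illustration (not part of the original thread): the rule described above, that a member may externalize transactions out of the agreed order only when the reordered transactions do not conflict, can be sketched as a toy model. This is not Group Replication's certification code. The function names, the transaction labels `T1` to `T3`, and the simplification that a conflict means "the write sets share a key" are all assumptions made for the example.

```python
from itertools import combinations


def conflicts(write_set_a, write_set_b):
    """Toy conflict rule (an assumption): two transactions conflict if they write a common key."""
    return bool(set(write_set_a) & set(write_set_b))


def is_valid_externalization(agreed_order, observed_order, write_sets):
    """Return True if observed_order only reorders non-conflicting transactions.

    agreed_order:   transaction ids in the globally agreed order
    observed_order: the order in which one member externalized them
    write_sets:     mapping of transaction id -> set of keys written
    """
    if sorted(agreed_order) != sorted(observed_order):
        raise ValueError("both orders must contain the same transactions")
    position = {txn: i for i, txn in enumerate(observed_order)}
    # combinations() yields each pair with the earlier transaction (in agreed order) first.
    for earlier, later in combinations(agreed_order, 2):
        swapped = position[earlier] > position[later]
        if swapped and conflicts(write_sets[earlier], write_sets[later]):
            return False
    return True


def apply_all(order, writes):
    """Apply each transaction's writes to an empty key-value state, in the given order."""
    state = {}
    for txn in order:
        state.update(writes[txn])
    return state


# Hypothetical workload: T1 and T3 write the same key, T2 is independent.
writes = {"T1": {"a": 1}, "T2": {"b": 2}, "T3": {"a": 3}}
write_sets = {txn: set(w) for txn, w in writes.items()}
agreed = ["T1", "T2", "T3"]

harmless = ["T2", "T1", "T3"]   # only swaps the non-conflicting T1 and T2
harmful = ["T3", "T2", "T1"]    # swaps the conflicting T1 and T3

print(is_valid_externalization(agreed, harmless, write_sets))
print(is_valid_externalization(agreed, harmful, write_sets))
print(apply_all(harmless, writes) == apply_all(agreed, writes))
```

In this model the `harmless` order converges to the same final state as the agreed order, which mirrors the eventual-consistency claim in the documentation text, while the `harmful` order leaves key `a` with a different value. The inconsistency reported in BUG#89194 corresponds to the case where certification fails to detect such a conflict in the first place.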