Description:
[Summary]
xid_t::operator< violates strict weak ordering when XA format_id differs, so std::map<XID,…> used during binlog crash recovery can fail find(xid) for an inserted key; a prepared external XA that is already in the binlog may be rolled back incorrectly.
[Description]
Externally coordinated XA state recovered from the binary log is kept in std::map keyed by XID with std::less<XID> (sql/handler.h, Xa_state_list::list; populated in sql/binlog/recovery.cc, Binlog_recovery::m_external_xids). That relies on xid_t::operator< in sql/xa.cc.
The implementation returns true when this->get_format_id() < rhs.get_format_id() but does not return false when this->get_format_id() > rhs.get_format_id(). It then compares gtrid_length / bqual_length / payload. That breaks strict weak ordering (e.g. transitivity), so the map ordering is invalid and find() may miss an element that was inserted.
During crash recovery, xa::recovery::recover_one_external_trx() (sql/xa/recovery.cc) calls xa_list->find(xa_trx.id). If the lookup incorrectly returns NOT_FOUND while the binlog recorded XA PREPARE, the engine rolls the transaction back, contradicting the binlog.
How to repeat:
Requirements: Debug build (--with-debug / CMAKE_BUILD_TYPE=Debug), binary logging enabled, row binlog format for this suite combination. No custom code injection: uses existing CONDITIONAL_SYNC_POINT_FOR_TIMESTAMP("before_set_prepared_in_tc") in sql/tc_log.cc and standard MTR kill_mysqld.inc.
Add the following MTR files under mysql-test/suite/binlog/t/ and mysql-test/suite/binlog/r/ (or attach them to the bug).
Run:
cd mysql-test
./mysql-test-run.pl --suite=binlog --do-test=binlog_xa_external_xids_map_ordering
With the buggy comparator: the row variant fails on the first assertion (no matching error-log note for Successfully prepared 1 XA transaction; InnoDB recovery logs rolling back 1 XA). After fixing operator< as below, the test passes against the recorded .result.
File: suite/binlog/t/binlog_xa_external_xids_map_ordering.test
# ==== Purpose ====
#
# Reproducer for broken strict weak ordering of xid_t::operator< when
# format_id differs: std::map m_external_xids can contain an XID key but
# find() fails (transitivity violated by comparing format_id only on one
# side). Binlog recovery then treats a prepared-in-binlog XA as absent and
# rolls it back.
#
# Setup uses three external XIDs (A,B,C) with:
# A: format 1, gtrid length 50
# B: format 2, gtrid length 5
# C: format 1, gtrid length 10
# so that A<B and B<C in the buggy comparator but C<A.
#
# A and B are fully committed so only C remains prepared after crash.
# Crash at before_set_prepared_in_tc for C (binlog already has XA PREPARE).
#
# Expected recovery: one prepared XA (C), error log reports
# "Successfully prepared 1 XA transaction", two committed rows visible.
# Requires xid_t::operator< to order format_id before gtrid_length (see sql/xa.cc).
#
# ==== References ====
#
# xid_t::operator< in sql/xa.cc
# binlog::Binlog_recovery::m_external_xids in sql/binlog/recovery.cc
#
--source include/not_valgrind.inc
--source include/have_debug.inc
--source include/have_debug_sync.inc
--source include/have_log_bin.inc
--source include/have_binlog_format_row.inc
let $messages = Found .* prepared XA transactions
.*Checksum mismatch in datafile.*;
--source include/suppress_messages.inc
CALL mtr.add_suppression("Statement is unsafe because it is being used inside a XA transaction");
CREATE TABLE t1 (c1 INT PRIMARY KEY);
--let $xid_a = `SELECT CONCAT("X'", REPEAT('aa',50), "',X'',1")`
--let $xid_b = `SELECT CONCAT("X'", REPEAT('bb',5), "',X'',2")`
--let $xid_c = `SELECT CONCAT("X'", REPEAT('cc',10), "',X'',1")`
--connect(con1, localhost, root,,)
--connection con1
--eval XA START $xid_a
INSERT INTO t1 VALUES (1);
--eval XA END $xid_a
--eval XA PREPARE $xid_a
--eval XA COMMIT $xid_a
--connect(con2, localhost, root,,)
--connection con2
--eval XA START $xid_b
INSERT INTO t1 VALUES (2);
--eval XA END $xid_b
--eval XA PREPARE $xid_b
--eval XA COMMIT $xid_b
--connect(con3, localhost, root,,)
--connection con3
--eval XA START $xid_c
INSERT INTO t1 VALUES (3);
--eval XA END $xid_c
--let $auxiliary_connection = default
--let $statement_connection = con3
--let $statement = XA PREPARE $xid_c
--let $sync_point = before_set_prepared_in_tc
--source include/execute_to_conditional_timestamp_sync_point.inc
--source include/kill_mysqld.inc
--source extra/xa_crash_safe_tests/cleanup_connection.inc
--source include/start_mysqld.inc
--let $assert_select = Successfully prepared 1 XA transaction
--source extra/xa_crash_safe_tests/assert_recovery_message.inc
--let $expected_prepared_xa_count = 1
--source extra/xa_crash_safe_tests/assert_xa_recover.inc
--let $expected_row_count = 2
--source extra/xa_crash_safe_tests/assert_row_count.inc
--eval XA ROLLBACK $xid_c
DROP TABLE t1;
Suggested fix:
[Root cause]
In sql/xa.cc, xid_t::operator< must define a strict weak ordering for use in std::map. When this->format_id > rhs.format_id, the code must not fall through to gtrid_length. Counterexample with the buggy comparator:
A: format_id=1, gtrid_len=50
B: format_id=2, gtrid_len=5
C: format_id=1, gtrid_len=10
Then A < B (1<2), B < C (after format check, 5<10), but C < A (10<50) while A < C is false → ordering is not transitive; a std::map can store all three keys yet find(C) fails (reproducible in a small standalone C++ program using the same comparison logic).
[Impact / Severity]
Wrong crash recovery: binlog indicates XA PREPARE / TC state PREPARED_IN_TC, but the server may roll back the prepared XA.
Data loss relative to binlog and inconsistent XA RECOVER vs binlog.
Trigger requires multiple external XAs with different format_id and specific gtrid lengths, plus a crash in the window after binlog flush, before set_prepared_in_tc in engines (same class as existing WL#11300 crash-safe XA tests).
Severity: S2 (Serious) — incorrect, silent recovery path.
[Suggested fix]
After the first format_id comparison, add the symmetric branch so format_id is totally ordered before gtrid_length / bqual_length / memcmp:
bool xid_t::operator<(const xid_t &rhs) const {
if (this->get_format_id() < rhs.get_format_id()) return true;
if (rhs.get_format_id() < this->get_format_id()) return false;
if (this->get_gtrid_length() < rhs.get_gtrid_length()) {
return true;
}
// ... remainder unchanged ...
}
Description: [Summary] xid_t::operator< violates strict weak ordering when XA format_id differs, so std::map<XID,…> used during binlog crash recovery can fail find(xid) for an inserted key; a prepared external XA that is already in the binlog may be rolled back incorrectly. [Description] Externally coordinated XA state recovered from the binary log is kept in std::map keyed by XID with std::less<XID> (sql/handler.h, Xa_state_list::list; populated in sql/binlog/recovery.cc, Binlog_recovery::m_external_xids). That relies on xid_t::operator< in sql/xa.cc. The implementation returns true when this->get_format_id() < rhs.get_format_id() but does not return false when this->get_format_id() > rhs.get_format_id(). It then compares gtrid_length / bqual_length / payload. That breaks strict weak ordering (e.g. transitivity), so the map ordering is invalid and find() may miss an element that was inserted. During crash recovery, xa::recovery::recover_one_external_trx() (sql/xa/recovery.cc) calls xa_list->find(xa_trx.id). If the lookup incorrectly returns NOT_FOUND while the binlog recorded XA PREPARE, the engine rolls the transaction back, contradicting the binlog. How to repeat: Requirements: Debug build (--with-debug / CMAKE_BUILD_TYPE=Debug), binary logging enabled, row binlog format for this suite combination. No custom code injection: uses existing CONDITIONAL_SYNC_POINT_FOR_TIMESTAMP("before_set_prepared_in_tc") in sql/tc_log.cc and standard MTR kill_mysqld.inc. Add the following MTR files under mysql-test/suite/binlog/t/ and mysql-test/suite/binlog/r/ (or attach them to the bug). Run: cd mysql-test ./mysql-test-run.pl --suite=binlog --do-test=binlog_xa_external_xids_map_ordering With the buggy comparator: the row variant fails on the first assertion (no matching error-log note for Successfully prepared 1 XA transaction; InnoDB recovery logs rolling back 1 XA). After fixing operator< as below, the test passes against the recorded .result. File: suite/binlog/t/binlog_xa_external_xids_map_ordering.test # ==== Purpose ==== # # Reproducer for broken strict weak ordering of xid_t::operator< when # format_id differs: std::map m_external_xids can contain an XID key but # find() fails (transitivity violated by comparing format_id only on one # side). Binlog recovery then treats a prepared-in-binlog XA as absent and # rolls it back. # # Setup uses three external XIDs (A,B,C) with: # A: format 1, gtrid length 50 # B: format 2, gtrid length 5 # C: format 1, gtrid length 10 # so that A<B and B<C in the buggy comparator but C<A. # # A and B are fully committed so only C remains prepared after crash. # Crash at before_set_prepared_in_tc for C (binlog already has XA PREPARE). # # Expected recovery: one prepared XA (C), error log reports # "Successfully prepared 1 XA transaction", two committed rows visible. # Requires xid_t::operator< to order format_id before gtrid_length (see sql/xa.cc). # # ==== References ==== # # xid_t::operator< in sql/xa.cc # binlog::Binlog_recovery::m_external_xids in sql/binlog/recovery.cc # --source include/not_valgrind.inc --source include/have_debug.inc --source include/have_debug_sync.inc --source include/have_log_bin.inc --source include/have_binlog_format_row.inc let $messages = Found .* prepared XA transactions .*Checksum mismatch in datafile.*; --source include/suppress_messages.inc CALL mtr.add_suppression("Statement is unsafe because it is being used inside a XA transaction"); CREATE TABLE t1 (c1 INT PRIMARY KEY); --let $xid_a = `SELECT CONCAT("X'", REPEAT('aa',50), "',X'',1")` --let $xid_b = `SELECT CONCAT("X'", REPEAT('bb',5), "',X'',2")` --let $xid_c = `SELECT CONCAT("X'", REPEAT('cc',10), "',X'',1")` --connect(con1, localhost, root,,) --connection con1 --eval XA START $xid_a INSERT INTO t1 VALUES (1); --eval XA END $xid_a --eval XA PREPARE $xid_a --eval XA COMMIT $xid_a --connect(con2, localhost, root,,) --connection con2 --eval XA START $xid_b INSERT INTO t1 VALUES (2); --eval XA END $xid_b --eval XA PREPARE $xid_b --eval XA COMMIT $xid_b --connect(con3, localhost, root,,) --connection con3 --eval XA START $xid_c INSERT INTO t1 VALUES (3); --eval XA END $xid_c --let $auxiliary_connection = default --let $statement_connection = con3 --let $statement = XA PREPARE $xid_c --let $sync_point = before_set_prepared_in_tc --source include/execute_to_conditional_timestamp_sync_point.inc --source include/kill_mysqld.inc --source extra/xa_crash_safe_tests/cleanup_connection.inc --source include/start_mysqld.inc --let $assert_select = Successfully prepared 1 XA transaction --source extra/xa_crash_safe_tests/assert_recovery_message.inc --let $expected_prepared_xa_count = 1 --source extra/xa_crash_safe_tests/assert_xa_recover.inc --let $expected_row_count = 2 --source extra/xa_crash_safe_tests/assert_row_count.inc --eval XA ROLLBACK $xid_c DROP TABLE t1; Suggested fix: [Root cause] In sql/xa.cc, xid_t::operator< must define a strict weak ordering for use in std::map. When this->format_id > rhs.format_id, the code must not fall through to gtrid_length. Counterexample with the buggy comparator: A: format_id=1, gtrid_len=50 B: format_id=2, gtrid_len=5 C: format_id=1, gtrid_len=10 Then A < B (1<2), B < C (after format check, 5<10), but C < A (10<50) while A < C is false → ordering is not transitive; a std::map can store all three keys yet find(C) fails (reproducible in a small standalone C++ program using the same comparison logic). [Impact / Severity] Wrong crash recovery: binlog indicates XA PREPARE / TC state PREPARED_IN_TC, but the server may roll back the prepared XA. Data loss relative to binlog and inconsistent XA RECOVER vs binlog. Trigger requires multiple external XAs with different format_id and specific gtrid lengths, plus a crash in the window after binlog flush, before set_prepared_in_tc in engines (same class as existing WL#11300 crash-safe XA tests). Severity: S2 (Serious) — incorrect, silent recovery path. [Suggested fix] After the first format_id comparison, add the symmetric branch so format_id is totally ordered before gtrid_length / bqual_length / memcmp: bool xid_t::operator<(const xid_t &rhs) const { if (this->get_format_id() < rhs.get_format_id()) return true; if (rhs.get_format_id() < this->get_format_id()) return false; if (this->get_gtrid_length() < rhs.get_gtrid_length()) { return true; } // ... remainder unchanged ... }