Bug #76233 | XA prepare is logged ahead of engine prepare | ||
---|---|---|---|
Submitted: | 9 Mar 2015 20:10 | ||
Reporter: | Andrei Elkin | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
Version: | 5.7.7 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
[9 Mar 2015 20:10]
Andrei Elkin
[11 Mar 2015 11:54]
Sven Sandberg
Posted by developer:
The scope of this bug is to make each of XA PREPARE, XA COMMIT, and XA ROLLBACK be correctly logged and recovered in case there is a crash. The state of the binary log should agree with the state of the storage engines, and this should happen automatically.

There are four sub-tasks:
1. Fix the logging order of XA PREPARE as reported above.
2. Extend Previous_gtids_log_event with a field containing the list of XA identifiers of all transactions that are in xa-prepare state.
3. Implement a recovery routine for XA PREPARE.
4. Implement a recovery routine for XA COMMIT and XA ROLLBACK.
[11 Mar 2015 12:41]
Sven Sandberg
Posted by developer:
Clarification of execution order:
- XA PREPARE is first prepared in the engine and then written to the binary log
- XA COMMIT is first written to the binary log and then committed in the engine
- XA ROLLBACK is first written to the binary log and then rolled back in the engine

XA COMMIT and XA ROLLBACK are already executed in this order. We need to change XA PREPARE to execute in this order.

Proposed recovery routine (pseudocode):

    # This recovery procedure is designed so that it works in the
    # following corner cases:
    # - there are two consecutive transactions having the same XID (this
    #   is allowed as long as one commits before the other starts).
    # - there is a binlog rotation (and possibly even purge) between
    #   XA PREPARE and XA COMMIT/XA ROLLBACK.
    # - a single transaction involves multiple storage engines.

    # set of transactions that were in prepare state when the server
    # stopped, according to the binary log
    HASH prepared_in_binlog
    # set of transactions that were committed in the binary log, and not
    # followed by a prepare of a different transaction with the same xid.
    HASH committed_in_binlog
    # set of transactions that were rolled back in the binary log, and not
    # followed by a prepare of a different transaction with the same xid
    HASH rolledback_in_binlog
    # set of transactions that are in prepared state in any storage engine
    HASH prepared_in_engine = get_prepared_from_engine()

    for event in binlog:
      if event.is_previous_gtids_log_event():
        prepared_in_binlog.add(event.get_xid_set())
      else if event.is_xa_prepare():
        prepared_in_binlog.add(event.xid)
        committed_in_binlog.remove(event.xid)
        rolledback_in_binlog.remove(event.xid)
      else if event.is_xa_commit():
        prepared_in_binlog.remove(event.xid)
        committed_in_binlog.add(event.xid)
      else if event.is_xa_rollback():
        prepared_in_binlog.remove(event.xid)
        rolledback_in_binlog.add(event.xid)

    # recover from crash between binlog commit and engine commit
    HASH to_commit = intersection(committed_in_binlog, prepared_in_engine)
    # recover from crash between binlog rollback and engine rollback
    HASH to_rollback = intersection(rolledback_in_binlog, prepared_in_engine)
    # recover from crash between engine prepare and binlog prepare
    to_rollback.add(prepared_in_engine - prepared_in_binlog)

    rollback(to_rollback)
    commit(to_commit)
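For readers who want to experiment with the pseudocode above, here is a minimal runnable sketch in Python. It is not server code: the `Event` class, its `kind` strings, and the `plan_recovery` function are hypothetical names chosen for illustration, and the binary log is modeled as a plain list. One detail is an assumption on my part rather than something stated in the comment: taken literally, an XID that is committed in the binlog but still prepared in the engine also falls into `prepared_in_engine - prepared_in_binlog`, so the sketch excludes XIDs that the binlog already resolved from that last rollback set.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a binlog event (not the server's classes)."""
    kind: str                            # "previous_gtids", "xa_prepare", "xa_commit", "xa_rollback"
    xid: str = ""                        # used by the three XA event kinds
    xid_set: FrozenSet[str] = frozenset()  # used by "previous_gtids" only


def plan_recovery(binlog: Iterable[Event],
                  prepared_in_engine: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (to_commit, to_rollback) following the proposed routine."""
    prepared_in_binlog: Set[str] = set()
    committed_in_binlog: Set[str] = set()
    rolledback_in_binlog: Set[str] = set()

    for event in binlog:
        if event.kind == "previous_gtids":
            # Carries the XIDs that were in prepared state at rotation time.
            prepared_in_binlog |= event.xid_set
        elif event.kind == "xa_prepare":
            # A new prepare supersedes an earlier commit/rollback of the same XID.
            prepared_in_binlog.add(event.xid)
            committed_in_binlog.discard(event.xid)
            rolledback_in_binlog.discard(event.xid)
        elif event.kind == "xa_commit":
            prepared_in_binlog.discard(event.xid)
            committed_in_binlog.add(event.xid)
        elif event.kind == "xa_rollback":
            prepared_in_binlog.discard(event.xid)
            rolledback_in_binlog.add(event.xid)

    # Crash between binlog commit and engine commit.
    to_commit = committed_in_binlog & prepared_in_engine
    # Crash between binlog rollback and engine rollback.
    to_rollback = rolledback_in_binlog & prepared_in_engine
    # Crash between engine prepare and binlog prepare.
    # ASSUMPTION: XIDs already resolved in the binlog are excluded here so
    # that no XID ends up in both result sets.
    to_rollback |= (prepared_in_engine - prepared_in_binlog
                    - committed_in_binlog - rolledback_in_binlog)
    return to_commit, to_rollback


if __name__ == "__main__":
    log = [
        Event("previous_gtids", xid_set=frozenset({"p"})),
        Event("xa_prepare", "a"),
        Event("xa_commit", "a"),
        Event("xa_prepare", "b"),
    ]
    # "a": committed in binlog, still prepared in engine -> commit
    # "b", "p": prepared in both -> left alone
    # "c": prepared in engine only -> rollback
    print(plan_recovery(log, {"a", "b", "c", "p"}))
```

Using `discard` instead of `remove` keeps the loop from raising when an XID is not in a set, which is the behavior the pseudocode appears to intend.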
[28 Aug 2018 7:12]
Ze Yang
The proposed fix for this bug may not be enough. There is a problem when two consecutive transactions have the same XID. Suppose the first transaction with that XID has committed successfully, and the server crashes after the second transaction has been prepared in the engine but before its 'XA PREPARE' has been written to the binlog. During recovery, the server cannot know whether to commit or roll back the prepared transaction. (With the method of bug#76233, the transaction whose 'XA PREPARE' failed would be committed, which is wrong.)

Two cases:

1. The server crashes after writing XA COMMIT to the binlog, but before the engine has committed.
   XA START 'xid'
   WRITE_ROWS
   XA END 'xid'
   XA COMMIT 'xid'

2. After the first 'xid' transaction has committed, the server starts another transaction with the same 'xid'. The server crashes after the engine prepare, but before anything is written to the binlog. So the binlog is the same as in case 1.
   XA START 'xid'
   WRITE_ROWS
   XA END 'xid'
   XA COMMIT 'xid'

In the second case, the engine has prepared the transaction but nothing has been written to the binlog for it. So in both cases the binlog content is the same: it contains 'XA COMMIT xid_1', and there is one transaction that is prepared only in the engine. During recovery, the server does not know which case happened, so it does not know whether to commit or roll back the transaction prepared in the engine.

To fix this problem, I add an XA_preparing_log_event before the engine prepare. But this would reduce performance.
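The ambiguity described above can be made concrete with a small Python sketch. It models only what recovery can observe, namely the binlog and the set of XIDs prepared in the engine, and shows that the two cases are indistinguishable even though the correct actions differ. Everything here is illustrative: the tuple encoding of events and the helper names are hypothetical, and the way the `XA_PREPARING` marker is placed in the log is my assumption about how an XA_preparing_log_event written before engine prepare might appear, not a description of the actual patch.

```python
from typing import FrozenSet, Iterable, List, Tuple

Event = Tuple[str, str]  # (statement, xid); xid is "" where the listing shows none

# Binlog content listed in the comment; it is identical in both cases.
BINLOG: List[Event] = [
    ("XA START", "xid"),
    ("WRITE_ROWS", ""),
    ("XA END", "xid"),
    ("XA COMMIT", "xid"),
]


def observed_state(binlog: Iterable[Event],
                   prepared_in_engine: Iterable[str]
                   ) -> Tuple[Tuple[Event, ...], FrozenSet[str]]:
    """Everything the recovery routine can see after a crash."""
    return tuple(binlog), frozenset(prepared_in_engine)


def last_event_for(state, xid: str) -> str:
    """Last binlog statement that mentions the given XID."""
    binlog, _ = state
    return [kind for kind, event_xid in binlog if event_xid == xid][-1]


# Case 1: crash after XA COMMIT reached the binlog, engine not yet committed.
case_1 = observed_state(BINLOG, {"xid"})
# Case 2: first 'xid' fully committed; second 'xid' prepared in the engine,
# crash before anything about it reached the binlog.
case_2 = observed_state(BINLOG, {"xid"})
CORRECT_ACTION = {"case_1": "commit", "case_2": "rollback"}

# ASSUMPTION: a marker is logged before every engine prepare.
MARKER: Event = ("XA_PREPARING", "xid")
case_1_marked = observed_state([MARKER] + BINLOG, {"xid"})
case_2_marked = observed_state([MARKER] + BINLOG + [MARKER], {"xid"})

if __name__ == "__main__":
    # Same observation, different correct answers: no routine can get both right.
    print(case_1 == case_2, CORRECT_ACTION["case_1"] != CORRECT_ACTION["case_2"])
    # With the marker, the last event for 'xid' tells the cases apart.
    print(last_event_for(case_1_marked, "xid"), "|",
          last_event_for(case_2_marked, "xid"))
```

Under these assumptions, the marker turns the last binlog event for the XID into a tiebreaker, at the cost of one extra binlog write per prepare, which matches the performance concern raised in the comment.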
[13 Feb 2020 8:24]
Erlend Dahl
Bug#98288 xa commit crash lead mysql replication error was marked as a duplicate.