Bug #76233 XA prepare is logged ahead of engine prepare
Submitted: 9 Mar 2015 20:10
Reporter: Andrei Elkin
Status: Verified
Category: MySQL Server: Replication
Severity: S3 (Non-critical)
Version: 5.7.7
OS: Any
Assigned to:
CPU Architecture: Any

[9 Mar 2015 20:10] Andrei Elkin
Description:
The fix for bug11745231 puts two operations that are critical to crash safety in the wrong order.
An XA-prepared transaction is logged before it is actually prepared in the engine.
Therefore, if the server crashes after the XA PREPARE has been logged, the engine may lose
the transaction, so at recovery it would be found in the binlog but not in the engine.

This case should be fixed by calling engine.prepare() first and logging afterwards.
A typical crash scenario of

  step 1. XA gets prepared in the engine
  step 2. *Crash*
  step 3. Would be logged, but it never happened.

would leave the engine with a prepared trx that the binlog does not have. That fact should be
detected in an augmented recovery, similar to how it is done for internal XA.
When the xid event, internal or the user's one, is not found, the prepared
transaction is rolled back.
However, when the user's xid is found, this XA must not be committed yet,
as it has to wait for an explicit concluding query.
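
The difference between the two orderings can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the engine and the binlog are modelled as plain sets of XIDs, and the names run_xa_prepare, recover and "xid1" are hypothetical, not server code.

```python
def run_xa_prepare(order, crash_after_first_step):
    """Apply the two XA PREPARE steps in the given order.

    order is a pair made of the hypothetical labels "engine" and "binlog".
    Returns (engine, binlog): the sets of XIDs each side holds afterwards.
    """
    state = {"engine": set(), "binlog": set()}
    for index, target in enumerate(order):
        state[target].add("xid1")
        if crash_after_first_step and index == 0:
            break  # simulated crash: the second step never runs
    return state["engine"], state["binlog"]


def recover(engine, binlog):
    """Keep only engine-prepared XIDs that the binlog also knows about;
    the others are treated as rolled back."""
    return engine & binlog


# Reported order (binlog first): the binlog names a transaction
# that the engine never prepared.
engine, binlog = run_xa_prepare(("binlog", "engine"), True)
orphaned_in_binlog = binlog - engine

# Suggested order (engine first): the engine holds a transaction the
# binlog lacks, which recovery can safely roll back.
engine, binlog = run_xa_prepare(("engine", "binlog"), True)
survivors = recover(engine, binlog)
```

In this model the binlog-first order leaves an XID that recovery cannot act on, while the engine-first order leaves nothing behind after the rollback.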

How to repeat:
See the source code and run a failure simulation to witness the improper ordering.

Suggested fix:
Correct the ordering: prepare in the engine first, then write to the binlog.
[11 Mar 2015 11:54] Sven Sandberg
Posted by developer:
 
The scope of this bug is to make each of XA PREPARE, XA COMMIT, and XA ROLLBACK be correctly logged and recovered in case there is a crash. The state of the storage engine should agree with the state of the binary log, and this should happen automatically.

There are four sub-tasks:

1. Fix the logging order of XA PREPARE as reported above.
2. Extend Previous_gtids_log_event with a field containing the list of XA identifiers of all transactions that are in xa-prepare state.
3. Implement a recovery routine for XA PREPARE.
4. Implement a recovery routine for XA COMMIT and XA ROLLBACK.
[11 Mar 2015 12:41] Sven Sandberg
Posted by developer:
 
Clarification of execution order:
- XA PREPARE is first prepared in the engine and then written to the binary log
- XA COMMIT is first written to the binary log and then committed in the engine
- XA ROLLBACK is first written to the binary log and then rolled back in the engine

XA COMMIT and XA ROLLBACK are already executed in this order. We need to change XA PREPARE to execute in this order.

Proposed recovery routine (pseudocode):

  # This recovery procedure is designed so that it works in the
  # following corner cases:
  # - there are two consecutive transactions having the same XID (this
  #   is allowed as long as one commits before the other starts).
  # - there is a binlog rotation (and possibly even purge) between
  #   XA PREPARE and XA COMMIT/XA ROLLBACK.
  # - a single transaction involves multiple storage engines.

  # set of transactions that were in prepare state when the server
  # stopped, according to the binary log
  HASH prepared_in_binlog
  # set of transactions that were committed in the binary log, and not
  # followed by a prepare of a different transaction with the same xid.
  HASH committed_in_binlog
  # set of transactions that were rolled back in the binary log, and not
  # followed by a prepare of a different transaction with the same xid
  HASH rolledback_in_binlog
  # set of transactions that are in prepared state in any storage engine
  HASH prepared_in_engine = get_prepared_from_engine()

  for event in binlog:
    if event.is_previous_gtids_log_event():
      prepared_in_binlog.add(event.get_xid_set())
    else if event.is_xa_prepare():
      prepared_in_binlog.add(event.xid)
      committed_in_binlog.remove(event.xid)
      rolledback_in_binlog.remove(event.xid)
    else if event.is_xa_commit():
      prepared_in_binlog.remove(event.xid)
      committed_in_binlog.add(event.xid)
    else if event.is_xa_rollback():
      prepared_in_binlog.remove(event.xid)
      rolledback_in_binlog.add(event.xid)

  # recover from crash between binlog commit and engine commit
  HASH to_commit = intersection(committed_in_binlog, prepared_in_engine)
  # recover from crash between binlog rollback and engine rollback
  HASH to_rollback = intersection(rolledback_in_binlog, prepared_in_engine)
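  # recover from crash between engine prepare and binlog prepare;
  # xids already committed in the binary log are excluded, otherwise
  # every member of to_commit would also end up in to_rollback
  to_rollback.add(prepared_in_engine - prepared_in_binlog - committed_in_binlog)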

  rollback(to_rollback)
  commit(to_commit)
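
For experimenting with the routine, the pseudocode can be turned into a small runnable sketch. The following is an illustration only, not server code: events are modelled as hypothetical (kind, payload) tuples, the function name plan_recovery is made up, and the sketch assumes that XIDs already committed in the binary log must not be added to the "prepared only in the engine" rollback set.

```python
def plan_recovery(binlog_events, prepared_in_engine):
    """Return (to_commit, to_rollback) as sets of XIDs.

    binlog_events: iterable of (kind, payload) tuples. kind is one of the
    hypothetical labels "previous_gtids", "xa_prepare", "xa_commit",
    "xa_rollback"; payload is a set of XIDs for "previous_gtids" and a
    single XID otherwise.
    """
    prepared_in_binlog = set()
    committed_in_binlog = set()
    rolledback_in_binlog = set()

    for kind, payload in binlog_events:
        if kind == "previous_gtids":
            prepared_in_binlog |= set(payload)
        elif kind == "xa_prepare":
            # A new transaction may reuse the XID of a finished one.
            prepared_in_binlog.add(payload)
            committed_in_binlog.discard(payload)
            rolledback_in_binlog.discard(payload)
        elif kind == "xa_commit":
            prepared_in_binlog.discard(payload)
            committed_in_binlog.add(payload)
        elif kind == "xa_rollback":
            prepared_in_binlog.discard(payload)
            rolledback_in_binlog.add(payload)

    engine = set(prepared_in_engine)
    # Crash between binlog commit and engine commit.
    to_commit = committed_in_binlog & engine
    # Crash between binlog rollback and engine rollback.
    to_rollback = rolledback_in_binlog & engine
    # Crash between engine prepare and binlog prepare (assumption:
    # XIDs committed in the binlog are not rolled back here).
    to_rollback |= engine - prepared_in_binlog - committed_in_binlog
    return to_commit, to_rollback


events = [
    ("previous_gtids", {"a"}),  # "a" was prepared in an earlier binlog file
    ("xa_prepare", "b"),
    ("xa_commit", "b"),         # "b" committed in binlog only
    ("xa_prepare", "c"),
]
to_commit, to_rollback = plan_recovery(events, {"a", "b", "d"})
```

In this example "b" is committed, "d" (never logged) is rolled back, and "a" stays prepared, waiting for an explicit XA COMMIT or XA ROLLBACK.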
[28 Aug 7:12] Michael Yang
The proposed fix may not be enough. There is a problem when two consecutive transactions have the same XID. Suppose the server crashes after the second transaction has been prepared in the engine but before its 'XA PREPARE' has been written to the binlog (the first transaction with the same XID has already committed successfully). During recovery, the server cannot know whether to commit or roll back the prepared transaction. (With the method of bug#76233, the transaction whose 'XA PREPARE' failed would be committed, which is wrong.)

Two cases:

1. The server crashes after writing XA COMMIT to the binlog, but before the engine has committed.
XA START 'xid'
WRITE_ROWS
XA END 'xid'
XA COMMIT 'xid'

2. After the first 'xid' has committed, the server starts another 'xid' transaction.
 The server crashes after the engine has prepared it, but before the binlog is written. So the binlog
is the same as in case 1.
XA START 'xid'
WRITE_ROWS
XA END 'xid'
XA COMMIT 'xid'

In the second case, the engine has prepared the transaction but nothing has been written to the binlog for it. So in both cases the binlog content is the same: there is an 'XA COMMIT xid_1' in the binlog and one transaction that is prepared only in the engine. During recovery, the server does not know which case happened, so it does not know whether to commit or roll back the prepared transaction in the engine.

To fix this problem, I added an XA_preparing_log_event that is written before the engine prepare. However, this reduces performance.
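
How such a marker event could separate the two cases is sketched below. The comment does not describe how XA_preparing_log_event is consumed during recovery, so this is only one possible reading: events are hypothetical (kind, xid) tuples, "xa_preparing" stands in for the new event, and decide is a made-up helper that looks at the last binlog event for an XID that is prepared in the engine.

```python
def decide(binlog_events, xid):
    """Decide what to do with an engine-prepared xid, based on the last
    relevant binlog event that mentions it (hypothetical event labels)."""
    relevant = ("xa_preparing", "xa_prepare", "xa_commit", "xa_rollback")
    last = None
    for kind, event_xid in binlog_events:
        if event_xid == xid and kind in relevant:
            last = kind
    if last == "xa_commit":
        return "commit"          # logged as committed, engine lagged behind
    if last == "xa_prepare":
        return "keep_prepared"   # wait for an explicit XA COMMIT/ROLLBACK
    return "rollback"            # only a marker, a rollback, or nothing logged


# Case 1: crash after XA COMMIT was logged, before the engine committed.
case1 = [("xa_preparing", "xid"), ("xa_commit", "xid")]
# Case 2: first transaction fully committed; a second one with the same
# XID wrote its marker, was prepared in the engine, then the server crashed.
case2 = [("xa_preparing", "xid"), ("xa_commit", "xid"), ("xa_preparing", "xid")]

decision_case1 = decide(case1, "xid")
decision_case2 = decide(case2, "xid")
```

Without the marker both binlogs would be identical; with it, the trailing "xa_preparing" in case 2 shows that the engine-prepared transaction is a newer one that never reached the binlog. The extra event written per XA PREPARE is where the performance cost mentioned above would come from.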