Bug #109434 Xa transaction recovery issue after mysql restarts after crash
Submitted: 20 Dec 2022 7:52 Modified: 26 Dec 2022 7:33
Reporter: cong yang
Status: Verified
Category:MySQL Server: XA transactions Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[20 Dec 2022 7:52] cong yang
Description:
MySQL crashed while an XA transaction was executing "xa commit", after the "sync binlog" stage and before the "engine commit" stage, with "flush binary logs" triggered at the same time.
After MySQL restarts, this XA transaction is still in the "prepared" state in the InnoDB engine, but the binlog file already contains an "xa commit" event for it. The InnoDB engine and the binlog are therefore inconsistent.

How to repeat:
Session 1:
xa start 'tst';
insert into t1 values(1);
xa end 'tst';
xa prepare 'tst';

SET DEBUG_SYNC= 'bgc_after_sync_stage_before_commit_stage SIGNAL flush_binlog';
send xa commit 'tst';

Session 2:
SET DEBUG_SYNC= 'now WAIT_FOR flush_binlog';
send flush binary logs;

Session 3:
sleep 1; // wait for session 2, which executes "flush binary logs"
--let $_kill_signal= 9
--source include/send_kill_to_mysqld.inc
--source include/start_mysqld.inc
--source include/wait_until_connected_again.inc
[20 Dec 2022 14:10] MySQL Verification Team
Hi Mr. yang,

Thank you for your bug report.

However, your test case is not applicable. It is a mixture of command-line commands and mysql-test commands.

Please send us a test case in either of the two formats, but not a mixture of both.

Waiting on your feedback.
[21 Dec 2022 2:58] cong yang
Hi,

Thanks for the quick response.

Here is how to reproduce this issue from the command line:
1. First, start the MySQL server using the debug binary (8.0.30) with the argument '--debug-sync-timeout=1000'.
2. Open terminal one and enter the following commands:
create table t1(c1 int);
xa start 'tst';
insert into t1 values(1);
xa end 'tst';
xa prepare 'tst';
SET DEBUG_SYNC= 'bgc_after_sync_stage_before_commit_stage WAIT_FOR kill_mysqld';
xa commit 'tst';
3. Open terminal two and enter the following command:
flush binary logs;
4. After 1 second, send SIGKILL to the MySQL server.
5. Restart the MySQL server. The 'tst' XA transaction is still in the prepared state, but the rotated binlog already contains an "xa commit 'tst'" event, so the two are inconsistent.
[21 Dec 2022 12:47] MySQL Verification Team
Hi Mr. yang,

We have been able to repeat the behaviour that you are reporting.

However, this is a minor bug, since you are using XA transactions on a single server, without an XA Manager, which implies that you do not need XA in the first place.

However, it is still a bug.

Verified as reported.
[26 Dec 2022 7:23] cong yang
Hi,

Actually, we do have an XA Coordinator (Manager) and multiple MySQL servers as XA nodes. We use XA transactions to guarantee the atomicity of distributed transactions.

This issue blocks high availability when we use XA transactions.

Could you please provide a patch for this issue?

Best Regards
[26 Dec 2022 7:33] cong yang
Hi,

I found two more similar issues:

A. Here is how to reproduce this issue from the command line:
1. First, start the MySQL server using the debug binary (8.0.30) with the argument '--debug-sync-timeout=1000'.
2. Open terminal one and enter the following commands:
create table t1(c1 int);
xa start 'tst';
insert into t1 values(1);
xa end 'tst';
xa prepare 'tst';
SET DEBUG_SYNC= 'trx_commit_for_mysql_checks_for_aborted WAIT_FOR kill_mysqld';
xa commit 'tst';
3. Open terminal two and enter the following command:
flush binary logs;
4. After 1 second, send SIGKILL to the MySQL server.
5. Restart the MySQL server. The 'tst' XA transaction is still in the prepared state, but the rotated binlog already contains an "xa commit 'tst'" event, so the two are inconsistent.

B. Here is how to reproduce this issue from the command line:
0. Add a debug sync point "trx_rollback_for_mysql_checks_for_aborted" as the first line of the function "dberr_t trx_rollback_for_mysql(trx_t *trx)", then compile mysqld.
1. Start the MySQL server using the debug binary (8.0.30) that contains the above modification, with the argument '--debug-sync-timeout=1000'.
2. Open terminal one and enter the following commands:
create table t1(c1 int);
xa start 'tst';
insert into t1 values(1);
xa end 'tst';
xa prepare 'tst';
SET DEBUG_SYNC= 'trx_rollback_for_mysql_checks_for_aborted WAIT_FOR kill_mysqld';
xa rollback 'tst';
3. Open terminal two and enter the following command:
flush binary logs;
4. After 1 second, send SIGKILL to the MySQL server.
5. Restart the MySQL server. The 'tst' XA transaction is still in the prepared state, but the rotated binlog already contains an "xa rollback 'tst'" event, so the two are inconsistent.
[3 Jan 2023 12:40] MySQL Verification Team
Hi Mr. yang,

Thank you for your contribution.

And, yes, we could accept your patch for this bug, so please feel free to post it.
[6 Jan 2023 8:10] huahua xu
The number of prepared XIDs must be 0 before rotating the binlog file, so the XA commit or rollback gets marked as xid-requiring.

Attachment: xa_transaction_recovery_issue_after_crash_bugfix.patch (application/octet-stream, text), 641 bytes.
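
A minimal sketch of the invariant described in this comment, written as a toy Python model rather than server code. It is not the attached patch. All class, method, and field names are hypothetical, and two simplifications are assumed: crash recovery consults only the newest binlog file, and a rotation that the real server would block is modelled as a refused one.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Toy model of one MySQL node. Every name here is hypothetical."""

    # True models the idea of counting XA COMMIT as "xid-requiring".
    count_xa_commit_as_prepared: bool
    # Each inner list is one binlog file; the last one is the active file.
    binlog_files: list = field(default_factory=lambda: [[]])
    # xid -> "prepared" or "committed"
    engine_state: dict = field(default_factory=dict)
    # Transactions synced to the binlog but not yet committed in the engine.
    prep_xids: int = 0

    def xa_prepare(self, xid):
        self.binlog_files[-1].append(("XA PREPARE", xid))
        self.engine_state[xid] = "prepared"

    def xa_commit_sync_stage(self, xid):
        # The commit event becomes durable in the active binlog file.
        self.binlog_files[-1].append(("XA COMMIT", xid))
        if self.count_xa_commit_as_prepared:
            self.prep_xids += 1

    def xa_commit_engine_stage(self, xid):
        # Not reached in the crash scenario below; shown for completeness.
        self.engine_state[xid] = "committed"
        if self.count_xa_commit_as_prepared:
            self.prep_xids -= 1

    def flush_binary_logs(self):
        # Rotate only when nothing sits between sync and engine commit.
        if self.prep_xids > 0:
            return False  # the real server would wait; the toy just refuses
        self.binlog_files.append([])
        return True

    def crash_and_recover(self):
        # Assumption: only the newest binlog file is scanned on restart.
        for event, xid in self.binlog_files[-1]:
            if event == "XA COMMIT":
                self.engine_state[xid] = "committed"
        self.prep_xids = 0


def run_scenario(fixed):
    server = ToyServer(count_xa_commit_as_prepared=fixed)
    server.xa_prepare("tst")
    server.xa_commit_sync_stage("tst")    # session 1 pauses after the sync stage
    rotated = server.flush_binary_logs()  # session 2
    server.crash_and_recover()            # SIGKILL, then restart
    return rotated, server.engine_state["tst"]


print(run_scenario(fixed=False))  # (True, 'prepared')
print(run_scenario(fixed=True))   # (False, 'committed')
```

Without the counter, the rotation succeeds and the commit event is stranded in the old file, so 'tst' stays prepared after restart. With the counter, the rotation is held back and recovery still finds the event in the active file.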

[9 Jan 2023 12:38] MySQL Verification Team
Mr. xu,

Thank you for your contribution.