Bug #110880 | XA COMMIT at before_commit return fail breaks group replication recovery | ||
---|---|---|---|
Submitted: | 2 May 2023 3:50 | Modified: | 5 May 2023 5:35 |
Reporter: | Zhejun Cai | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Group Replication | Severity: | S2 (Serious) |
Version: | 8.0.32 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
[2 May 2023 3:50]
Zhejun Cai
[2 May 2023 13:48]
MySQL Verification Team
Hi Mr. Cai, Thank you for your bug report. However, your report does not include a test case. We need the entire test case, so that we can just run it and repeat the problem. Hence, we need all the tables, their contents and all the commands that you issued, in the correct order. Also, please let us know whether the problem repeats only on a standalone server or in an InnoDB Cluster. If it repeats in the Cluster, we need the detailed setup of your cluster and exactly what to do, step by step, for the bug to surface. Please also note that our current release is 8.0.33. We are waiting on your full feedback.
[4 May 2023 6:58]
Zhejun Cai
test case
Attachment: gr_xa_commit_failure_before_commit_hook.test (application/octet-stream, text), 4.52 KiB.
[4 May 2023 6:58]
Zhejun Cai
configure for the test case
Attachment: gr_xa_commit_failure_before_commit_hook.cnf (application/octet-stream, text), 130 bytes.
[4 May 2023 7:01]
Zhejun Cai
Hi, I made and uploaded a test case of mysql-8.0.33 for this issue
(1) build mysql-server with option WITH_DEBUG=1
(2) run the test case gr_xa_commit_failure_before_commit_hook.test
here is the output:
./mtr group_replication.gr_xa_commit_failure_before_commit_hook
Logging: ./mtr group_replication.gr_xa_commit_failure_before_commit_hook
MySQL Version 8.0.33
Checking supported features
 - Binaries are debug compiled
Using 'all' suites
Collecting tests
Removing old var directory
rpl error summary:
SERVER_1:(WORKERS:(CHANNEL:<group_replication_recovery> WORKER:1 ERROR:<Worker 1 failed executing transaction 'b7e0d0b3-ea45-11ed-9fd7-080027f4a265:8' at source log server-binary-log.000001, end_log_pos 2580; Error 'XAER_NOTA: Unknown XID' on query. Default database: 'test'. Query: 'XA COMMIT X'78696431',X'',1'>)
COORDINATORS:(CHANNEL:<group_replication_recovery> ERROR:<Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'b7e0d0b3-ea45-11ed-9fd7-080027f4a265:8' at source log server-binary-log.000001, end_log_pos 2580. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.>))
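The failing statement in the output above carries its XID in hexadecimal form (`XA COMMIT X'78696431',X'',1`). As a reading aid only, the following minimal Python sketch decodes the gtrid, bqual and formatID parts of such a logged statement. The helper name `decode_xid` is hypothetical and is not part of MySQL or mtr; it assumes the XID bytes are printable text, which holds for this report.

```python
import re

# Matches the X'<gtrid hex>',X'<bqual hex>',<formatID> form used in the log.
XID_PATTERN = re.compile(r"X'([0-9A-Fa-f]*)',X'([0-9A-Fa-f]*)',(\d+)")


def decode_xid(statement):
    """Return (gtrid, bqual, format_id) from a logged XA statement.

    Hypothetical helper for reading logs; not a MySQL API.
    """
    match = XID_PATTERN.search(statement)
    if match is None:
        raise ValueError("no hex-encoded XID found in statement")
    # Undecodable bytes are replaced rather than raising, since an XID
    # may in general contain arbitrary binary data.
    gtrid = bytes.fromhex(match.group(1)).decode("utf-8", errors="replace")
    bqual = bytes.fromhex(match.group(2)).decode("utf-8", errors="replace")
    return gtrid, bqual, int(match.group(3))


if __name__ == "__main__":
    print(decode_xid("XA COMMIT X'78696431',X'',1"))
```

For the statement in this report the gtrid decodes to `xid1`, with an empty bqual and formatID 1.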
[4 May 2023 12:04]
MySQL Verification Team
Hi Mr. Cai, Please, just confirm whether the problem repeats on stand-alone server ....... From your description, it turns out that it repeats only with InnoDB Cluster. Please, confirm that ......
[4 May 2023 12:12]
MySQL Verification Team
Hi Mr. Cai, We have another problem with your test case. You are executing XA commands on all of the nodes, but you are not using any XA manager. Can you explain it ......
[4 May 2023 12:19]
MySQL Verification Team
Hi, This is InnoDB Cluster / Group Replication issue with XA transaction. Has nothing to do with MySQL Cluster (ndbcluster)
[5 May 2023 3:18]
Zhejun Cai
Hi, According to my understanding of the mysql-server code, a standalone server has no problem. The test case file describes the deployment topology:
# Pre-conditions:
# PC1. GR single-primary topology with 3 servers.
To be precise, the test case does not execute XA commands on all of the nodes. It executes XA commands on the primary node, and the other replica nodes execute the XA commands through the group_replication applier. I agree that this is an InnoDB Cluster / Group Replication issue with XA transactions.
[5 May 2023 5:35]
MySQL Verification Team
Hi, Thank you for the test, verified as described.
...
2023-05-05 08:28:34.191020 32 Error MY-010584 Repl Replica SQL for channel 'group_replication_recovery': Worker 1 failed executing transaction '98cf9113-eb05-11ed-9d6c-000c29c354f6:8' at source log server-binary-log.000001, end_log_pos 2576; Error 'XAER_NOTA: Unknown XID' on query. Default database: 'test'. Query: 'XA COMMIT X'78696431',X'',1', Error_code: MY-001397
...
2023-05-05 08:28:35.103191 38 Error MY-010584 Repl Replica SQL for channel 'group_replication_recovery': Worker 1 failed executing transaction '98cf9113-eb05-11ed-9d6c-000c29c354f6:8' at source log server-binary-log.000001, end_log_pos 2580; Error 'XAER_NOTA: Unknown XID' on query. Default database: 'test'. Query: 'XA COMMIT X'78696431',X'',1', Error_code: MY-001397
...
LAST_ERROR_NUMBER 1397
LAST_ERROR_MESSAGE Worker 1 failed executing transaction '98cf9113-eb05-11ed-9d6c-000c29c354f6:8' at source log server-binary-log.000001, end_log_pos 2580; Error 'XAER_NOTA: Unknown XID' on query. Default database: 'test'. Query: 'XA COMMIT X'78696431',X'',1'
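When comparing such applier failures across runs, it can help to pull the worker number, GTID, source log file, position and error text out of the message. The sketch below does this for the message format quoted in this report. The function name `parse_worker_error` and the regular expression are assumptions made for illustration, not a MySQL-provided parser, and the pattern is only known to fit the lines shown here.

```python
import re

# Pattern written against the message text quoted in this report only.
WORKER_ERROR = re.compile(
    r"Worker (?P<worker>\d+) failed executing transaction "
    r"'(?P<gtid>[0-9a-f-]+:\d+)' at source log (?P<log>\S+), "
    r"end_log_pos (?P<pos>\d+); Error '(?P<error>[^']*)'"
)


def parse_worker_error(message):
    """Return a dict of fields from an applier worker error, or None.

    Hypothetical helper for log triage; not part of MySQL.
    """
    match = WORKER_ERROR.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    fields["worker"] = int(fields["worker"])
    fields["pos"] = int(fields["pos"])
    return fields


if __name__ == "__main__":
    sample = (
        "Worker 1 failed executing transaction "
        "'98cf9113-eb05-11ed-9d6c-000c29c354f6:8' at source log "
        "server-binary-log.000001, end_log_pos 2580; "
        "Error 'XAER_NOTA: Unknown XID' on query. Default database: 'test'."
    )
    print(parse_worker_error(sample))
```

Applied to the last message above, this yields worker 1, GTID `98cf9113-eb05-11ed-9d6c-000c29c354f6:8`, log file `server-binary-log.000001`, position 2580 and the error text `XAER_NOTA: Unknown XID`.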