Bug #100163 | xa commit failed when stop group_replication will lead node error | ||
---|---|---|---|
Submitted: | 9 Jul 2020 0:56 | Modified: | 12 Jan 2022 21:40 |
Reporter: | phoenix Zhang (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Group Replication | Severity: | S2 (Serious) |
Version: | 8.0.18 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
Tags: | xa group_replication |
[9 Jul 2020 0:56]
phoenix Zhang
[9 Jul 2020 0:57]
phoenix Zhang
Test file; it needs to be moved to mysql-test/suite/group_replication/t
Attachment: gr_xa_commit_failed.test (application/octet-stream, text), 1.42 KiB.
[9 Jul 2020 0:59]
phoenix Zhang
Run the test with the command:

./mtr group_replication.gr_xa_commit_failed --nocheck-testcase

The result output will be:

include/group_replication.inc [rpl_server_count=2]
Warnings:
Note #### Sending passwords in plain text without SSL/TLS is extremely insecure.
Note #### Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
[connection server1]
[connect conn1]
SELECT * from performance_schema.replication_group_members;
CHANNEL_NAME MEMBER_ID MEMBER_HOST MEMBER_PORT MEMBER_STATE MEMBER_ROLE MEMBER_VERSION
group_replication_applier 817bd0ac-c17d-11ea-8a29-c8f7507e5048 127.0.0.1 13001 ONLINE PRIMARY 8.0.18
group_replication_applier 817c2845-c17d-11ea-9f93-c8f7507e5048 127.0.0.1 13000 ONLINE PRIMARY 8.0.18
CREATE TABLE t1 (c1 INT NOT NULL PRIMARY KEY, c2 INT);
INSERT INTO t1 VALUES (1,1);
include/rpl_sync.inc
[connect conn1_1]
FLUSH LOGS;
XA START '1';
INSERT INTO t1 VALUES (2,2);
XA END '1';
XA PREPARE '1';
SET SESSION DEBUG='+d,xa_commit_sleep';
XA COMMIT '1';
[connect conn1_2]
STOP GROUP_REPLICATION;
[connect conn1_1]
ERROR HY000: Error on observer while running replication hook 'before_commit'.
SET SESSION DEBUG='-d,xa_commit_failed';
XA RECOVER;
formatID gtrid_length bqual_length data
SELECT * FROM t1;
c1 c2
1 1
SHOW BINLOG EVENTS IN 'server-binary-log.000002';
Log_name Pos Event_type Server_id End_log_pos Info
server-binary-log.000002 4 Format_desc 1 124 Server ver: 8.0.18-9-debug, Binlog ver: 4
server-binary-log.000002 124 Previous_gtids 1 191 8193a9b0-c17d-11ea-9f93-c8f7507e5048:1-4
server-binary-log.000002 191 Gtid 1 273 SET @@SESSION.GTID_NEXT= '8193a9b0-c17d-11ea-9f93-c8f7507e5048:5'
server-binary-log.000002 273 Query 1 364 XA START X'31',X'',1
server-binary-log.000002 364 Table_map 1 409 table_id: 127 (test.t1)
server-binary-log.000002 409 Write_rows 1 449 table_id: 127 flags: STMT_END_F
server-binary-log.000002 449 Query 1 538 XA END X'31',X'',1
server-binary-log.000002 538 XA_prepare 1 571 XA PREPARE X'31',X'',1
[connect conn1]
START GROUP_REPLICATION;
[connect conn2]
XA RECOVER;
formatID gtrid_length bqual_length data
1 1 0 1
SELECT * FROM t1;
c1 c2
1 1
XA COMMIT '1';
XA RECOVER;
formatID gtrid_length bqual_length data
SELECT * FROM t1;
c1 c2
1 1
2 2
SELECT * from performance_schema.replication_group_members;
CHANNEL_NAME MEMBER_ID MEMBER_HOST MEMBER_PORT MEMBER_STATE MEMBER_ROLE MEMBER_VERSION
group_replication_applier 817bd0ac-c17d-11ea-8a29-c8f7507e5048 127.0.0.1 13001 ONLINE PRIMARY 8.0.18

From the result, we can see that node1 has left the group: after the failed XA COMMIT on node1, XA RECOVER there no longer lists the transaction and the row (2,2) is missing, while node2 still holds the prepared transaction. When node2 commits it, node1 cannot apply the XA COMMIT and leaves the group.

The error log will be:

line
2020-07-09T00:45:51.444850Z 29 [ERROR] [MY-011599] [Repl] Plugin group_replication reported: 'Transaction cannot be executed while Group Replication is stopping.'
2020-07-09T00:45:51.444872Z 29 [ERROR] [MY-010207] [Repl] Run function 'before_commit' in plugin 'group_replication' failed
2020-07-09T00:45:56.204536Z 36 [ERROR] [MY-010584] [Repl] Slave SQL for channel 'group_replication_applier': Error 'XAER_NOTA: Unknown XID' on query. Default database: 'test'. Query: 'XA COMMIT X'31',X'',1', Error_code: MY-001397
2020-07-09T00:45:56.204708Z 36 [Warning] [MY-010584] [Repl] Slave: XAER_NOTA: Unknown XID Error_code: MY-001397
2020-07-09T00:45:56.204769Z 36 [ERROR] [MY-011451] [Repl] Plugin group_replication reported: 'The applier thread execution was aborted. Unable to process more transactions, this member will now leave the group.'
2020-07-09T00:45:56.204973Z 33 [ERROR] [MY-011452] [Repl] Plugin group_replication reported: 'Fatal error during execution on the Applier process of Group Replication. The server will now leave the group.'
2020-07-09T00:45:56.205228Z 33 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was detected.'
2020-07-09T00:45:56.205566Z 40 [ERROR] [MY-011625] [Repl] Plugin group_replication reported: 'Unable to ensure the execution of group transactions received during recovery.'
2020-07-09T00:45:56.205648Z 40 [ERROR] [MY-011620] [Repl] Plugin group_replication reported: 'Fatal error during the incremental recovery process of Group Replication. The server will leave the group.'
2020-07-09T00:45:56.205754Z 40 [Warning] [MY-011645] [Repl] Plugin group_replication reported: 'Skipping leave operation: concurrent attempt to leave the group is on-going.'
2020-07-09T00:45:56.205818Z 40 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was detected.'
SET @@GLOBAL.super_read_only = @original_super_read_only;
^ Found warnings in /home/phoenix/gitlab/myrocks/DEBUG/mysql-test/var/log/mysqld.1.err
ok
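For anyone tracing the failure sequence in the error log above, the following is a minimal Python sketch (not part of the original report or the attached test) that groups error codes by thread ID. The regular expression assumes the timestamp, thread, level, code, and subsystem layout seen in the log lines quoted in this report; `parse_error_log`, `errors_by_thread`, and `SAMPLE` are hypothetical names introduced only for illustration, and `SAMPLE` holds three lines copied from that log.

```python
import re

# Assumed layout, taken from the log lines quoted in this report:
# <timestamp> <thread> [<level>] [<code>] [<subsystem>] <message>
LOG_LINE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z)\s+"
    r"(?P<thread>\d+)\s+"
    r"\[(?P<level>\w+)\]\s+"
    r"\[(?P<code>MY-\d+)\]\s+"
    r"\[(?P<subsystem>\w+)\]\s+"
    r"(?P<message>.*)"
)

# Three lines copied from the error log in this report.
SAMPLE = """\
2020-07-09T00:45:51.444850Z 29 [ERROR] [MY-011599] [Repl] Plugin group_replication reported: 'Transaction cannot be executed while Group Replication is stopping.'
2020-07-09T00:45:56.204536Z 36 [ERROR] [MY-010584] [Repl] Slave SQL for channel 'group_replication_applier': Error 'XAER_NOTA: Unknown XID' on query. Default database: 'test'. Query: 'XA COMMIT X'31',X'',1', Error_code: MY-001397
2020-07-09T00:45:56.204769Z 36 [ERROR] [MY-011451] [Repl] Plugin group_replication reported: 'The applier thread execution was aborted. Unable to process more transactions, this member will now leave the group.'
"""


def parse_error_log(text):
    """Return one dict per log line that matches the assumed layout."""
    entries = []
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            entry = match.groupdict()
            entry["thread"] = int(entry["thread"])
            entries.append(entry)
    return entries


def errors_by_thread(entries):
    """Map thread ID to the ordered list of ERROR-level codes it logged."""
    grouped = {}
    for entry in entries:
        if entry["level"] == "ERROR":
            grouped.setdefault(entry["thread"], []).append(entry["code"])
    return grouped


if __name__ == "__main__":
    for thread, codes in errors_by_thread(parse_error_log(SAMPLE)).items():
        print(thread, codes)
```

Grouping by thread separates the two stages of the failure: thread 29 is the session whose XA COMMIT is rejected by the 'before_commit' hook, and thread 36 is the applier that later hits XAER_NOTA and aborts.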
[9 Jul 2020 12:19]
MySQL Verification Team
Hello phoenix Zhang! Thank you for the report. regards, Umesh
[12 Jan 2022 21:40]
Jon Stephens
Fixed in MySQL 8.0.29 by WL#14700. See same for info. Closed.