Bug #102661 xa prepare failed lead replication error
Submitted: 19 Feb 2021 11:20 Modified: 19 Feb 2021 12:04
Reporter: phoenix Zhang (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: XA transactions Severity:S3 (Non-critical)
Version:8.0.22, 8.0.23, 5.7.33 OS:Any
Assigned to: CPU Architecture:Any

[19 Feb 2021 11:20] phoenix Zhang
Description:
If xa_prepare failed in storage engine, it may lead data inconsistent between master and slave, and will also lead replication error 

How to repeat:
add file mysql-test/suite/rpl/t/xa_prepare_error_rpl_inconsistent.combinations

[row]
binlog-format=row

add test-file mysql-test/suite/rpl/t/xa_prepare_error_rpl_inconsistent.test

source include/master-slave.inc;

--echo [connect master]
connection master;
CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 INT) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1,1);
XA START '1';
INSERT INTO t1 VALUES (2,2);
XA END '1';
SET DEBUG='+d,simulate_xa_failure_prepare_in_engine';
--error ER_XA_RBROLLBACK
XA PREPARE '1';
SET DEBUG='-d,simulate_xa_failure_prepare_in_engine';
XA RECOVER;

--source include/rpl_sync.inc
--echo [connect slave]
connection slave;
XA RECOVER;

--echo [connect master]
connection master;
XA START '1';
INSERT INTO t1 VALUES (3,3);
XA END '1';
XA PREPARE '1';
XA COMMIT '1';

sleep 2;
--echo [connect slave]
connection slave;
query_vertical SHOW REPLICA STATUS;
XA RECOVER;

run testcase, slave server will be error:

mysql-test/mtr xa_prepare_error_rpl_inconsistent --nocheck-testcases

include/master-slave.inc
Warnings:
Note	####	Sending passwords in plain text without SSL/TLS is extremely insecure.
Note	####	Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
[connection master]
[connect master]
CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 INT) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1,1);
XA START '1';
INSERT INTO t1 VALUES (2,2);
XA END '1';
SET DEBUG='+d,simulate_xa_failure_prepare_in_engine';
XA PREPARE '1';
ERROR XA100: XA_RBROLLBACK: Transaction branch was rolled back
SET DEBUG='-d,simulate_xa_failure_prepare_in_engine';
XA RECOVER;
formatID	gtrid_length	bqual_length	data
include/rpl_sync.inc
[connect slave]
XA RECOVER;
formatID	gtrid_length	bqual_length	data
1	1	0	1
[connect master]
XA START '1';
INSERT INTO t1 VALUES (3,3);
XA END '1';
XA PREPARE '1';
XA COMMIT '1';
[connect slave]
SHOW REPLICA STATUS;
Replica_IO_State	Waiting for master to send event
Source_Host	127.0.0.1
Source_User	root
Source_Port	13000
Connect_Retry	1
Source_Log_File	master-bin.000001
Read_Source_Log_Pos	1599
Relay_Log_File	slave-relay-bin.000002
Relay_Log_Pos	1261
Relay_Source_Log_File	master-bin.000001
Replica_IO_Running	Yes
Replica_SQL_Running	No
Replicate_Do_DB	
Replicate_Ignore_DB	
Replicate_Do_Table	
Replicate_Ignore_Table	
Replicate_Wild_Do_Table	
Replicate_Wild_Ignore_Table	
Last_Errno	1440
Last_Error	Error 'XAER_DUPID: The XID already exists' on query. Default database: 'test'. Query: 'XA START X'31',X'',1'
Skip_Counter	0
Exec_Source_Log_Pos	1044
Relay_Log_Space	2025
Until_Condition	None
Until_Log_File	
Until_Log_Pos	0
Source_SSL_Allowed	No
Source_SSL_CA_File	
Source_SSL_CA_Path	
Source_SSL_Cert	
Source_SSL_Cipher	
Source_SSL_Key	
Seconds_Behind_Source	NULL
Source_SSL_Verify_Server_Cert	No
Last_IO_Errno	0
Last_IO_Error	
Last_SQL_Errno	1440
Last_SQL_Error	Error 'XAER_DUPID: The XID already exists' on query. Default database: 'test'. Query: 'XA START X'31',X'',1'
Replicate_Ignore_Server_Ids	
Source_Server_Id	1
Source_UUID	5e592b1d-72a3-11eb-837f-c8f7507e5048
Source_Info_File	mysql.slave_master_info
SQL_Delay	0
SQL_Remaining_Delay	NULL
Replica_SQL_Running_State	
Source_Retry_Count	10
Source_Bind	
Last_IO_Error_Timestamp	
Last_SQL_Error_Timestamp	210219 14:12:35
Source_SSL_Crl	
Source_SSL_Crlpath	
Retrieved_Gtid_Set	
Executed_Gtid_Set	
Auto_Position	0
Replicate_Rewrite_DB	
Channel_Name	
Source_TLS_Version	
Source_public_key_path	
Get_Source_public_key	0
Network_Namespace	
XA RECOVER;
formatID	gtrid_length	bqual_length	data
1	1	0	1
[ 50%] rpl.xa_prepare_error_rpl_inconsistent 'row'  [ fail ]  Found warnings/errors in error log file!
        Test ended at 2021-02-19 19:12:38
line
2021-02-19T11:12:35.969408Z 11 [ERROR] [MY-010584] [Repl] Slave SQL for channel '': Error 'XAER_DUPID: The XID already exists' on query. Default database: 'test'. Query: 'XA START X'31',X'',1', Error_code: MY-001440
2021-02-19T11:12:35.969460Z 11 [Warning] [MY-010584] [Repl] Slave: XAER_DUPID: The XID already exists Error_code: MY-001440
SET @@GLOBAL.super_read_only = @original_super_read_only;
^ Found warnings in /home/phoenix/gitlab/myrocks/DEBUG/mysql-test/var/log/mysqld.2.err
ok

 - the logfile can be found in '/home/phoenix/gitlab/myrocks/DEBUG/mysql-test/var/log/rpl.xa_prepare_error_rpl_inconsistent-row/xa_prepare_error_rpl_inconsistent.log'

Suggested fix:
The reason is, when do xa_prepare, it will first flush binlog to file, then do engine prepare. so even engine prepare failed, binlog will still send to slave server.

code in sql/handler.cc

    /* Allow GTID to be read by SE for XA prepare. */
    {    
      Clone_handler::XA_Operation xa_guard(thd);

      /* Prepare binlog SE first, if there. */
      while (ha_info != nullptr && error == 0) { 
        auto ht = ha_info->ht();
        if (ht->db_type == DB_TYPE_BINLOG) {
          error = prepare_one_ht(thd, ht); 
          break;
        }
        ha_info = ha_info->next();
      }    
      /* Prepare all SE other than binlog. */
      ha_info = trn_ctx->ha_trx_info(Transaction_ctx::SESSION);
      while (ha_info != nullptr && error == 0) { 
        auto ht = ha_info->ht();
        error = prepare_one_ht(thd, ht); 
        if (error != 0) { 
          break;
        }
        ha_info = ha_info->next();
      }    
    }
[19 Feb 2021 12:04] MySQL Verification Team
Hello phoenix Zhang!

Thank you for the report.

regards,
Umesh