Bug #93978 XA COMMIT success with same xid
Submitted: 18 Jan 2019 9:44 Modified: 22 Jan 2019 14:04
Reporter: phoenix Zhang (OCA) Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.7.23 OS:Any
Assigned to: MySQL Verification Team CPU Architecture:Any

[18 Jan 2019 9:44] phoenix Zhang
Description:
In master-slave relation, when open semi-sync, master should wait ACK from slave, then it will commit.

When master doing a XA COMMIT, if it does receive ACK  from slave yet, it will not return. Then, if multiple clients do XA COMMIT with the same xid, all can execute success after slave ok.

Then, all will record binlog. It will lead slave being sql-error later.

How to repeat:
First, close the semi-sync, and do XA PREPARE.

mysql> set global rpl_semi_sync_master_enabled=0;
Query OK, 0 rows affected (0.00 sec)

mysql> xa start '11';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t1 values(0,0);
Query OK, 1 row affected (0.00 sec)

mysql> xa end '11';
Query OK, 0 rows affected (0.00 sec)

mysql> xa prepare '11';
Query OK, 0 rows affected (0.04 sec)

mysql> quit
Bye

Then, open semi-sync:

mysql> set global rpl_semi_sync_master_enabled=1, rpl_semi_sync_master_timeout=31560000000;
Query OK, 0 rows affected (0.00 sec)

And then, open multiple clients, all doing
mysql> xa commit '11';

And all clients will not return. Then, close t he semi-sync again:
mysql> set global rpl_semi_sync_master_enabled=0;
Query OK, 0 rows affected (0.00 sec)

Then, all clients return OK packet. Now, when check the binlog,  all those clients will record binlog.

mysql> show binlog events in 'mysql-bin.000002';
+------------------+------+----------------+-----------+-------------+--------------------------------------------------------------------+
| Log_name         | Pos  | Event_type     | Server_id | End_log_pos | Info                                                               |
+------------------+------+----------------+-----------+-------------+--------------------------------------------------------------------+
| mysql-bin.000002 |    4 | Format_desc    |        12 |         123 | Server ver: 5.7.13-log, Binlog ver: 4                              |
| mysql-bin.000002 |  123 | Previous_gtids |        12 |         194 | e9eca58e-6437-11e6-b93d-d8cb8aee97da:1-7                           |
------
| mysql-bin.000002 |  891 | Gtid           |        12 |         956 | SET @@SESSION.GTID_NEXT= 'e9eca58e-6437-11e6-b93d-d8cb8aee97da:11' |
| mysql-bin.000002 |  956 | Query          |        12 |        1045 | XA START X'3131',X'',1                                             |
| mysql-bin.000002 | 1045 | Table_map      |        12 |        1091 | table_id: 111 (test.t1)                                            |
| mysql-bin.000002 | 1091 | Write_rows     |        12 |        1135 | table_id: 111 flags: STMT_END_F                                    |
| mysql-bin.000002 | 1135 | Query          |        12 |        1222 | XA END X'3131',X'',1                                               |
| mysql-bin.000002 | 1222 | XA_prepare     |        12 |        1260 | XA PREPARE X'3131',X'',1                                           |
| mysql-bin.000002 | 1260 | Gtid           |        12 |        1325 | SET @@SESSION.GTID_NEXT= 'e9eca58e-6437-11e6-b93d-d8cb8aee97da:12' |
| mysql-bin.000002 | 1325 | Query          |        12 |        1415 | XA COMMIT X'3131',X'',1                                            |
| mysql-bin.000002 | 1415 | Gtid           |        12 |        1480 | SET @@SESSION.GTID_NEXT= 'e9eca58e-6437-11e6-b93d-d8cb8aee97da:13' |
| mysql-bin.000002 | 1480 | Query          |        12 |        1570 | XA COMMIT X'3131',X'',1                                            |
| mysql-bin.000002 | 1570 | Gtid           |        12 |        1635 | SET @@SESSION.GTID_NEXT= 'e9eca58e-6437-11e6-b93d-d8cb8aee97da:14' |
| mysql-bin.000002 | 1635 | Query          |        12 |        1725 | XA COMMIT X'3131',X'',1                                            |
| mysql-bin.000002 | 1725 | Gtid           |        12 |        1790 | SET @@SESSION.GTID_NEXT= 'e9eca58e-6437-11e6-b93d-d8cb8aee97da:15' |
| mysql-bin.000002 | 1790 | Query          |        12 |        1880 | XA COMMIT X'3131',X'',1                                            |
+------------------+------+----------------+-----------+-------------+--------------------------------------------------------------------+
28 rows in set (0.00 sec)
[22 Jan 2019 13:31] MySQL Verification Team
Hi,

I don't consider this a bug. XA transactions use "external" control managed by user, you are manipulating semisync so you get intentionally into "wrong state", that is user error, not system error, as, like I said, XA transactions are externally controlled by the user. If you plan to change if you are using semisync or not, close your XA transactions before you do the change.

Thanks
Bogdan
[22 Jan 2019 14:04] phoenix Zhang
However,in the real replication master-slave. When open semi-sync, if slave net-out sometime, then, in may generate binlog like above in master.

SET @@SESSION.GTID_NEXT= 'e9eca58e-6437-11e6-b93d-d8cb8aee97da:11'
XA START X'3131',X'',1
Table_map      table_id: 111 (test.t1)                                            
Write_rows     table_id: 111 flags: STMT_END_F                                    
XA END X'3131',X'',1
XA PREPARE X'3131',X'',1
SET @@SESSION.GTID_NEXT= 'e9eca58e-6437-11e6-b93d-d8cb8aee97da:12'
XA COMMIT X'3131',X'',1
SET @@SESSION.GTID_NEXT= 'e9eca58e-6437-11e6-b93d-d8cb8aee97da:13'
XA COMMIT X'3131',X'',1
SET @@SESSION.GTID_NEXT= 'e9eca58e-6437-11e6-b93d-d8cb8aee97da:14'
XA COMMIT X'3131',X'',1
SET @@SESSION.GTID_NEXT= 'e9eca58e-6437-11e6-b93d-d8cb8aee97da:15'
XA COMMIT X'3131',X'',1   

After a while, when the slave net-in again, it will dump above binlog events, and apply in slave itself.
When it execute the binlog events in "e9eca58e-6437-11e6-b93d-d8cb8aee97da:12", it will do XA COMMIT, this time it will be ok. However, when it execute the next XA COMMIT in "e9eca58e-6437-11e6-b93d-d8cb8aee97da:13", it will failed now, cause the xa-id does not exist any more.

So, this behavior may lead slave sql-thread error. 

In my opinion,the server should avoid such kind of slave error problem.