Description:
After a transaction trx1 is asynchronously rolled back by a "high priority" transaction trx2, any statement executed on trx1 gets the deadlock error ER_LOCK_DEADLOCK. This currently also applies to the rollback statement, i.e. ROLLBACK also returns ER_LOCK_DEADLOCK.
This causes an issue with replication. After ER_LOCK_DEADLOCK is returned while executing a statement, the server attempts a complete rollback of the transaction with each storage engine. In the above scenario, InnoDB returns ER_LOCK_DEADLOCK again when innobase_rollback is called. The error on rollback causes MYSQL_BIN_LOG::rollback to skip some clean-up operations. As a result, the statements of the rolled-back transaction are also written to the binary log and replayed on the slave.
For a table with a unique index, the slave receives two entries with duplicate values (one of which was actually rolled back on the master), causing a "duplicate key" error, and replication stops.
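The failure path described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyTrx, engine_rollback, binlog_rollback and run_example are hypothetical names invented for illustration and are not MySQL code, and the "early return" is a simplification of whatever clean-up MYSQL_BIN_LOG::rollback actually skips.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace bug_model {

constexpr int kOk = 0;
constexpr int kErLockDeadlock = 1213;  // numeric value of ER_LOCK_DEADLOCK

// Hypothetical stand-in for a server-side transaction; not a real MySQL type.
struct ToyTrx {
  bool rolled_back_async = false;         // set once a high priority trx rolled us back
  std::vector<std::string> binlog_cache;  // statements cached for the binary log
};

// Models the engine rollback hook as the report describes it today: it
// returns the deadlock error if the trx was already rolled back asynchronously.
inline int engine_rollback(const ToyTrx& trx) {
  return trx.rolled_back_async ? kErLockDeadlock : kOk;
}

// Models the server-side rollback (assumption: an engine error makes it
// return early, so the cached statements are never discarded).
inline int binlog_rollback(ToyTrx& trx) {
  const int err = engine_rollback(trx);
  if (err != kOk) return err;  // clean-up skipped
  trx.binlog_cache.clear();    // normal clean-up
  return kOk;
}

// Usage: the statement of the rolled-back transaction survives in the cache.
inline void run_example() {
  ToyTrx trx;
  trx.binlog_cache.push_back("INSERT INTO t1 VALUES (1)");
  trx.rolled_back_async = true;
  assert(binlog_rollback(trx) == kErLockDeadlock);
  assert(trx.binlog_cache.size() == 1);
}

}  // namespace bug_model
```

In this model the cached INSERT is still present after the failed rollback, which corresponds to the stale statement that later reaches the slave.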
Currently, this occurs in "Group Replication", which uses "high priority" transactions internally. However, the issue can also be reproduced without a "Group Replication" setup by enabling "high priority" transactions using debug variables.
How to repeat:
--source include/master-slave.inc
--connection server_1
CREATE TABLE t1 (c1 INT NOT NULL PRIMARY KEY) ENGINE=InnoDB;
BEGIN;
INSERT INTO t1 VALUES (1);
--connection master
--source include/start_transaction_high_prio.inc
INSERT INTO t1 VALUES (1);
COMMIT;
SELECT * FROM t1;
--connection server_1
--error 1213
INSERT INTO t1 VALUES (2); ## This will hit ER1213
INSERT INTO t1 VALUES (3); ## This will succeed.
COMMIT;
--connection master
SELECT * FROM t1;
DROP TABLE t1;
--source include/sync_slave_sql_with_master.inc
Suggested fix:
The handler interface function innobase_rollback should not return an error when a transaction has been rolled back asynchronously or is marked for asynchronous rollback. It should be enough to return the error during the execution of other statements. innobase_rollback needs to ensure that the asynchronous rollback is complete and then return success.
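The intended behaviour can be sketched as follows. This is a toy model, not a patch: AsyncState, ToyTrx, execute_statement, rollback and run_example are hypothetical names, and completing the pending rollback inline stands in for whatever waiting or synchronisation the real innobase_rollback would need.

```cpp
#include <cassert>

namespace fix_sketch {

constexpr int kOk = 0;
constexpr int kErLockDeadlock = 1213;  // numeric value of ER_LOCK_DEADLOCK

// Hypothetical states of an asynchronous rollback; not real InnoDB flags.
enum class AsyncState { kNone, kMarked, kDone };

// Hypothetical stand-in for an engine transaction.
struct ToyTrx {
  AsyncState async_rollback = AsyncState::kNone;
};

// Ordinary statements still report the deadlock error to the client.
inline int execute_statement(const ToyTrx& trx) {
  return trx.async_rollback == AsyncState::kNone ? kOk : kErLockDeadlock;
}

// The rollback hook never reports the asynchronous rollback as an error.
inline int rollback(ToyTrx& trx) {
  if (trx.async_rollback == AsyncState::kMarked) {
    // Assumption: the real engine would wait for, or perform, the pending
    // rollback here; the toy model simply marks it as finished.
    trx.async_rollback = AsyncState::kDone;
  }
  trx.async_rollback = AsyncState::kNone;  // trx is clean and reusable again
  return kOk;
}

// Usage: the statement fails, the rollback succeeds, later statements work.
inline void run_example() {
  ToyTrx trx;
  trx.async_rollback = AsyncState::kMarked;
  assert(execute_statement(trx) == kErLockDeadlock);
  assert(rollback(trx) == kOk);
  assert(execute_statement(trx) == kOk);
}

}  // namespace fix_sketch
```

Because the rollback hook returns success in this model, the server-side rollback would be free to run its normal clean-up, while the client still sees ER_LOCK_DEADLOCK on the statement that first hit the rolled-back transaction.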