MySQL Bugs: #80898: Replication stops after transaction is rolled back asynchronously in master

Bug #80898	Replication stops after transaction is rolled back asynchronously in master
Submitted:	30 Mar 2016 5:00	Modified:	1 Apr 2016 13:22
Reporter:	Debarun Banerjee
Status:	Closed
Category:	MySQL Server: InnoDB storage engine	Severity:	S3 (Non-critical)
Version:	5.7	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
After a transaction trx1 is asynchronously rolled back by a "high priority" transaction trx2, any statement subsequently executed in trx1 fails with the deadlock error ER_LOCK_DEADLOCK. This currently applies to the rollback statement as well, i.e. ROLLBACK also returns ER_LOCK_DEADLOCK.

This causes a problem for replication. After ER_LOCK_DEADLOCK is returned while executing a statement, the server attempts a complete rollback of the transaction in each storage engine. In the above scenario, InnoDB returns ER_LOCK_DEADLOCK again when innobase_rollback is called. The error on rollback causes MYSQL_BIN_LOG::rollback to skip some cleanup operations. As a result, the statements of the rolled-back transaction are also written to the binary log and replayed on the slave.

For a table with a unique index, the slave receives two entries with duplicate values (one of which was actually rolled back on the master), which causes a "duplicate key" error and stops replication.
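
The failure chain described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: ToyBinlog, EngineResult and replay are hypothetical names invented for illustration and do not correspond to the real MYSQL_BIN_LOG code, which is considerably more involved. The model only captures the idea that a skipped cleanup leaves the rolled-back statement cached, so it is written out with the next commit.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and are not MySQL server code.
enum class EngineResult { Ok, LockDeadlock };

struct ToyBinlog {
    std::vector<std::string> trx_cache;  // statements of the open transaction
    std::vector<std::string> log;        // what the slave would eventually replay

    void cache(const std::string &stmt) { trx_cache.push_back(stmt); }

    // Models the reported behaviour: if the storage engine reports an error
    // during rollback, the cache cleanup is skipped.
    void rollback(EngineResult engine_rc) {
        if (engine_rc != EngineResult::Ok) return;  // cleanup skipped
        trx_cache.clear();
    }

    // Whatever is still cached is written to the log on the next commit.
    void commit() {
        log.insert(log.end(), trx_cache.begin(), trx_cache.end());
        trx_cache.clear();
    }
};

// Replays the victim connection's part of the scenario and returns what
// ends up in the toy binary log.
std::vector<std::string> replay(EngineResult rollback_rc) {
    ToyBinlog binlog;
    binlog.cache("INSERT INTO t1 VALUES (1)");  // trx1, rolled back asynchronously
    binlog.rollback(rollback_rc);               // server rollback after ER_LOCK_DEADLOCK
    binlog.cache("INSERT INTO t1 VALUES (3)");  // next statement on the same connection
    binlog.commit();
    return binlog.log;
}
```

In this model, replay(EngineResult::LockDeadlock) logs both inserts, including the rolled-back value 1 that the high priority transaction also inserted, whereas replay(EngineResult::Ok) logs only the insert of 3.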

Currently, this occurs with "Group Replication", which uses "high priority" transactions internally. However, the issue can be reproduced without a "Group Replication" setup by enabling "high priority" transactions through debug variables.

How to repeat:
--source include/master-slave.inc
# trx1: ordinary transaction on the master, holds the lock on row 1.
--connection server_1
CREATE TABLE t1 (c1 INT NOT NULL PRIMARY KEY) ENGINE=InnoDB;
BEGIN;
INSERT INTO t1 VALUES (1);

# trx2: high priority transaction, rolls back trx1 asynchronously.
--connection master
--source include/start_transaction_high_prio.inc
INSERT INTO t1 VALUES (1);
COMMIT;
SELECT * FROM t1;

# Back in the connection whose transaction was rolled back.
--connection server_1
--error 1213
INSERT INTO t1 VALUES (2); ## This will hit ER1213
INSERT INTO t1 VALUES (3); ## This will succeed.
COMMIT;

# Replication stops on the slave with a duplicate key error.
--connection master
SELECT * FROM t1;
DROP TABLE t1;
--source include/sync_slave_sql_with_master.inc

Suggested fix:
The handler interface innobase_rollback should not return an error when a transaction has been rolled back asynchronously or is marked for asynchronous rollback. It should be enough to return the error during the execution of other statements. innobase_rollback needs to ensure that the asynchronous rollback is complete and then return success.
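
The intended contract can be sketched as follows. This is a minimal illustration, not the actual InnoDB change: ToyTrx, ToyError, toy_execute_statement and toy_rollback are hypothetical stand-ins, and waiting for the asynchronous rollback to finish is reduced to flipping a flag.

```cpp
#include <cassert>

// Hypothetical stand-ins; not InnoDB's real types or signatures.
enum class ToyError { Success, LockDeadlock };

struct ToyTrx {
    bool aborted_by_high_prio = false;  // victim of an asynchronous rollback
    bool undo_pending = false;          // asynchronous rollback not finished yet
};

// Ordinary statements still report the deadlock to the client.
ToyError toy_execute_statement(const ToyTrx &trx) {
    return trx.aborted_by_high_prio ? ToyError::LockDeadlock : ToyError::Success;
}

// Rollback never reports the deadlock: it makes sure the asynchronous
// rollback has completed, clears the victim state and returns success.
ToyError toy_rollback(ToyTrx &trx) {
    if (trx.aborted_by_high_prio) {
        trx.undo_pending = false;          // assumption: stands in for waiting on completion
        trx.aborted_by_high_prio = false;  // the connection can start a new transaction
    }
    return ToyError::Success;
}
```

With this contract, a statement executed in the victim transaction returns the deadlock error, the subsequent rollback succeeds, and a later statement on the same connection runs normally, which matches the sequence in the test case above.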

Posted by developer:
 
Fixed as of the upcoming 5.7.13 and 5.8.0 releases. Here is the changelog entry:

Statements executed in a transaction that was rolled back asynchronously
by a higher priority transaction resulted in a deadlock error and
subsequent replication failure.