Bug #80821 Replication breaks if multi-table DELETE is used in conjunction with Foreign Key
Submitted: 22 Mar 2016 11:38 Modified: 31 Mar 2016 7:36
Reporter: Philip Stoev Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Row Based Replication ( RBR ) Severity:S3 (Non-critical)
Version:5.7, 5.7.11 OS:Any
Assigned to: CPU Architecture:Any

[22 Mar 2016 11:38] Philip Stoev
Description:
If two tables are in a foreign-key relationship and a multi-table DELETE statement operates on both, replication will break.

2016-03-17T07:35:58.415740Z 8 [ERROR] Slave SQL for channel '': Could not execute Delete_rows event on table test.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master-bin.000001, end_log_pos 1692, Error_code: 1032
2016-03-17T07:35:58.415776Z 8 [Warning] Slave: Can't find record in 't1' Error_code: 1032
2016-03-17T07:35:58.415787Z 8 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'master-bin.000001' position 1378

5.6 does not seem to be affected.

How to repeat:
Test case:

--source include/not_ndb_default.inc
--source include/not_group_replication_plugin.inc
--source include/have_binlog_format_row.inc
--source include/master-slave.inc

--connection master

CREATE TABLE t0 (
f0 INT PRIMARY KEY
) engine=innodb;

CREATE TABLE t1 (
f1 INT PRIMARY KEY,
f0 INTEGER,
FOREIGN KEY (f0)
REFERENCES t0(f0)
ON DELETE CASCADE
) engine=innodb;

INSERT INTO t0 VALUES (0), (1);
INSERT INTO t1 VALUES (0, 0);
INSERT INTO t1 VALUES (1, 0);

DELETE t0.*, t1.* FROM t0, t1 WHERE t0.f0 = 0 AND t1.f1 = 0;

SELECT COUNT(*) = 1 FROM t0;
SELECT COUNT(*) = 0 FROM t1;

--connection slave

SELECT COUNT(*) = 1 FROM t0;
SELECT COUNT(*) = 0 FROM t1;
DROP TABLE t0;
DROP TABLE t1;

--source include/rpl_end.inc

Suggested fix:
<quote>
In 5.6, this delete causes only one binlog event:
DELETE FROM `test`.`t0`
WHERE
@1=0

Whereas in 5.7 this delete causes binlog events for both t0 and t1:

DELETE FROM `test`.`t0`
WHERE
@1=0

DELETE FROM `test`.`t1`
WHERE
@1=0
@2=0

When the 5.6 binlog event hits slave node, applying will cascade to do the deletes on table t1 as well.
What happens in 5.7 version, is that the delete on t0 does this similar cascading delete, and when actual delete event for table t1 tries to update, it fails for missing row.
</quote>
[31 Mar 2016 7:36] Umesh Shastry
Hello Philip,

Thank you for the report and test case.
Observed this issue with 5.7.11(including 5.7.13 daily build) builds(where as test case passed on 5.6.29 build).

Thanks,
Umesh
[17 Feb 20:13] Seweryn Ozog
5.7.18, 5.7.20 as well
[17 Feb 21:09] Seweryn Ozog
"S3 (Non-critical)" for replication problem - really?