Bug #97836 In group replication foreign key will break slave aplier
Submitted: 30 Nov 2019 12:54 Modified: 5 Dec 2019 7:49
Reporter: zhang chen Email Updates:
Status: Verified Impact on me:
Category:MySQL Server: Group Replication Severity:S2 (Serious)
Version:8.0.18 OS:Linux (CentOS Linux release 7.6.1810)
Assigned to: CPU Architecture:Any (Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz)
Tags: 1452;slave down

[30 Nov 2019 12:54] zhang chen
In group replication, a table has foreign keys that depend on another table. If the main database is executed successfully, the slave may execute abnormally and exit the group.

I have tried to commit_order and other configurations, but they are still unsolvable

Currently, it can only be solved by setting slave_parallel_workers to 0, but it seriously affects efficiency.

How to repeat:
create table zxc.a
	a int,
	b char(10),
	c char(10),
	primary key (a,b,c)

create table zxc.b
	a int,
	b char(10),
	c char(10),
	primary key (a,b,c),
	CONSTRAINT foreign_key_b2a FOREIGN KEY (a) REFERENCES a (a)

use a shell run these:

while true
        let i=i+1
        mysql zxc -e"insert into a (a,b,c) values ($i,1,1);insert into b (a,b,c) values ($i,1,1);"

        mysql zxc -e"delete from b where a=$i;delete from a where a=$i;"
        echo $i
        if [ $i -gt $count ]
                exit 0;

It works normally on the master, but you can see mysql.err in slave like these:
2019-11-30T20:40:52.064868+08:00 335242 [ERROR] [MY-010584] [Repl] Slave SQL for channel 'group_replication_applier': Worker 2 failed executing transaction 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:12492217'; Could not execute Write_rows event on table zxc.b; Cannot add or update a child row: a foreign key constraint fails (`zxc`.`b`, CONSTRAINT `foreign_key_b2a` FOREIGN KEY (`a`) REFERENCES `a` (`a`)), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW, Error_code: MY-001452
[3 Dec 2019 7:12] MySQL Verification Team
Hello Zhang chen,

Thank you for the report and test case.
Verified as described.

[5 Dec 2019 7:49] zhang chen
Is there any way to solve this problem now?
[11 Aug 2020 9:17] Andrew Ernst
I've seen this same situation play out somewhat randomly in 3-node Group Replication clusters.

One of the two secondary nodes will randomly throw this error.

I believe it's related to having parallel replication appliers (4). I have not seen this occur when the appliers are reduced to 1.

It has been observed in MySQL 8.0.19 in multiple environments (running on CentOS 7.8)
[21 Aug 2020 23:04] Andrew Ernst
This remains a problem on MySQL 8.0.21.
I'm seeing this regardless of cpu type.  Parallel appliers are set to 4 in my environment.

model name	: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
[24 Feb 22:29] Keith Lammers
I'm running into this issue still on 8.0.23 as well. Default parallel workers at 4.
[12 Mar 10:42] Bin Wang
If foreign key is part of another table's primary key,then Group replication could not build correct last_committed value which causes wrong parallel slave working.
That's why "slave_parallel_workers=1" all works right and "slave_parallel_workers>1" probably causes slave nodes error.