Bug #96758 Rollbacks with Foreign Keys on single node
Submitted: 4 Sep 2019 23:29 Modified: 6 Sep 2019 15:48
Reporter: Jasmine Schladen Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Group Replication Severity:S2 (Serious)
Version:8.0.16-7 OS:Linux
Assigned to: CPU Architecture:Any

[4 Sep 2019 23:29] Jasmine Schladen
Description:
With the group replication plugin started, we see frequent rollbacks doing parallel INSERTs to a table with a foreign key (without CASCADE). We are using multi-master replication.

This occurs:
* With a single node OR multiple nodes in the cluster

This does NOT occur:
* With group replication stopped
* When the foreign key is removed
* Using single primary mode

Expected results: Inserts succeed.
Actual results: Intermittent errors "ERROR 3101 (40000) at line 1: Plugin instructed the server to rollback the current transaction."

We have debugged the code flow via gdb, and the rollback appears to occur on this line: 
https://github.com/mysql/mysql-server/blob/4869291f7ee258e136ef03f5a50135fe7329ffb9/plugin...

      /*
        If the previous certified transaction snapshot version is not
        a subset of the incoming transaction snapshot version, the current
        transaction was executed on top of outdated data, so it will be
        negatively certified. Otherwise, this transaction is marked
        certified and goes into applier.
      */
      if (certified_write_set_snapshot_version != NULL &&
          !certified_write_set_snapshot_version->is_subset(snapshot_version))
        goto end;
    }

(gdb) n
626          Gtid_set *certified_write_set_snapshot_version =
(gdb)
636          if (certified_write_set_snapshot_version != NULL &&
(gdb) info locals
certified_write_set_snapshot_version = 0x7fe0e40129c8
it = 0x7fe0e402a120 "5+NNaPufON8="
result = 0
has_write_set = true
transaction_last_committed = 15
__FUNCTION__ = "certify"
(gdb) info args
this = 0x7fe0bc1e5130
snapshot_version = 0x7fe0e40268c0
write_set = 0x7fe0e4026798
generate_group_id = <optimized out>
member_uuid = 0x7fe0e4011360 "57991150-cf49-11e9-ba91-0af9ee275c98"
gle = 0x7fe10128ddb0
local_transaction = true
(gdb) n
787          goto end; /* purecov: inspected */

We're unsure where to continue debugging this. The high frequency of rollbacks we see (10-25% or so) seems incorrect, especially since this is a single node with no conflicts. We understand that rollbacks are expected to occur in case of conflict, but why are we seeing so many rollbacks in this scenario with no conflicts?

How to repeat:
Create two tables. One table should have an auto increment ID, and the other table should have a foreign key to the first table.

CREATE DATABASE REPLICATIONTEST;

CREATE TABLE IF NOT EXISTS T3 (ID INT NOT NULL AUTO_INCREMENT, PRIMARY KEY (ID));

CREATE TABLE IF NOT EXISTS T4 (NAME VARCHAR(100) NOT NULL, T3_ID INT NOT NULL, PRIMARY KEY (NAME), CONSTRAINT ID_FK_4 FOREIGN KEY (T3_ID) REFERENCES T3 (ID) ON UPDATE NO ACTION ON DELETE NO ACTION);

INSERT INTO T3 VALUES();

mysql> SELECT * FROM T3;
+----+
| ID |
+----+
|  4 |
+----+
1 row in set (0.00 sec)

Create a script like the following:

#!/usr/bin/env bash

for value in {1..200}
    do
    name=test$(date +%s)$RANDOM
    mysql -u root -ppassword -D REPLICATIONTEST -e "INSERT INTO \`T4\` VALUES ('$name',4);"
    done

echo All done

Run script twice in parallel (or three times in parallel to increase probability of reproducing). Some iterations should result in the following error:
ERROR 3101 (40000) at line 1: Plugin instructed the server to rollback the current transaction.
[6 Sep 2019 15:48] MySQL Verification Team
Hi,

Thanks for report and the test case, I verified it in first try (running the 4 scripts concurrently).