Bug #102586 Foreign Key ON DELETE CASCADE breaks with RBR and multiple-table DELETE.
Submitted: 12 Feb 2021 16:35 Modified: 12 Oct 2023 16:30
Reporter: Jean-François Gagné Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:8.1.0, 8.0.34, 8.0.23, 5.7.33 OS:Any
Assigned to: CPU Architecture:Any

[12 Feb 2021 16:35] Jean-François Gagné
Description:
Hi,

when doing a multiple-table DELETE that is anticipating a foreign key ON DELETE CASCADE, the statements works on the primary but it breaks replication.  See How to repeat for details.

I traced this back to the way a multiple-table DELETE is written in the binary logs.  See How to repeat for details.

How to repeat is for 8.0.23, but 5.7.33 also breaks with the same steps.  I do not know if the t2 delete should not be in the binary logs, or if the replication applier should not try executing the t2 delete_row.

Many thanks for looking into this,

Jean-François Gagné

How to repeat:
# Use dbdeployer to create a test environment.
dbdeployer deploy replication mysql_8.0.23

# Initialize everything, with table t2 having a FK ON DELETE CASCADE on t.
./m <<< "create database test_jfg"
./m test_jfg <<< "
  create table t(id bigint not null auto_increment primary key, v int);
  create table t2(id bigint not null primary key, v2 int, CONSTRAINT FOREIGN KEY (id) REFERENCES t (id) ON DELETE CASCADE);
  insert into t(v) values(1); insert into test_jfg.t2 values(LAST_INSERT_ID() , 2);
  insert into t(v) values(1); insert into test_jfg.t2 values(LAST_INSERT_ID() , 2);
  insert into t(v) values(1); insert into test_jfg.t2 values(LAST_INSERT_ID() , 2);
  select * from t; select * from t2;
"
id      v
1       1
2       1
3       1
id      v2
1       2
2       2
3       2

# A standard delete works well.
./m test_jfg <<< "delete from t where id = 1; select * from t; select * from t2"
id      v
2       1
3       1
id      v2
2       2
3       2

./s1 test_jfg <<< "select * from t; select * from t2"
id      v
2       1
3       1
id      v2
2       2
3       2

# But a multi-table breaks replication after succeeding on the primary.
./m test_jfg <<< "delete t, t2 from t left join t2 on t.id = t2.id where t.id = 2; select * from t; select * from t2"
id      v
3       1
id      v2
3       2

./s1 test_jfg <<< "select * from t; select * from t2; show slave status\G"
id      v
2       1
3       1
id      v2
2       2
3       2
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: rsandbox
                  Master_Port: 21324
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 11488
               Relay_Log_File: mysql-relay.000002
                Relay_Log_Pos: 11311
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1032
                   Last_Error: Could not execute Delete_rows event on table test_jfg.t2; Can't find record in 't2', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000001, end_log_pos 11457
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 11096
              Relay_Log_Space: 11908
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1032
               Last_SQL_Error: Could not execute Delete_rows event on table test_jfg.t2; Can't find record in 't2', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000001, end_log_pos 11457
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 100
                  Master_UUID: 00021324-1111-1111-1111-111111111111
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 210212 16:23:52
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
       Master_public_key_path:
        Get_master_public_key: 1
            Network_Namespace:

# I suspect the cause being the multi-table delete written in the binary logs in a way that the replica executing things not working for the On DELETE CASCADE.
~/opt/mysql/mysql_8.0.23/bin/mysqlbinlog -vv master/data/mysql-bin.000001
[...]
[This is the 1st delete.]
# at 10965
#210212 16:23:08 server id 100  end_log_pos 11017 CRC32 0x26d1fa60      Table_map: `test_jfg`.`t` mapped to number 109
# at 11017
#210212 16:23:08 server id 100  end_log_pos 11065 CRC32 0x176e23dd      Delete_rows: table id 109 flags: STMT_END_F

BINLOG '
bKsmYBNkAAAANAAAAAkrAAAAAG0AAAAAAAMACHRlc3RfamZnAAF0AAIIAwACAQEAYPrRJg==
bKsmYCBkAAAAMAAAADkrAAAAAG0AAAAAAAEAAgAC/wABAAAAAAAAAAEAAADdI24X
'/*!*/;
### DELETE FROM `test_jfg`.`t`
### WHERE
###   @1=1 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=1 /* INT meta=0 nullable=1 is_null=0 */
[...]
[This is the 2nd delete.]
# at 11256
#210212 16:23:52 server id 100  end_log_pos 11308 CRC32 0xf4ea7473      Table_map: `test_jfg`.`t` mapped to number 109
# at 11308
#210212 16:23:52 server id 100  end_log_pos 11361 CRC32 0x734b96d6      Table_map: `test_jfg`.`t2` mapped to number 111
# at 11361
#210212 16:23:52 server id 100  end_log_pos 11409 CRC32 0x0ac2c69f      Delete_rows: table id 109
# at 11409
#210212 16:23:52 server id 100  end_log_pos 11457 CRC32 0xbe957aa6      Delete_rows: table id 111 flags: STMT_END_F

BINLOG '
mKsmYBNkAAAANAAAACwsAAAAAG0AAAAAAAMACHRlc3RfamZnAAF0AAIIAwACAQEAc3Tq9A==
mKsmYBNkAAAANQAAAGEsAAAAAG8AAAAAAAMACHRlc3RfamZnAAJ0MgACCAMAAgEBANaWS3M=
mKsmYCBkAAAAMAAAAJEsAAAAAG0AAAAAAAAAAgAC/wACAAAAAAAAAAEAAACfxsIK
mKsmYCBkAAAAMAAAAMEsAAAAAG8AAAAAAAEAAgAC/wACAAAAAAAAAAIAAACmepW+
'/*!*/;
### DELETE FROM `test_jfg`.`t`
### WHERE
###   @1=2 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=1 /* INT meta=0 nullable=1 is_null=0 */
### DELETE FROM `test_jfg`.`t2`
### WHERE
###   @1=2 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=2 /* INT meta=0 nullable=1 is_null=0 */
[...]
[13 Feb 2021 6:59] MySQL Verification Team
Hello Jean-François,

Thank you for the report and test case.
Verified as described with 8.0.23 build.

regards,
Umesh
[12 Oct 2023 16:30] Jean-François Gagné
Adding versions 8.0.34 and 8.1.0 (was 8.0.23, 5.7.33): I just tested and they are also affected.

Related: Bug#80821 (this bug might even be a duplicate of that other bug).