Description:
Hi,
when doing a multiple-table DELETE that is anticipating a foreign key ON DELETE CASCADE, the statements works on the primary but it breaks replication. See How to repeat for details.
I traced this back to the way a multiple-table DELETE is written in the binary logs. See How to repeat for details.
How to repeat is for 8.0.23, but 5.7.33 also breaks with the same steps. I do not know if the t2 delete should not be in the binary logs, or if the replication applier should not try executing the t2 delete_row.
Many thanks for looking into this,
Jean-François Gagné
How to repeat:
# Use dbdeployer to create a test environment.
dbdeployer deploy replication mysql_8.0.23
# Initialize everything, with table t2 having a FK ON DELETE CASCADE on t.
./m <<< "create database test_jfg"
./m test_jfg <<< "
create table t(id bigint not null auto_increment primary key, v int);
create table t2(id bigint not null primary key, v2 int, CONSTRAINT FOREIGN KEY (id) REFERENCES t (id) ON DELETE CASCADE);
insert into t(v) values(1); insert into test_jfg.t2 values(LAST_INSERT_ID() , 2);
insert into t(v) values(1); insert into test_jfg.t2 values(LAST_INSERT_ID() , 2);
insert into t(v) values(1); insert into test_jfg.t2 values(LAST_INSERT_ID() , 2);
select * from t; select * from t2;
"
id v
1 1
2 1
3 1
id v2
1 2
2 2
3 2
# A standard delete works well.
./m test_jfg <<< "delete from t where id = 1; select * from t; select * from t2"
id v
2 1
3 1
id v2
2 2
3 2
./s1 test_jfg <<< "select * from t; select * from t2"
id v
2 1
3 1
id v2
2 2
3 2
# But a multi-table breaks replication after succeeding on the primary.
./m test_jfg <<< "delete t, t2 from t left join t2 on t.id = t2.id where t.id = 2; select * from t; select * from t2"
id v
3 1
id v2
3 2
./s1 test_jfg <<< "select * from t; select * from t2; show slave status\G"
id v
2 1
3 1
id v2
2 2
3 2
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: rsandbox
Master_Port: 21324
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 11488
Relay_Log_File: mysql-relay.000002
Relay_Log_Pos: 11311
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table test_jfg.t2; Can't find record in 't2', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000001, end_log_pos 11457
Skip_Counter: 0
Exec_Master_Log_Pos: 11096
Relay_Log_Space: 11908
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table test_jfg.t2; Can't find record in 't2', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000001, end_log_pos 11457
Replicate_Ignore_Server_Ids:
Master_Server_Id: 100
Master_UUID: 00021324-1111-1111-1111-111111111111
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 210212 16:23:52
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 1
Network_Namespace:
# I suspect the cause being the multi-table delete written in the binary logs in a way that the replica executing things not working for the On DELETE CASCADE.
~/opt/mysql/mysql_8.0.23/bin/mysqlbinlog -vv master/data/mysql-bin.000001
[...]
[This is the 1st delete.]
# at 10965
#210212 16:23:08 server id 100 end_log_pos 11017 CRC32 0x26d1fa60 Table_map: `test_jfg`.`t` mapped to number 109
# at 11017
#210212 16:23:08 server id 100 end_log_pos 11065 CRC32 0x176e23dd Delete_rows: table id 109 flags: STMT_END_F
BINLOG '
bKsmYBNkAAAANAAAAAkrAAAAAG0AAAAAAAMACHRlc3RfamZnAAF0AAIIAwACAQEAYPrRJg==
bKsmYCBkAAAAMAAAADkrAAAAAG0AAAAAAAEAAgAC/wABAAAAAAAAAAEAAADdI24X
'/*!*/;
### DELETE FROM `test_jfg`.`t`
### WHERE
### @1=1 /* LONGINT meta=0 nullable=0 is_null=0 */
### @2=1 /* INT meta=0 nullable=1 is_null=0 */
[...]
[This is the 2nd delete.]
# at 11256
#210212 16:23:52 server id 100 end_log_pos 11308 CRC32 0xf4ea7473 Table_map: `test_jfg`.`t` mapped to number 109
# at 11308
#210212 16:23:52 server id 100 end_log_pos 11361 CRC32 0x734b96d6 Table_map: `test_jfg`.`t2` mapped to number 111
# at 11361
#210212 16:23:52 server id 100 end_log_pos 11409 CRC32 0x0ac2c69f Delete_rows: table id 109
# at 11409
#210212 16:23:52 server id 100 end_log_pos 11457 CRC32 0xbe957aa6 Delete_rows: table id 111 flags: STMT_END_F
BINLOG '
mKsmYBNkAAAANAAAACwsAAAAAG0AAAAAAAMACHRlc3RfamZnAAF0AAIIAwACAQEAc3Tq9A==
mKsmYBNkAAAANQAAAGEsAAAAAG8AAAAAAAMACHRlc3RfamZnAAJ0MgACCAMAAgEBANaWS3M=
mKsmYCBkAAAAMAAAAJEsAAAAAG0AAAAAAAAAAgAC/wACAAAAAAAAAAEAAACfxsIK
mKsmYCBkAAAAMAAAAMEsAAAAAG8AAAAAAAEAAgAC/wACAAAAAAAAAAIAAACmepW+
'/*!*/;
### DELETE FROM `test_jfg`.`t`
### WHERE
### @1=2 /* LONGINT meta=0 nullable=0 is_null=0 */
### @2=1 /* INT meta=0 nullable=1 is_null=0 */
### DELETE FROM `test_jfg`.`t2`
### WHERE
### @1=2 /* LONGINT meta=0 nullable=0 is_null=0 */
### @2=2 /* INT meta=0 nullable=1 is_null=0 */
[...]