Description:
Hi,
when doing a multiple-table DELETE that is anticipating a foreign key ON DELETE CASCADE, the statements works on the primary but it breaks replication. See How to repeat for details.
I traced this back to the way a multiple-table DELETE is written in the binary logs. See How to repeat for details.
How to repeat is for 8.0.23, but 5.7.33 also breaks with the same steps. I do not know if the t2 delete should not be in the binary logs, or if the replication applier should not try executing the t2 delete_row.
Many thanks for looking into this,
Jean-François Gagné
How to repeat:
# Use dbdeployer to create a test environment.
dbdeployer deploy replication mysql_8.0.23
# Initialize everything, with table t2 having a FK ON DELETE CASCADE on t.
./m <<< "create database test_jfg"
./m test_jfg <<< "
create table t(id bigint not null auto_increment primary key, v int);
create table t2(id bigint not null primary key, v2 int, CONSTRAINT FOREIGN KEY (id) REFERENCES t (id) ON DELETE CASCADE);
insert into t(v) values(1); insert into test_jfg.t2 values(LAST_INSERT_ID() , 2);
insert into t(v) values(1); insert into test_jfg.t2 values(LAST_INSERT_ID() , 2);
insert into t(v) values(1); insert into test_jfg.t2 values(LAST_INSERT_ID() , 2);
select * from t; select * from t2;
"
id v
1 1
2 1
3 1
id v2
1 2
2 2
3 2
# A standard delete works well.
./m test_jfg <<< "delete from t where id = 1; select * from t; select * from t2"
id v
2 1
3 1
id v2
2 2
3 2
./s1 test_jfg <<< "select * from t; select * from t2"
id v
2 1
3 1
id v2
2 2
3 2
# But a multi-table breaks replication after succeeding on the primary.
./m test_jfg <<< "delete t, t2 from t left join t2 on t.id = t2.id where t.id = 2; select * from t; select * from t2"
id v
3 1
id v2
3 2
./s1 test_jfg <<< "select * from t; select * from t2; show slave status\G"
id v
2 1
3 1
id v2
2 2
3 2
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: rsandbox
Master_Port: 21324
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 11488
Relay_Log_File: mysql-relay.000002
Relay_Log_Pos: 11311
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table test_jfg.t2; Can't find record in 't2', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000001, end_log_pos 11457
Skip_Counter: 0
Exec_Master_Log_Pos: 11096
Relay_Log_Space: 11908
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table test_jfg.t2; Can't find record in 't2', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000001, end_log_pos 11457
Replicate_Ignore_Server_Ids:
Master_Server_Id: 100
Master_UUID: 00021324-1111-1111-1111-111111111111
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 210212 16:23:52
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 1
Network_Namespace:
# I suspect the cause being the multi-table delete written in the binary logs in a way that the replica executing things not working for the On DELETE CASCADE.
~/opt/mysql/mysql_8.0.23/bin/mysqlbinlog -vv master/data/mysql-bin.000001
[...]
[This is the 1st delete.]
# at 10965
#210212 16:23:08 server id 100 end_log_pos 11017 CRC32 0x26d1fa60 Table_map: `test_jfg`.`t` mapped to number 109
# at 11017
#210212 16:23:08 server id 100 end_log_pos 11065 CRC32 0x176e23dd Delete_rows: table id 109 flags: STMT_END_F
BINLOG '
bKsmYBNkAAAANAAAAAkrAAAAAG0AAAAAAAMACHRlc3RfamZnAAF0AAIIAwACAQEAYPrRJg==
bKsmYCBkAAAAMAAAADkrAAAAAG0AAAAAAAEAAgAC/wABAAAAAAAAAAEAAADdI24X
'/*!*/;
### DELETE FROM `test_jfg`.`t`
### WHERE
### @1=1 /* LONGINT meta=0 nullable=0 is_null=0 */
### @2=1 /* INT meta=0 nullable=1 is_null=0 */
[...]
[This is the 2nd delete.]
# at 11256
#210212 16:23:52 server id 100 end_log_pos 11308 CRC32 0xf4ea7473 Table_map: `test_jfg`.`t` mapped to number 109
# at 11308
#210212 16:23:52 server id 100 end_log_pos 11361 CRC32 0x734b96d6 Table_map: `test_jfg`.`t2` mapped to number 111
# at 11361
#210212 16:23:52 server id 100 end_log_pos 11409 CRC32 0x0ac2c69f Delete_rows: table id 109
# at 11409
#210212 16:23:52 server id 100 end_log_pos 11457 CRC32 0xbe957aa6 Delete_rows: table id 111 flags: STMT_END_F
BINLOG '
mKsmYBNkAAAANAAAACwsAAAAAG0AAAAAAAMACHRlc3RfamZnAAF0AAIIAwACAQEAc3Tq9A==
mKsmYBNkAAAANQAAAGEsAAAAAG8AAAAAAAMACHRlc3RfamZnAAJ0MgACCAMAAgEBANaWS3M=
mKsmYCBkAAAAMAAAAJEsAAAAAG0AAAAAAAAAAgAC/wACAAAAAAAAAAEAAACfxsIK
mKsmYCBkAAAAMAAAAMEsAAAAAG8AAAAAAAEAAgAC/wACAAAAAAAAAAIAAACmepW+
'/*!*/;
### DELETE FROM `test_jfg`.`t`
### WHERE
### @1=2 /* LONGINT meta=0 nullable=0 is_null=0 */
### @2=1 /* INT meta=0 nullable=1 is_null=0 */
### DELETE FROM `test_jfg`.`t2`
### WHERE
### @1=2 /* LONGINT meta=0 nullable=0 is_null=0 */
### @2=2 /* INT meta=0 nullable=1 is_null=0 */
[...]
Description: Hi, when doing a multiple-table DELETE that is anticipating a foreign key ON DELETE CASCADE, the statements works on the primary but it breaks replication. See How to repeat for details. I traced this back to the way a multiple-table DELETE is written in the binary logs. See How to repeat for details. How to repeat is for 8.0.23, but 5.7.33 also breaks with the same steps. I do not know if the t2 delete should not be in the binary logs, or if the replication applier should not try executing the t2 delete_row. Many thanks for looking into this, Jean-François Gagné How to repeat: # Use dbdeployer to create a test environment. dbdeployer deploy replication mysql_8.0.23 # Initialize everything, with table t2 having a FK ON DELETE CASCADE on t. ./m <<< "create database test_jfg" ./m test_jfg <<< " create table t(id bigint not null auto_increment primary key, v int); create table t2(id bigint not null primary key, v2 int, CONSTRAINT FOREIGN KEY (id) REFERENCES t (id) ON DELETE CASCADE); insert into t(v) values(1); insert into test_jfg.t2 values(LAST_INSERT_ID() , 2); insert into t(v) values(1); insert into test_jfg.t2 values(LAST_INSERT_ID() , 2); insert into t(v) values(1); insert into test_jfg.t2 values(LAST_INSERT_ID() , 2); select * from t; select * from t2; " id v 1 1 2 1 3 1 id v2 1 2 2 2 3 2 # A standard delete works well. ./m test_jfg <<< "delete from t where id = 1; select * from t; select * from t2" id v 2 1 3 1 id v2 2 2 3 2 ./s1 test_jfg <<< "select * from t; select * from t2" id v 2 1 3 1 id v2 2 2 3 2 # But a multi-table breaks replication after succeeding on the primary. ./m test_jfg <<< "delete t, t2 from t left join t2 on t.id = t2.id where t.id = 2; select * from t; select * from t2" id v 3 1 id v2 3 2 ./s1 test_jfg <<< "select * from t; select * from t2; show slave status\G" id v 2 1 3 1 id v2 2 2 3 2 *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 127.0.0.1 Master_User: rsandbox Master_Port: 21324 Connect_Retry: 60 Master_Log_File: mysql-bin.000001 Read_Master_Log_Pos: 11488 Relay_Log_File: mysql-relay.000002 Relay_Log_Pos: 11311 Relay_Master_Log_File: mysql-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1032 Last_Error: Could not execute Delete_rows event on table test_jfg.t2; Can't find record in 't2', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000001, end_log_pos 11457 Skip_Counter: 0 Exec_Master_Log_Pos: 11096 Relay_Log_Space: 11908 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1032 Last_SQL_Error: Could not execute Delete_rows event on table test_jfg.t2; Can't find record in 't2', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000001, end_log_pos 11457 Replicate_Ignore_Server_Ids: Master_Server_Id: 100 Master_UUID: 00021324-1111-1111-1111-111111111111 Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: 210212 16:23:52 Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: Master_public_key_path: Get_master_public_key: 1 Network_Namespace: # I suspect the cause being the multi-table delete written in the binary logs in a way that the replica executing things not working for the On DELETE CASCADE. ~/opt/mysql/mysql_8.0.23/bin/mysqlbinlog -vv master/data/mysql-bin.000001 [...] [This is the 1st delete.] # at 10965 #210212 16:23:08 server id 100 end_log_pos 11017 CRC32 0x26d1fa60 Table_map: `test_jfg`.`t` mapped to number 109 # at 11017 #210212 16:23:08 server id 100 end_log_pos 11065 CRC32 0x176e23dd Delete_rows: table id 109 flags: STMT_END_F BINLOG ' bKsmYBNkAAAANAAAAAkrAAAAAG0AAAAAAAMACHRlc3RfamZnAAF0AAIIAwACAQEAYPrRJg== bKsmYCBkAAAAMAAAADkrAAAAAG0AAAAAAAEAAgAC/wABAAAAAAAAAAEAAADdI24X '/*!*/; ### DELETE FROM `test_jfg`.`t` ### WHERE ### @1=1 /* LONGINT meta=0 nullable=0 is_null=0 */ ### @2=1 /* INT meta=0 nullable=1 is_null=0 */ [...] [This is the 2nd delete.] # at 11256 #210212 16:23:52 server id 100 end_log_pos 11308 CRC32 0xf4ea7473 Table_map: `test_jfg`.`t` mapped to number 109 # at 11308 #210212 16:23:52 server id 100 end_log_pos 11361 CRC32 0x734b96d6 Table_map: `test_jfg`.`t2` mapped to number 111 # at 11361 #210212 16:23:52 server id 100 end_log_pos 11409 CRC32 0x0ac2c69f Delete_rows: table id 109 # at 11409 #210212 16:23:52 server id 100 end_log_pos 11457 CRC32 0xbe957aa6 Delete_rows: table id 111 flags: STMT_END_F BINLOG ' mKsmYBNkAAAANAAAACwsAAAAAG0AAAAAAAMACHRlc3RfamZnAAF0AAIIAwACAQEAc3Tq9A== mKsmYBNkAAAANQAAAGEsAAAAAG8AAAAAAAMACHRlc3RfamZnAAJ0MgACCAMAAgEBANaWS3M= mKsmYCBkAAAAMAAAAJEsAAAAAG0AAAAAAAAAAgAC/wACAAAAAAAAAAEAAACfxsIK mKsmYCBkAAAAMAAAAMEsAAAAAG8AAAAAAAEAAgAC/wACAAAAAAAAAAIAAACmepW+ '/*!*/; ### DELETE FROM `test_jfg`.`t` ### WHERE ### @1=2 /* LONGINT meta=0 nullable=0 is_null=0 */ ### @2=1 /* INT meta=0 nullable=1 is_null=0 */ ### DELETE FROM `test_jfg`.`t2` ### WHERE ### @1=2 /* LONGINT meta=0 nullable=0 is_null=0 */ ### @2=2 /* INT meta=0 nullable=1 is_null=0 */ [...]