Bug #77684 DROP TABLE IF EXISTS may brake replication if slave has replication filters
Submitted: 10 Jul 2015 23:59 Modified: 29 Feb 2016 12:19
Reporter: Fernando Laudares Camargos Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.6.25, 5.6.27 OS:Linux
Assigned to: CPU Architecture:Any

[10 Jul 2015 23:59] Fernando Laudares Camargos
Description:
If a replica has replication filters on a given database and DROP TABLE IF EXISTS is issued on a table that holds a key being used as Foreign Key by another table the statement will fail as expected with error:

ERROR 1217 (23000): Cannot delete or update a parent row: a foreign key constraint fails

however this will break replication on the replica with:

Last_SQL_Error: Query caused different errors on master and slave.     Error on master: message (format)='Cannot delete or update a parent row: a foreign key constraint fails' error code=1217 ; Error on slave: actual message='no error', error code=0. Default database: 'db1'. Query: 'DROP TABLE IF EXISTS `table1` /* generated by server */'

How to repeat:
Setup master with:

binlog_format=STATEMENT
or
binlog_format=ROW

Setup replica with:

replicate-ignore-db = db1
replicate-wild-ignore-table = db1.%

Then run on the master the following statements:

CREATE DATABASE `db1`;

USE `db1`;

CREATE TABLE `table1` (`ID` bigint(20) primary key) ENGINE=InnoDB;

CREATE TABLE `table2` ( `ID` bigint(20) NOT NULL AUTO_INCREMENT, `DIVISION_ID` bigint(20) DEFAULT NULL, PRIMARY KEY (`ID`), KEY `FK_TABLE1_DIVISION_1` (`DIVISION_ID`), CONSTRAINT `FK_TABLE1_DIVISION_1` FOREIGN KEY (`DIVISION_ID`) REFERENCES `table1` (`ID`) ON DELETE CASCADE ) ENGINE=InnoDB;

DROP TABLE IF EXISTS `db1`.`table1`;

Suggested fix:
The problem seems to be related to the "USE" above as the following works as expected:

CREATE DATABASE `db1`;
CREATE TABLE `db1`.`table1` (`ID` bigint(20) primary key) ENGINE=InnoDB;
CREATE TABLE `db1`.`table2` ( `ID` bigint(20) NOT NULL AUTO_INCREMENT, `DIVISION_ID` bigint(20) DEFAULT NULL, PRIMARY KEY (`ID`), KEY `FK_TABLE1_DIVISION_1` (`DIVISION_ID`), CONSTRAINT `FK_TABLE1_DIVISION_1` FOREIGN KEY (`DIVISION_ID`) REFERENCES `db1`.`table1` (`ID`) ON DELETE CASCADE ) ENGINE=InnoDB;
DROP TABLE IF EXISTS `db1`.`table1`;

however if you add an USE `db1` after the CREATE DATABASE statement the replication error will follow.
[13 Jul 2015 13:08] MySQL Verification Team
Hello Fernando,

Thank you for the report.

Thanks,
Umesh
[13 Jul 2015 13:10] MySQL Verification Team
mtr test case and results - 5.6.27, 5.7.8

Attachment: 77684_5.6.27.results (application/octet-stream, text), 45.62 KiB.

[13 Jul 2015 13:11] MySQL Verification Team
// Imho 2 issues here 9a) Failed query logged to binary log and (b) replicate** option ignored on slave despite row based repl
[13 Jul 2015 13:25] MySQL Verification Team
related - http://bugs.mysql.com/bug.php?id=76493
[31 Jul 2015 18:19] hussain patel
Hello Oracle

Do we have any update on this bug? Do we have any ETA on this?

Thanks
Hussain
[29 Feb 2016 12:19] David Moss
Thanks for your feedback. This has been fixed in upcoming versions and the following was added to the 5.6.30 and 5.7.12 change logs:
If a query on a master generated an error and partial results were written to the binary log, for example due to a DROP TABLE IF EXISTS statement applying to multiple tables that would break foreign key constraints, when a slave configured with replication filters encountered the query it could be incorrectly binary logged. This caused errors such as:

Last_SQL_Error: Query caused different errors on master and slave. Error on master: message (format)='Cannot delete or update a parent row: a foreign key constraint fails' error code=1217 ; Error on slave: actual message='no error', error code=0. Default database: 'db1'. Query: 'DROP TABLE IF EXISTS `table1` /* generated by server */'

There were two fixes required for this bug.

If a DROP TABLE statement used to drop a single table fails, to avoid partial results causing this bug the query is not written to the binary log. If a DROP TABLE statement used to drop a list of tables fails, if it generates partial results they are written to the binary log with an error.

When a query that generates an error as expected was received by a slave but it was skipped due to replication filters, the slave was incorrectly checking the error. The fix for Bug #76493 ensures that this comparison of the expected error from the master with the actual error from the slave does not happen.
[2 Mar 2016 11:29] David Moss
Posted by developer:
 
Thanks for your feedback. This has been fixed in upcoming versions and the following was added to the 5.6.30 and 5.7.12 change logs:
If a query on a master generated an error and partial results were written to the binary log, for example due to a DROP TABLE IF EXISTS statement applying to multiple tables that would break foreign key constraints, when a slave configured with replication filters encountered the query it could be incorrectly binary logged. This caused errors such as:

Last_SQL_Error: Query caused different errors on master and slave. Error on master: message (format)='Cannot delete or update a parent row: a foreign key constraint fails' error code=1217 ; Error on slave: actual message='no error', error code=0. Default database: 'db1'. Query: 'DROP TABLE IF EXISTS `table1` /* generated by server */'

There were two fixes required for this bug.

If a DROP TABLE statement used to drop a single table fails, to avoid partial results causing this bug the query is not written to the binary log. If a DROP TABLE statement used to drop a list of tables fails, if it generates partial results they are written to the binary log with an error.

When a query that generates an error as expected was received by a slave but it was skipped due to replication filters, the slave was incorrectly checking the error. The fix for Bug #76493 ensures that this comparison of the expected error from the master with the actual error from the slave does not happen.