Bug #98408 Make replicate-same-server-id behavior flexible and safe by default.
Submitted: 27 Jan 2020 7:42 Modified: 27 Jan 2020 9:10
Reporter: Jean-François Gagné Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S4 (Feature request)
Version:5.6, 5.7, 8.0 OS:Any
Assigned to: CPU Architecture:Any

[27 Jan 2020 7:42] Jean-François Gagné
Description:
Hi,

I am classifying this report as a feature request as I suggest modifying MySQL behavior.  However, it could be argued that this is a serious bug as the current implementation of MySQL can lead a novice user, or even an expert who is "forgetting" about this specific behavior, to corrupt data after restoring a backup.

The default value for replicate-same-server-id is OFF.  Quoted from the manual ([1] and [2]), “Normally, this [setting replicate-same-server-id to ON] is useful only in rare configurations”.  I disagree and both Lefred and I talked about this problematic behavior in [3] and [4].  The problem is with restoring a backup on a node that was a master since the backup was taken, and because of replicate-same-server-id = OFF, transactions will be skipped unless the server-id is changed (and this change can easily be forgotten, leading to data corruption and replication breakage).

[1]: https://dev.mysql.com/doc/refman/5.7/en/replication-options-slave.html#option_mysqld_repli...

[2]: https://dev.mysql.com/doc/refman/8.0/en/replication-options-slave.html#option_mysqld_repli...

[3]: https://lefred.be/content/mysql-gtid-restore-a-master-from-a-replicas-backup/

[4]: https://jfg-mysql.blogspot.com/2020/01/legacy-behavior-corrupting-restored-backups.html

The solution is not as simple as changing the default to ON because this is not possible in MySQL 5.7 with log-slave-updates or without GTID and with log-slave-updates in 8.0 (this setting cannot be combined with log-slave-updates or with GTID disabled in 5.7 and 8.0 respectively).

I would suggest the following for fixing this serious problem (could be implemented in many steps):

- In MySQL 8.0, when GTID is enabled, replicate-same-server-id should be automatically ON,
- Add a new option for replicate-same-server-id in which replication would stop with an error if a SQL Thread sees its own server-id (suggested name “Break”),
- Make this “Break” behavior the default,
- Allow replicate-same-server-id = ON and = Break even with log-slave-updates or without GTID,
- Make replicate-same-server-id dynamic.

This way, the default behavior in MySQL 8.0 with GTID becomes safe (improvement on the status quo).  And eventually, the default setting (“Break”) becomes safe in all cases (maybe even in 5.7 if the fix is back ported there, which IMHO should be done because data corruption is bad), and people deploying cyclic replication can easily change it to OFF (no significant backward compatibility problems, just a replication breakage if this is forgotten).  Restoring a backup as presented in [3] or [4] does not lead to silent data corruption (great improvements on current behavior).  And when replication breaks because of this parameter, it can be investigated (a good error message would also help), and it can be fixed without restarting MySQL (operation flexibility).

Many thanks for looking into that and for making MySQL easier and safer to run,

JFG

How to repeat:
See description.

Suggested fix:
See description.
[27 Jan 2020 9:10] MySQL Verification Team
Hello Jean-François,

Thank you for the feature request!

regards,
Umesh