Description:
Hi,
I am classifying this report as a feature request as I suggest modifying MySQL behavior. However, it could be argued that this is a serious bug as the current implementation of MySQL can lead a novice user, or even an expert who is "forgetting" about this specific behavior, to corrupt data after restoring a backup.
The default value for replicate-same-server-id is OFF. Quoted from the manual ([1] and [2]), “Normally, this [setting replicate-same-server-id to ON] is useful only in rare configurations”. I disagree and both Lefred and I talked about this problematic behavior in [3] and [4]. The problem is with restoring a backup on a node that was a master since the backup was taken, and because of replicate-same-server-id = OFF, transactions will be skipped unless the server-id is changed (and this change can easily be forgotten, leading to data corruption and replication breakage).
[1]: https://dev.mysql.com/doc/refman/5.7/en/replication-options-slave.html#option_mysqld_repli...
[2]: https://dev.mysql.com/doc/refman/8.0/en/replication-options-slave.html#option_mysqld_repli...
[3]: https://lefred.be/content/mysql-gtid-restore-a-master-from-a-replicas-backup/
[4]: https://jfg-mysql.blogspot.com/2020/01/legacy-behavior-corrupting-restored-backups.html
The solution is not as simple as changing the default to ON because this is not possible in MySQL 5.7 with log-slave-updates or without GTID and with log-slave-updates in 8.0 (this setting cannot be combined with log-slave-updates or with GTID disabled in 5.7 and 8.0 respectively).
I would suggest the following for fixing this serious problem (could be implemented in many steps):
- In MySQL 8.0, when GTID is enabled, replicate-same-server-id should be automatically ON,
- Add a new option for replicate-same-server-id in which replication would stop with an error if a SQL Thread sees its own server-id (suggested name “Break”),
- Make this “Break” behavior the default,
- Allow replicate-same-server-id = ON and = Break even with log-slave-updates or without GTID,
- Make replicate-same-server-id dynamic.
This way, the default behavior in MySQL 8.0 with GTID becomes safe (improvement on the status quo). And eventually, the default setting (“Break”) becomes safe in all cases (maybe even in 5.7 if the fix is back ported there, which IMHO should be done because data corruption is bad), and people deploying cyclic replication can easily change it to OFF (no significant backward compatibility problems, just a replication breakage if this is forgotten). Restoring a backup as presented in [3] or [4] does not lead to silent data corruption (great improvements on current behavior). And when replication breaks because of this parameter, it can be investigated (a good error message would also help), and it can be fixed without restarting MySQL (operation flexibility).
Many thanks for looking into that and for making MySQL easier and safer to run,
JFG
How to repeat:
See description.
Suggested fix:
See description.