Bug #92109 Please make replication crash safe with GITD and less durable setting (bis).
Submitted: 21 Aug 2018 17:53 Modified: 7 May 2021 12:40
Reporter: Jean-François Gagné Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.7, 8.0 OS:Any
Assigned to: CPU Architecture:Any

[21 Aug 2018 17:53] Jean-François Gagné
Description:
Hi,

(This is similar to Bug#70659, but the solution suggested here only applies to 5.7 and 8.0.)

In verified Bug#70659, it is indicated that MySQL 5.6 is not replication crash safe with GTID and less durable settings (sync_binlog != 1 and/or and innodb_flush_log_at_trx_commit != 1).  In short (see the bug for reference) and after an OS (Linux) crash, there is a risk that InnoDB and the binary logs are out of sync.  As replication will be restarted according to the GTID state of the slave (in MySQL 5.6, this state is derived from the binary logs), resuming replication risks skipping transactions or running transactions in double.

MySQL 5.7 introduced the mysql.gtid_executed table for allowing running a slave with GTID, without binary logs or without log-slave-updates.  The page [1] from the manual describes how this table is updated and can be summarised to:

1) If binary logging is disabled (log_bin is OFF), or if log_slave_updates is disabled, the server stores the GTID belonging to each transaction together with the transaction in the table. 

2) If binary logging is enabled (log_bin is ON), whenever the binary log is rotated or the server is shut down, the server writes GTIDs for all transactions that were written into the previous binary log into the mysql.gtid_executed table.

[1]: https://dev.mysql.com/doc/refman/5.7/en/replication-gtids-concepts.html#replication-gtids-...

Note that for fixing Bug#75393 (documentation bug), 2) should read "If binary logging is enabled (log_bin is ON) and log_slave_updates is enabled, [...]".

The behavior describe in [1] is storing the GTID state of the slave in a correct and crash persisitent way if binary logs are disabled or log_slave_updates is disabled.  So in this case, replication is crash safe even with GTID and less durable settings (if relay_log_info is in a table and relay_log_recovey is enabled).

It looks like it would be so simple to make replication crash safe with GITD and less durable setting by updating the mysql.gtid_executed table after each transactions, even if binary logging and log_slave_updates are enabled.  After all, something very similar is already done for the mysql.slave_relay_log_info.  So the additional cost would be relatively small.  If this cost wants to be avoided, maybe updating the table only when sync_binlog != 1 and/or innodb_flush_log_at_trx_commit != 1 could be done.  Also, if this cost needs absolutely to be avoided, maybe a configuration option could be added to enable/disable updating the table after each transaction when binary logging and log_slave_updates are enabled.

Note: updating mysql.gtid_executed would not solve the problem that the binary logs might be out of sync with the database.  It would only allow to resume replication in a crash safe way.  As described by Yoshinori in Bug#70659, the binary logs can still be ahead or behind InnoDB and someone setting less durable setting needs to understand that.  Yoshinori's idea to truncate the binary logs if they are ahead is still interesting, but it is out of the scope of this bug report.  Also, Yoshinori's idea to store GTIDs in the InnoDB Redo logs is interesting for avoiding loosing the GTID position of a master after a crash and allowing re-slaving it to a newly promoted master, but it is also out of the scope of this bug report.  I will open other bugs about this and put the link in the comments.

Many thanks for looking into that and allowing us run replication with GTID and less durable settings without worrying about slave crashes,

JFG

How to repeat:
Crash a slave with GTID and with less durable setting.  See Bug#70659 for more details.

Suggested fix:
Update the mysql.gtid_executed table after each transactions to avoid loosing InnoDB GTID state.
[24 Aug 2018 14:17] MySQL Verification Team
Hi,
I'm switching this to "Feature request" as IMO behavior now is not a bug and as for the suggested changes, I'll let replication team decide how to go forward.

Thanks!
Bogdan
[24 Aug 2018 16:29] Jean-François Gagné
Hi Bogdan, thanks for verifying.

Allow myself to try changing your mind: I actually think this is a bug.

Replication not being crash safe is causing data corruption, and data consistency is not something that can be a "feature" of a database.

Also, Bug#70659 is flagged as a bug (not a feature request), so I also think this should be a bug.

So I am setting it back to S2, sorry for fighting you on this.  If you still think it is a feature request and set it back to S4, I will not fight back.  :-)

Cheers, JFG
[24 Aug 2018 17:06] MySQL Verification Team
Hi,
Well I'm torn to be honest. If the 70659 did not exist I'd set this as bug, if those doc one did not exist same thing but with them all classified as bug I'm looking at this one as "potential way to change behavior" (that solves it) rather then final solution.. but in any way the classification is not too important as verifying it makes it go to replication team and then they make decision themselves how to proceed :)

all best
Bogdan
[20 Apr 2020 14:31] Jo Goossens
Just reading all the articles of Jean-François and I think the solution is fairly easy and an important fix.

Any idea about the current status of implementation?
[5 Dec 2020 2:05] Jean-François Gagné
While doing tests for Bug#101874, I tested replication crash safety with GTID, sync_binlog = 0 and innodb_flush_log_at_trx_commit = 2 in a recent MySQL 8.0, and I did not get replication breakage.  I suspect that replication crash safety with GTID and less durable settings was introduced with WL#9211 in MySQL 8.0.17.  Could we get an official statement from Oracle on this subject ?
[6 May 2021 14:45] Joe Grasse
It looks like that is true, based on this section of the manual.

If binary logging is enabled (log_bin is ON), from MySQL 8.0.17 for the InnoDB storage engine only, the server updates the mysql.gtid_executed table in the same way as when binary logging or replica update logging is disabled, storing the GTID for each transaction at transaction commit time. 

https://dev.mysql.com/doc/refman/8.0/en/replication-gtids-concepts.html
[7 May 2021 12:40] Jean-François Gagné
Thanks Joe for sharing your findings.  So it looks like this bug is fixed in MySQL 8.0.17.

To Oracle: would you consider back-porting this fix in MySQL 5.7 ?
[18 May 2021 16:22] Joe Grasse
So would this allow a crash safe replication setup (A -> B -> C) with setting sync_binlog = 0 on A, B, and C on MySQL >= 8.0.17?
[15 Jun 2021 7:40] Prasad N
Clarification on the scenario
[15 Jun 2021 7:48] Prasad N
I wanted a clarification on this scenario and fix.
So when non-durable settings (sync_binlog != 1 and Innofb_flush_log_at_trx_commit ! =1) are used, we know that innodb log file and binlog file can be out of sync and due to this, data divergence can happen during crash recovery.
How does the fix solve the above problem? Is the crash recovery approach changed and will it not be based on the GTID/transaction info info present in the binary logs ?