Description:
Hi,
while preparing my talk for the MinervaDB Athena 2020 Conference [1], I was reviewing old bugs and the way they were fixed. I should have done that before, sorry for the delay. I am interested in [2] and [3]: Handling an Unexpected Halt of a Replica in 5.7 and 8.0 respectively.
[1]: https://minervadbathena.com/
[2]: https://dev.mysql.com/doc/refman/5.7/en/replication-solutions-unexpected-replica-halt.html
[3]: https://dev.mysql.com/doc/refman/8.0/en/replication-solutions-unexpected-replica-halt.html
Beside that sync_binlog = 1 is missing in [2] for GTID with log-slave-updates and and [3] for up to 8.0.16 (I reported this in Bug#101874), I believe that [2] and [3] are overly conservative in the parameters they. specify for replication crash safety. IMHO, the following are not needed in all cases:
- MASTER_AUTO_POSITION = 1
- sync_relay_log = 1
- innodb_flush_log_at_trx_commit = 1
- and the missing sync_binlog = 1
I would suggest restructuring this document to separate parameters needed all the time and parameters needed only in some situations. The parameters needed all the time are, according to my understanding, the following:
- relay_log_info_repository = TABLE
- relay_log_recovery = ON
And the parameters only needed in some situations are:
- MASTER_AUTO_POSITION = 1
- sync_relay_log = 1
- innodb_flush_log_at_trx_commit = 1
- sync_binlog = 1
I give details in Suggested fix on when I think the 4 parameters above are needed.
Many thanks for looking into this,
Jean-François Gagné
How to repeat:
Does not apply to this documentation bug.
Suggested fix:
As I write in the description, I think the following parameters are always needed for replication crash safety:
- relay_log_info_repository = TABLE
- relay_log_recovery = ON
And that these are only needed in some situations:
- MASTER_AUTO_POSITION = 1
- sync_relay_log = 1
- innodb_flush_log_at_trx_commit = 1
- sync_binlog = 1
Below is my understanding of each.
MASTER_AUTO_POSITION = 1
------------------------
This is needed when gtid_mode = ON and parallel replication is used (slave_parallel_workers > 1) with out-of-order committing. Out-of-order committing can happen either when slave_parallel_type = DATABASE, or when slave_parallel_type = LOGICAL_CLOCK and slave_preserve_commit_order = OFF.
sync_relay_log = 1
------------------
This is needed when gtid_mode = OFF and parallel replication is used (slave_parallel_workers > 1) with out-of-order committing. Out-of-order committing can happen either when slave_parallel_type = DATABASE, or when slave_parallel_type = LOGICAL_CLOCK and slave_preserve_commit_order = OFF.
innodb_flush_log_at_trx_commit = 1 and sync_binlog = 1
------------------------------------------------------
These should be set when gtid_mode = ON and log-slave-updates is ON (except maybe from 8.0.17 because of WL#9211).