Bug #80102 | Message in log after MTS crash misleading. | ||
---|---|---|---|
Submitted: | 21 Jan 2016 20:08 | Modified: | 6 Jun 2016 7:29 |
Reporter: | Jean-François Gagné | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
Version: | 5.6.28, 5.7.10, 5.7.12 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
[21 Jan 2016 20:08]
Jean-François Gagné
[8 Feb 2016 14:11]
Andrei Elkin
If an MTS session leaves any gaps, the requirement to fill them, as stated in the error message, can indeed be infeasible, as this bug describes. Yet --relay-log-recovery=1, being a binlog-position-based recovery, could be extended to work with MTS recovery. It only needs a recovery submode that makes the slave resume reading from the master (via the IO thread) from the so-called low-water-mark execution position, which the slave applier (the Coordinator thread plus the Workers) keeps recorded. MTS recovery itself would need only minor changes so that it ignores the pre-crash relay-log coordinates altogether when computing gaps.
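To make the low-water-mark idea concrete, here is a minimal sketch in Python. It is a toy model, not MySQL code: transactions are represented by hypothetical consecutive sequence numbers starting at 1 (standing in for real binlog positions), and `low_water_mark` is a name invented for this illustration. The low-water mark is taken to be the last position up to which every transaction has been applied with no gap.

```python
def low_water_mark(applied):
    """Return the highest N such that transactions 1..N were all applied.

    `applied` is an iterable of hypothetical sequence numbers standing in
    for the positions that the Workers committed before the crash.
    """
    done = set(applied)
    lwm = 0
    # Advance while the next transaction in order has been applied.
    while lwm + 1 in done:
        lwm += 1
    return lwm


# Workers committed out of order before the crash: 4, 7 and 8 are missing.
applied_before_crash = [1, 2, 3, 5, 6, 9]
print(low_water_mark(applied_before_crash))  # 3
```

In this model, a recovery submode that resumes from the low-water mark would re-fetch everything after position 3 from the master, so the pre-crash relay-log coordinates would not be needed.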
[19 May 2016 7:32]
MySQL Verification Team
Hello Jean, Thank you for the report and feedback! Thanks, Umesh
[6 Jun 2016 7:29]
Sujatha Sivakumar
The solution for the above issue has been implemented as part of the fix for https://bugs.mysql.com/bug.php?id=77496. The fix is available in MySQL versions 5.6.31 and 5.7.13.

If a multi-threaded replication slave running with relay_log_recovery=1 stopped unexpectedly, during restart the relay log recovery process could fail. This was due to transaction inconsistencies not being filled, see Handling an Unexpected Halt of a Replication Slave. Prior to this fix, recovering from this situation required manually setting relay_log_recovery=0, starting the slave with START SLAVE UNTIL SQL_AFTER_MTS_GAPS to fix any transaction inconsistencies, and then restarting the slave with relay_log_recovery=1. This process has now been automated, so relay log recovery of a multi-threaded slave happens automatically upon restart.

The above-mentioned error message has now been removed. Hence closing this bug as fixed.
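The gap-filling step of the manual procedure can be pictured with a small Python sketch. This is only a conceptual model under stated assumptions, not how the server implements START SLAVE UNTIL SQL_AFTER_MTS_GAPS: transactions are hypothetical consecutive sequence numbers, and `gaps_to_fill` and `fill_gaps` are names invented for this illustration.

```python
def gaps_to_fill(applied):
    """List the transactions missing below the highest applied one, in order."""
    done = set(applied)
    if not done:
        return []
    return [n for n in range(1, max(done) + 1) if n not in done]


def fill_gaps(applied):
    """Apply only the missing transactions, then stop (toy model)."""
    done = set(applied)
    for n in gaps_to_fill(applied):
        done.add(n)  # stands in for executing the missing transaction
    return sorted(done)


applied_before_crash = [1, 2, 3, 5, 6, 9]
print(gaps_to_fill(applied_before_crash))  # [4, 7, 8]
print(fill_gaps(applied_before_crash))     # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Once no gaps remain, the applied set is contiguous, which is the state the procedure needs before relay_log_recovery=1 is enabled again.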