Bug #84375 | Changing MASTER_DELAY to 0 purges the relay log, causing corruption. | ||
---|---|---|---|
Submitted: | 30 Dec 2016 10:27 | Modified: | 2 Aug 2017 14:26 |
Reporter: | Saverio Miroddi | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Documentation | Severity: | S1 (Critical) |
Version: | 5.7.17 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[30 Dec 2016 10:27]
Saverio Miroddi
[30 Dec 2016 15:33]
MySQL Verification Team
Thank you for the bug report. Please first check with the latest MySQL released binary 5.7.17 and comment here if you get the same issue. Thanks.
[3 Jan 2017 8:01]
Saverio Miroddi
For reference, changed the version to 5.7.17 (Community Edition).
[3 Jan 2017 8:02]
Saverio Miroddi
Proposing the severity to Critical, and in certain setups, this bug will lead to catastrophic results.
[3 Jan 2017 8:05]
Saverio Miroddi
Reproduced on MySQL Community edition 5.7.17. It's very easily reproducible: 1. set a large enough delay 2. create a table on the master 3. make sure the update is replicated to the relay log 4. reset the delay 5. insert a record in the table on the master 6. slave will fail the replication. based on my test, the slave will reset the relay log on 4., which contains the table creation statement. 5. will therefore cause the slave to fail the replication, since it (the slave) won't replay the creation statement, but only the insertion.
[3 Jan 2017 10:54]
MySQL Verification Team
>>based on my test, the slave will reset the relay log on 4., which contains the table creation statement. 5. will therefore cause the slave to fail the replication, since it (the slave) won't replay the creation statement, but only the insertion. Prior to MySQL 5.7.4, CHANGE MASTER TO deletes all relay log files and starts a new one, unless you specify RELAY_LOG_FILE or RELAY_LOG_POS. In that case, relay log files are kept; the relay_log_purge global variable is set silently to 0. In MySQL 5.7.4 and later, relay logs are preserved when neither the slave SQL thread nor the slave I/O thread is stopped; if both threads are stopped, all relay log files are deleted unless you at least one of RELAY_LOG_FILE or RELAY_LOG_POS is specified. In your case, after step 4 - new relay log is generated(i.e slave server creates a new relay log file each time the I/O thread starts). So instead of STOP SLAVE; CHANGE MASTER TO MASTER_DELAY=<value>; START SLAVE; it is enough to do the following with this feature. STOP SLAVE SQL_THREAD; CHANGE MASTER TO MASTER_DELAY=<value>; START SLAVE SQL_THREAD; Please see more details on this here http://mysqlserverteam.com/mysql-5-7-4-change-master-without-stopping-slave-altogether/ For example: -- relay log is rotated WITH STOP SLAVE root@localhost [(none)]> \! ls -l slave/hod03-relay-bin* -rw-r----- 1 umshastr common 177 Jan 3 11:48 slave/hod03-relay-bin.000001 -rw-r----- 1 umshastr common 25 Jan 3 11:48 slave/hod03-relay-bin.index root@localhost [(none)]> root@localhost [(none)]> STOP SLAVE; Query OK, 0 rows affected, 1 warning (0.00 sec) root@localhost [(none)]> CHANGE MASTER TO MASTER_DELAY=900; Query OK, 0 rows affected (0.00 sec) root@localhost [(none)]> START SLAVE; Query OK, 0 rows affected (0.01 sec) root@localhost [(none)]> SHOW SLAVE STATUS\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: localhost Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000001 Read_Master_Log_Pos: 5551 Relay_Log_File: hod03-relay-bin.000002 Relay_Log_Pos: 5766 Relay_Master_Log_File: master-bin.000001 Slave_IO_Running: Yes . root@localhost [(none)]> \! ls -l slave/hod03-relay-bin* -rw-r----- 1 umshastr common 207 Jan 3 11:49 slave/hod03-relay-bin.000001 -rw-r----- 1 umshastr common 5766 Jan 3 11:49 slave/hod03-relay-bin.000002 -rw-r----- 1 umshastr common 50 Jan 3 11:49 slave/hod03-relay-bin.index -- relay log is not rotated WITH STOP SLAVE root@localhost [(none)]> \! ls -l slave/hod03-relay-bin* -rw-r----- 1 umshastr common 207 Jan 3 11:49 slave/hod03-relay-bin.000001 -rw-r----- 1 umshastr common 5766 Jan 3 11:49 slave/hod03-relay-bin.000002 -rw-r----- 1 umshastr common 50 Jan 3 11:49 slave/hod03-relay-bin.index root@localhost [(none)]> STOP SLAVE SQL_THREAD; Query OK, 0 rows affected (0.00 sec) root@localhost [(none)]> CHANGE MASTER TO MASTER_DELAY=0; Query OK, 0 rows affected (0.00 sec) root@localhost [(none)]> START SLAVE SQL_THREAD; Query OK, 0 rows affected (0.00 sec) root@localhost [(none)]> \! ls -l slave/hod03-relay-bin* -rw-r----- 1 umshastr common 207 Jan 3 11:49 slave/hod03-relay-bin.000001 -rw-r----- 1 umshastr common 5766 Jan 3 11:49 slave/hod03-relay-bin.000002 -rw-r----- 1 umshastr common 50 Jan 3 11:49 slave/hod03-relay-bin.index root@localhost [(none)]> Could you please check this at your end? Thanks, Umesh
[3 Jan 2017 11:19]
Saverio Miroddi
OK, this works, thanks - not an issue, then.
[3 Jan 2017 11:21]
Saverio Miroddi
I suggest, for clarity, to add a brief explanation to the "Delayed Replication" chapter (https://dev.mysql.com/doc/refman/5.7/en/replication-delayed.html).
[3 Jan 2017 12:37]
MySQL Verification Team
Thank you for the feedback!
[2 Mar 2017 10:42]
Daniƫl van Eeden
Looks like this is a duplicate of: Bug #84249 CHANGE MASTER TO MASTER_DELAY can lead a slave to skip transactions
[24 Jul 2017 19:18]
Shlomi Noach
Affected by same bug.
[2 Aug 2017 14:26]
Margaret Fisher
Posted by developer: Confirmed with development that the problem behavior is now removed by the fix for Changing master_delay after stop slave results in loss of events https://bugs.mysql.com/bug.php?id=81232 which was implemented in the 5.7.19 and 8.0.2 releases. The fix means that issuing CHANGE MASTER TO does not purge the relay log even if both threads are stopped. Added a "See also" reference to the changelog entry for that bug.