Bug #72635 | data inconsistencies when master has truncated binary log with GTID after crash | ||
---|---|---|---|
Submitted: | 13 May 2014 18:24 | Modified: | 8 Dec 2014 15:34 |
Reporter: | Santosh Praneeth Banda | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 5.6.16, 5.6.17 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
[13 May 2014 18:24]
Santosh Praneeth Banda
[13 May 2014 18:24]
Santosh Praneeth Banda
Updating severity level
[19 May 2014 9:20]
MySQL Verification Team
Hello Santosh, Thank you for the bug report. Verified as described. Thanks, Umesh
[19 May 2014 9:24]
MySQL Verification Team
// Master/slave setup with MySQL version 5.6.17

With GTID enabled: no error is reported (the slave stays up and even syncs new data), but data inconsistencies are observed (the events that were truncated during the crash are lost).

Without GTID enabled: the slave's I/O thread stops with:

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from position > file size; the first event 'master-bin.000003' at 3382, the last event read from './master-bin.000003' at 4, the last byte read from './master-bin.000003' at 4.'
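The difference between the two modes can be illustrated with a small model. This is only a sketch under simplifying assumptions, not the server's code: `check_position_request` and `gtids_to_send` are hypothetical helpers, transactions are modelled as plain integers for a single master UUID, and the binary log size of 120 bytes is made up (only the requested position 3382 comes from the error message above).

```python
def check_position_request(binlog_size: int, requested_pos: int) -> None:
    """File/position replication: an offset past the end of the file is rejected."""
    if requested_pos > binlog_size:
        raise ValueError("requested position > file size")


def gtids_to_send(master_txns: set, slave_txns: set) -> set:
    """GTID auto-positioning, simplified: send only what the slave does not have yet.

    Nothing here notices that the slave may hold transactions the master lost.
    """
    return master_txns - slave_txns


if __name__ == "__main__":
    # Hypothetical scenario: the slave received transactions 1..10, then the
    # master crashed and its binary log was truncated back to 1..7.
    master_after_crash = set(range(1, 8))
    slave = set(range(1, 11))

    # GTID mode: the difference is empty, so replication continues silently.
    print("GTID mode sends:", sorted(gtids_to_send(master_after_crash, slave)))

    # Position mode: the stale offset is detected immediately.
    try:
        check_position_request(binlog_size=120, requested_pos=3382)
    except ValueError as err:
        print("Position mode stops with:", err)
```

In this model the position-based check fails loudly because the slave's offset no longer exists in the truncated file, while the GTID-based computation simply finds nothing to send, which matches the silent behaviour observed above.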
[8 Dec 2014 15:34]
David Moss
Thanks for your feedback. The following was added to the 5.6.23 and 5.7.6 changelog with commit 4747:

In normal usage, it is not possible for a slave to have more GTIDs than the master. But in certain situations, such as after a hardware failure or incorrectly cleared gtid_purged, the master's binary log could be truncated. This fix ensures that in such a situation, the master now detects that the slave has transactions with GTIDs which are not on the master. An error is now generated on the slave and the I/O thread is stopped with an error. The master's dump thread is also stopped. This prevents data inconsistencies during replication.
[12 Feb 2015 12:47]
Laurynas Biveinis
$ git show -s 6e6add6
commit 6e6add6bb5649b6f75579c86f5a4a51e95c54fb6
Author: Venkatesh Duggirala <venkatesh.duggirala@oracle.com>
Date:   Tue Nov 18 09:54:31 2014 +0530

    Bug #18789758 DATA INCONSISTENCIES WHEN MASTER HAS TRUNCATED BINARY LOG WITH GTID AFTER CRASH

    Problem: Master's dump thread is not detecting the case where Slave's gtid executed set is having more gtids than Master's gtid executed set with respective to Master's UUID.

    Analysis & Fix: In normal scenarios, it is not possible that Slave will contain more gtids than Master with respective to Master's UUID. But it could be possible case if Master's binary log is truncated(due to raid failure) or Master's binary log is deleted but GTID_PURGED was not set properly. That scenario needs to be validated, i.e., it should *always* be the case that Slave's gtid executed set (+retrieved set) is a subset of Master's gtid executed set with respective to Master's UUID. If it happens, Master's dump thread will be stopped and this situation will be informed to Slave during the handshake (thus. slave I/O thread also be stopped with an error (ER_MASTER_FATAL_ERROR_READING_BINLOG). Otherwise, it can lead to data inconsistency between Master and Slave.
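The condition described in the commit message (the slave's GTID set, restricted to the master's UUID, must be a subset of the master's executed set) can be sketched outside the server as follows. This is an illustrative Python model, not the code from the commit: `parse_gtid_set` and `slave_is_ahead_of_master` are hypothetical names, the UUIDs are placeholders, and intervals are expanded into plain integer sets for readability rather than efficiency.

```python
def parse_gtid_set(text: str) -> dict:
    """Parse 'uuid:1-5:7,uuid2:1-3' into {uuid: {transaction numbers}}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        txns = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            last = int(end) if end else int(start)
            txns.update(range(int(start), last + 1))
    return result


def slave_is_ahead_of_master(master_executed: str, slave_gtids: str,
                             master_uuid: str) -> bool:
    """True if the slave holds GTIDs for master_uuid that the master lacks."""
    key = master_uuid.lower()
    master = parse_gtid_set(master_executed).get(key, set())
    slave = parse_gtid_set(slave_gtids).get(key, set())
    return not slave <= master  # the subset test is the whole check


if __name__ == "__main__":
    uuid = "3e11fa47-71ca-11e1-9e33-c80aa9429562"  # placeholder master UUID
    # Master's binary log truncated to 1-7 while the slave already has 1-10.
    print(slave_is_ahead_of_master(f"{uuid}:1-7", f"{uuid}:1-10", uuid))
```

A comparable manual check is possible in SQL with the built-in `GTID_SUBSET(set1, set2)` function, which returns true when every GTID in `set1` is also in `set2`; unlike the sketch above, it compares the full sets rather than only the master's UUID.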