Bug #66260 | Massive data loss when changing master_port in a master/master setup | ||
---|---|---|---|
Submitted: | 8 Aug 2012 13:03 | Modified: | 14 May 2013 19:12 |
Reporter: | Matija Grabnar | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 5.5.27 | OS: | Linux (Ubuntu 12.04) |
Assigned to: | CPU Architecture: | Any | |
Tags: | dataloss, innodb, regression, replication |
[8 Aug 2012 13:03]
Matija Grabnar
[8 Aug 2012 18:16]
Sveta Smirnova
Thank you for the report. Verified as described. Problem is not repeatable with version 5.1
[9 Aug 2012 6:34]
MySQL Verification Team
seems like a documented scenario. not that I agree with it.... http://dev.mysql.com/doc/refman/5.5/en/change-master-to.html "If you specify the MASTER_HOST or MASTER_PORT option, the slave assumes that the master server is different from before (even if the option value is the same as its current value.) In this case, the old values for the master binary log file name and position are considered no longer applicable, so if you do not specify MASTER_LOG_FILE and MASTER_LOG_POS in the statement, MASTER_LOG_FILE='' and MASTER_LOG_POS=4 are silently appended to it."
[9 Aug 2012 6:46]
Matija Grabnar
OK, I can in some ways understand why changing the port would reset the master_log_pos and master_log_file values. However, I still don't understand from that why data written to the master would suddenly disappear. We lost a day's worth of data that was committed and written to disk when this happened on our production server. We recovered about 80% of the data from the backup (done with mysqldump) so we know that the data was there - but then it was gone. Note, the data that was deleted was NOT the data that came in through replication, it was exactly the opposite: the data that was written to the server by clients was gone. It's as if when the server (id:10 in the above example) first became a master, a checkpoint was made, and when the change of port on id:20 was made, the id:10 server undid everything it did since it started logging for replication. Even if I switched it to a completely new master, I would not have expected that to happen.
[13 May 2013 6:04]
Sujatha Sivakumar
Analysis: ----------- We are able to see the data loss, but this is not a bug. This behaviour is expected for the following reason. The table `t1` on `master2` is created using a dump file. For `mysqldump` tool "--add-drop-table=On (by default)". And the dump file contains a `drop table t1` followed by `create table t1`. `master2` logs these DDL's in its binary log file as if they are originated from `master2`(self). When the change master is executed on `master1` with new `master2`s port id, `master1` thinks that it is a new server and it will append master_log_file='', and master_log_pos=4 (silently). Now `master1` will start applying current relay log file starting from position 4. Since the above mentioned DDLs are present in the current relay log file and they are originated from master2 (due to the dump file usage), `master1` will start applying these DDLs i.e., Drop and (re)create. Hence we see the data loss i.e., we will see only the data which has come through the dump + any additional changes that happened directly on master2 Suggestions to avoid above problem: 1) While changing the master's port, if user knows that it is the same server, specify the exact `master_log_pos` and `master_log_file`. 2) On `master 1` specify "--skip-add-drop-table" option while taking dump, so that DROP TABLE statement will not be added before each CREATE TABLE statement 3) The lost data can be obtained from `master 2` in circular replication. 4) Do not use dump file to create the initial setup on `master2`, let it happen through normal master-slave replication, which might be a time taking process. 5) If the dump is already taken without "--skip-add-drop-table", set session binlog variable to OFF, before loading the dump on master2.