Bug #69811 | Unable to repair replication after Lost_Events incident using GTID | ||
---|---|---|---|
Submitted: | 22 Jul 2013 3:30 | Modified: | 24 Sep 2014 14:39 |
Reporter: | Warren Yates | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Documentation | Severity: | S2 (Serious) |
Version: | 5.6.11-ndb-7.3.2-cluster | OS: | Windows |
Assigned to: | David Moss | CPU Architecture: | Any |
Tags: | GTID, LOST_EVENTS, replication |
[22 Jul 2013 3:30]
Warren Yates
[22 Jul 2013 7:32]
Valeriy Kravchuk
This is too general problem report for now. Had you tried approaches mentioned in http://www.mysqlperformanceblog.com/2013/03/26/repair-mysql-5-6-gtid-replication-by-inject...? Any steps to reproduce your exact problem, output of show slave status\G etc?
[24 Jul 2013 0:06]
Warren Yates
Thanks Valeriy, Thanks for the pointers however as I stated in the original post I have tried to inject an empty transaction, in fact, I've tried injecting a number of them to see if there were multiple LOST_EVENTS causing the problem. Even when the Retrieved caught up with the Exicuted Gtid set the replication was still broken due to a LOST_EVENTS_INCEDENT. Unfortunalty all I can say to repolicate the problem is to produce a LOST_EVENTS_INCEIDENT and then try to repair by injecting empty transactions. Here is my SHOW_SLAVE_STATUS: mysql> show slave status \G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.4.70.201 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: binlog.000006 Read_Master_Log_Pos: 36663990 Relay_Log_File: mysqld-relay-bin.000004 Relay_Log_Pos: 395 Relay_Master_Log_File: binlog.000004 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: acesmerp,acesmerp Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1590 Last_Error: The incident LOST_EVENTS occured on the master. M essage: mysqld startup Skip_Counter: 0 Exec_Master_Log_Pos: 191 Relay_Log_Space: 37947508 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1590 Last_SQL_Error: The incident LOST_EVENTS occured on the master. M essage: mysqld startup Replicate_Ignore_Server_Ids: Master_Server_Id: 101 Master_UUID: 56c22d30-e767-11e2-a5b6-02bf0a0446c8 Master_Info_File: E:\MySQL_Data\master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: 130724 09:37:36 Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: 56c22d30-e767-11e2-a5b6-02bf0a0446c8:1-2402 Executed_Gtid_Set: 56c22d30-e767-11e2-a5b6-02bf0a0446c8:1-1033 Auto_Position: 0 1 row in set (0.00 sec) mysql> SET GTID_NEXT="56c22d30-e767-11e2-a5b6-02bf0a0446c8:1034"; Query OK, 0 rows affected (0.00 sec) mysql> BEGIN; COMMIT; Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) mysql> SET GTID_NEXT="AUTOMATIC"; Query OK, 0 rows affected (0.00 sec) mysql> START SLAVE; Query OK, 0 rows affected (0.00 sec) mysql> SHOW SLAVE STATUS \G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.4.70.201 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: binlog.000006 Read_Master_Log_Pos: 36663990 Relay_Log_File: mysqld-relay-bin.000004 Relay_Log_Pos: 395 Relay_Master_Log_File: binlog.000004 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: acesmerp,acesmerp Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1590 Last_Error: The incident LOST_EVENTS occured on the master. M essage: mysqld startup Skip_Counter: 0 Exec_Master_Log_Pos: 191 Relay_Log_Space: 37947913 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1590 Last_SQL_Error: The incident LOST_EVENTS occured on the master. M essage: mysqld startup Replicate_Ignore_Server_Ids: Master_Server_Id: 101 Master_UUID: 56c22d30-e767-11e2-a5b6-02bf0a0446c8 Master_Info_File: E:\MySQL_Data\master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: 130724 09:56:57 Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: 56c22d30-e767-11e2-a5b6-02bf0a0446c8:1-2402 Executed_Gtid_Set: 56c22d30-e767-11e2-a5b6-02bf0a0446c8:1-1034 Auto_Position: 0 1 row in set (0.00 sec)
[19 Mar 2014 1:54]
andrew lorien
This happened for me on a multi-master 5.6.13 Solaris server. After restoring the broken master from a backup, the slaves were still unable to get past the error. I had to restore them as well. You can create a LOST_EVENT incident by following the steps here : http://bugs.mysql.com/bug.php?id=68892
[8 Apr 2014 7:50]
andrew lorien
Workaround What didn't work : stop slave; start slave; reset slave; reset master; set master to... ; set gtid_next ... What did work : - restoring from a backup of the master (the one which caused the problem in the first place) - SET @@GLOBAL.GTID_PURGED=[one statement past the one which caused the lost_events error].
[2 Jul 2014 15:15]
MySQL Verification Team
The problem is that the LOST_EVENT does not have a GTID, so we can't replace it with an empty transaction. The workaround is to manually execute the next transaction, then reset replication. Use mysqlbinlog to find the next transaction. It is probably the first transaction in the next log file after the LOST_EVENT. Copy everything up to the COMMIT for that transaction, being sure to include the SET @@SESSION.GTID_NEXT. Even if you're using ROW format, you can still run BINLOG row events in the command line client. STOP SLAVE; Run the transaction copied above. The mysqlbinlog output sets the delimiter to /*!*/;, so set it back: DELIMITER ; SET GTID_NEXT=automatic; RESET SLAVE; START SLAVE;
[21 Jul 2014 17:51]
Sveta Smirnova
Thank you for the report. I set it to "Verified", because at least our documentation can be clearer.
[24 Sep 2014 14:39]
David Moss
The following has been added to the documentation at this page: http://dev.mysql.com/doc/refman/5.7/en/replication-gtids-failover.html Restoring GTID mode slaves. When restoring a slave in a GTID based replication setup that has encountered an error, injecting an empty transaction may not solve the problem because an event does not have a GTID. Use mysqlbinlog to find the next transaction, which is probably the first transaction in the next log file after the event. Copy everything up to the COMMIT for that transaction, being sure to include the SET @@SESSION.GTID_NEXT. Even if you are not using row-based replication, you can still run binary log row events in the command line client. Stop the slave and run the transaction you copied. The mysqlbinlog output sets the delimiter to /*!*/;, so set it back: mysql> DELIMITER ; Restart replication from the correct position automatically: mysql> SET GTID_NEXT=automatic; mysql> RESET SLAVE; mysql> START SLAVE;
[3 Mar 2017 20:22]
monty solomon
The text should be updated to reflect that mysqlbinlog should be used to find the next transaction in the relay logs (it says "next log file after the event"). The documentation suggests running the transaction copied from the output of mysqlbinlog but it does not warn about, or provide guidance to avoid, errant GTID transactions created by executing the copied transaction.