Bug #73032 | setting gtid_purged may break auto_position and thus slaves | ||
---|---|---|---|
Submitted: | 18 Jun 2014 9:54 | Modified: | 3 Nov 2014 9:34 |
Reporter: | Santosh Praneeth Banda | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 5.6.19 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[18 Jun 2014 9:54]
Santosh Praneeth Banda
[18 Jun 2014 13:50]
MySQL Verification Team
Hello Santosh, Thank you for the report. Verified as described. Thanks, Umesh
[18 Jun 2014 13:51]
MySQL Verification Team
// How to repeat // Setting Up Replication Using GTIDs - http://dev.mysql.com/doc/refman/5.6/en/replication-gtids-howto.html Master> bin/mysqld_safe --defaults-file=./73032M.cnf --gtid_mode=ON --log-bin --log-slave-updates --enforce-gtid-consistency --user=root & Slave> bin/mysqld_safe --defaults-file=./73032S.cnf --gtid_mode=ON --log-bin --log-slave-updates --enforce-gtid-consistency --skip_slave_start --user=root & // No transaction should be executed on master in order to set gtid_purged = 'UUID:1-1000', else most likely to see ERROR 1840 (HY000) // Master mysql> show master status; +-------------------+----------+--------------+------------------+-------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +-------------------+----------+--------------+------------------+-------------------+ | master-bin.000003 | 151 | | | | +-------------------+----------+--------------+------------------+-------------------+ 1 row in set (0.00 sec) mysql> show variables like 'server_uuid'; +---------------+--------------------------------------+ | Variable_name | Value | +---------------+--------------------------------------+ | server_uuid | 7edc7076-f82f-11e3-953c-00163e44510c | +---------------+--------------------------------------+ 1 row in set (0.01 sec) mysql> show master logs; +-------------------+-----------+ | Log_name | File_size | +-------------------+-----------+ | master-bin.000001 | 65294 | | master-bin.000002 | 1071072 | | master-bin.000003 | 151 | +-------------------+-----------+ 3 rows in set (0.00 sec) mysql> set global gtid_purged = '7edc7076-f82f-11e3-953c-00163e44510c:1-1000'; Query OK, 0 rows affected (0.02 sec) mysql> show master logs; +-------------------+-----------+ | Log_name | File_size | +-------------------+-----------+ | master-bin.000001 | 65294 | | master-bin.000002 | 1071072 | | master-bin.000003 | 199 | | master-bin.000004 | 191 | <-------------- new binlog created +-------------------+-----------+ 4 rows in set (0.00 sec) // Slave mysql> show variables like 'server_uuid'; +---------------+--------------------------------------+ | Variable_name | Value | +---------------+--------------------------------------+ | server_uuid | 9e19bc8f-f82f-11e3-953d-00163e44510c | +---------------+--------------------------------------+ 1 row in set (0.00 sec) mysql> show master logs; +------------------+-----------+ | Log_name | File_size | +------------------+-----------+ | slave-bin.000001 | 151 | | slave-bin.000002 | 69327 | | slave-bin.000003 | 1237646 | | slave-bin.000004 | 151 | +------------------+-----------+ 4 rows in set (0.00 sec) mysql> set global gtid_purged = '9e19bc8f-f82f-11e3-953d-00163e44510c:1-100'; Query OK, 0 rows affected (0.01 sec) mysql> change master to master_host='127.0.0.1', master_port=3306, master_user='repl',MASTER_PASSWORD='slavepass', master_auto_position=1; Query OK, 0 rows affected, 2 warnings (0.07 sec) mysql> start slave; Query OK, 0 rows affected (0.00 sec) // from error log 2014-06-20 09:40:48 28051 [Note] Slave I/O thread: connected to master 'repl@127.0.0.1:3306',replication started in log 'FIRST' at position 4
[18 Jun 2014 13:51]
MySQL Verification Team
// Master - extract master's binary logs to see NULL vs real UUID [root@cluster-repo mysql-advanced-5.6.19]# bin/mysqlbinlog /tmp/master/master-bin.000003 /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/; /*!40019 SET @@session.max_insert_delayed_threads=0*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 4 #140620 9:31:02 server id 1 end_log_pos 120 CRC32 0xfa97961d Start: binlog v 4, server v 5.6.19-enterprise-commercial-advanced-log created 140620 9:31:02 at startup ROLLBACK/*!*/; BINLOG ' /rGjUw8BAAAAdAAAAHgAAAAAAAQANS42LjE5LWVudGVycHJpc2UtY29tbWVyY2lhbC1hZHZhbmNl ZC1sb2cAAAAAAAAAAAD+saNTEzgNAAgAEgAEBAQEEgAAXAAEGggAAAAICAgCAAAACgoKGRkAAR2W l/o= '/*!*/; # at 120 #140620 9:31:02 server id 1 end_log_pos 151 CRC32 0x384e6fe1 Previous-GTIDs # [empty] <------------------------------------------------------------------------------------------------NULL # at 151 #140620 9:37:20 server id 1 end_log_pos 199 CRC32 0xc3269bf2 Rotate to master-bin.000004 pos: 4 DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/; [root@cluster-repo mysql-advanced-5.6.19]# bin/mysqlbinlog /tmp/master/master-bin.000004 /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/; /*!40019 SET @@session.max_insert_delayed_threads=0*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 4 #140620 9:37:20 server id 1 end_log_pos 120 CRC32 0x2422f67d Start: binlog v 4, server v 5.6.19-enterprise-commercial-advanced-log created 140620 9:37:20 # Warning: this binlog is either in use or was not closed properly. BINLOG ' eLOjUw8BAAAAdAAAAHgAAAABAAQANS42LjE5LWVudGVycHJpc2UtY29tbWVyY2lhbC1hZHZhbmNl ZC1sb2cAAAAAAAAAAAAAAAAAEzgNAAgAEgAEBAQEEgAAXAAEGggAAAAICAgCAAAACgoKGRkAAX32 IiQ= '/*!*/; # at 120 #140620 9:37:20 server id 1 end_log_pos 191 CRC32 0xe7ddbf75 Previous-GTIDs # 7edc7076-f82f-11e3-953c-00163e44510c:1-1000 <------------------------------------------------------------NOT NULL # at 191 #140620 9:39:16 server id 1 end_log_pos 239 CRC32 0x5e2569b4 GTID [commit=yes] SET @@SESSION.GTID_NEXT= '7edc7076-f82f-11e3-953c-00163e44510c:1001'/*!*/; # at 239 #140620 9:39:16 server id 1 end_log_pos 407 CRC32 0x1cccd7f4 Query thread_id=1 exec_time=0 error_code=0 SET TIMESTAMP=1403237356/*!*/; SET @@session.pseudo_thread_id=1/*!*/; SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/; SET @@session.sql_mode=1073741824/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!\C utf8 *//*!*/; SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/; SET @@session.lc_time_names=0/*!*/; SET @@session.collation_database=DEFAULT/*!*/; CREATE USER 'repl'@'localhost' IDENTIFIED BY PASSWORD '*809534247D21AC735802078139D8A854F45C31F3' /*!*/; # at 407 #140620 9:39:16 server id 1 end_log_pos 455 CRC32 0x9c79d7b6 GTID [commit=yes] SET @@SESSION.GTID_NEXT= '7edc7076-f82f-11e3-953c-00163e44510c:1002'/*!*/; # at 455 #140620 9:39:16 server id 1 end_log_pos 594 CRC32 0xd2d25fef Query thread_id=1 exec_time=0 error_code=0 SET TIMESTAMP=1403237356/*!*/; GRANT REPLICATION SLAVE ON *.* TO 'repl'@'localhost' /*!*/; # at 594 #140620 9:39:16 server id 1 end_log_pos 642 CRC32 0x2e128167 GTID [commit=yes] SET @@SESSION.GTID_NEXT= '7edc7076-f82f-11e3-953c-00163e44510c:1003'/*!*/; # at 642 #140620 9:39:16 server id 1 end_log_pos 810 CRC32 0xb465858b Query thread_id=1 exec_time=0 error_code=0 SET TIMESTAMP=1403237356/*!*/; CREATE USER 'repl'@'127.0.0.1' IDENTIFIED BY PASSWORD '*809534247D21AC735802078139D8A854F45C31F3' /*!*/; # at 810 #140620 9:39:16 server id 1 end_log_pos 858 CRC32 0x90dee9a8 GTID [commit=yes] SET @@SESSION.GTID_NEXT= '7edc7076-f82f-11e3-953c-00163e44510c:1004'/*!*/; # at 858 #140620 9:39:16 server id 1 end_log_pos 997 CRC32 0x90cb2489 Query thread_id=1 exec_time=0 error_code=0 SET TIMESTAMP=1403237356/*!*/; GRANT REPLICATION SLAVE ON *.* TO 'repl'@'127.0.0.1' /*!*/; # at 997 #140620 9:39:17 server id 1 end_log_pos 1045 CRC32 0x93ee9c11 GTID [commit=yes] SET @@SESSION.GTID_NEXT= '7edc7076-f82f-11e3-953c-00163e44510c:1005'/*!*/; # at 1045 #140620 9:39:17 server id 1 end_log_pos 1124 CRC32 0x06a8e867 Query thread_id=1 exec_time=0 error_code=0 SET TIMESTAMP=1403237357/*!*/; FLUSH PRIVILEGES /*!*/; DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/; [root@cluster-repo mysql-advanced-5.6.19]#
[18 Jun 2014 14:00]
MySQL Verification Team
// Second case - On slave - when tried using set global gtid_purged = 'MASTER_UUID:1-100'; mysql> show master logs; +-------------------+-----------+ | Log_name | File_size | +-------------------+-----------+ | master-bin.000001 | 65294 | | master-bin.000002 | 1071072 | | master-bin.000003 | 151 | +-------------------+-----------+ 3 rows in set (0.00 sec) mysql> show variables like 'server_uuid'; +---------------+--------------------------------------+ | Variable_name | Value | +---------------+--------------------------------------+ | server_uuid | 84f959cc-f834-11e3-955d-00163e44510c | +---------------+--------------------------------------+ 1 row in set (0.00 sec) mysql> set global gtid_purged = '84f959cc-f834-11e3-955d-00163e44510c:1-1000'; Query OK, 0 rows affected (0.01 sec) mysql> show master logs; +-------------------+-----------+ | Log_name | File_size | +-------------------+-----------+ | master-bin.000001 | 65294 | | master-bin.000002 | 1071072 | | master-bin.000003 | 199 | | master-bin.000004 | 191 | +-------------------+-----------+ 4 rows in set (0.00 sec) // Slave mysql> set global gtid_purged = '84f959cc-f834-11e3-955d-00163e44510c:1-100'; Query OK, 0 rows affected (0.02 sec) mysql> change master to master_host='127.0.0.1', master_port=3306, master_user='repl',MASTER_PASSWORD='slavepass', master_auto_position=1; Query OK, 0 rows affected, 2 warnings (0.04 sec) mysql> start slave; Query OK, 0 rows affected (0.00 sec) mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 127.0.0.1 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000004 Read_Master_Log_Pos: 1124 Relay_Log_File: slave-relay-bin.000004 Relay_Log_Pos: 1296 Relay_Master_Log_File: master-bin.000004 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 1124 Relay_Log_Space: 1548 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 1 Master_UUID: 84f959cc-f834-11e3-955d-00163e44510c Master_Info_File: /tmp/slave/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: 84f959cc-f834-11e3-955d-00163e44510c:1001-1005 Executed_Gtid_Set: 84f959cc-f834-11e3-955d-00163e44510c:1-100:1001-1005 Auto_Position: 1 1 row in set (0.00 sec)
[3 Nov 2014 9:34]
David Moss
Thank you for your feedback. The following was added to the 5.6.22 and 5.7.6 release notes with commit 4560: When using GTIDs for replication and with MASTER_AUTO_POSITION enabled, if a slave requested GTIDs which had been already been purged by the master, the master was sending all available GTIDs. This happened because the master reads all available binary logs and searches for a binary log which contains a GTID that is not contained in the union of gtid_executed and gtid_retrieved. If such a GTID is found, the master starts sending the information starting from that location. In a situation where the union of the slave's gtid_executed and gtid_retreived set did not contain the master's gtid_purged set, the slave would expect GTIDs which had already been purged by the master. This fix ensures that in such a situation, the slave's I/O thread is aborted with an error "Master has purged binary logs containing GTIDs that the slave requires.".
[10 Dec 2014 14:11]
Laurynas Biveinis
$ bzr log -r 6211 ------------------------------------------------------------ revno: 6211 committer: Venkatesh Duggirala<venkatesh.duggirala@oracle.com> branch nick: mysql-5.6 timestamp: Mon 2014-10-20 15:11:32 +0530 message: Bug#19012085 SETTING GTID_PURGED MAY BREAK AUTO_POSITION AND THUS SLAVES Problem: When slave asks for GTIDs which were purged by Master, Master is silently ignoring it and sending available GTIDs Analysis: In auto_postion=1 protocol, when I/O thread is connecting, dump thread receives slave's current available gtid sets (gtid_executed + gtid_retrieved) information. Master reads all available binary logs and search for a binary log which contains a gtid which is not part of this list and starts sending the information starting from that location. But it is not doing any preliminary validation on the gtid sets. In the current bug scenario, Master has already purged few gtids which Slave does not have them. Master is silently ignoring this scenario and sending the gtids what ever is available on it. This will make Slave to have some gaps in gtid sets which in turn will cause many problems like inconsistency. Setting GTID_PURGED (when GTID_EXECUTED set is empty i.e., when previous_gtids are also empty) will make binlog rotate. That leaves first binary log with empty previous_gtids and second binary log's previous_gtids with the value of gtid_purged. In find_first_log_not_in_gtid_set() while we search for a binary log whose previous_gtid_set is subset of slave_gtid_executed, in this particular case, server will always find the first binary log with empty previous_gtids which is subset of any given slave_gtid_executed. Thus Master thinks that it found the first binary log which is actually not correct and unable to catch this error situation. Hence adding an extra if condition to check the situation. Slave should know about Master's purged GTIDs. If Slave's GTID executed + retrieved set does not contain Master's complete purged GTID list, that means Slave is requesting(expecting) GTIDs which were purged by Master. We should let Slave know about the situation. i.e., throw error if slave's GTID executed set is not a superset of Master's purged GTID set. The other case, where user deleted binary logs manually (with out using 'PURGE BINARY LOGS' command) but gtid_purged is not set by the user, the following if condition cannot catch it. But that is not problem because in find_first_log_not_in_gtid_set() while checking for subset previous_gtids binary log, the logic will not find one and an error ER_MASTER_HAS_PURGED_REQUIRED_GTIDS is thrown from there. Now, after the patch, Slave's I/O thread, when connecting to Master in this situation, will be aborted with an error "Master has purged binary logs containing GTIDs that the slave requires."