Bug #70048 GTID replication fails with master has purged binary logs, but this one exists
Submitted: 15 Aug 2013 13:09    Modified: 21 Jul 2014 12:12
Reporter: Martin Fuxa
Status: Duplicate
Category: MySQL Server: Replication    Severity: S2 (Serious)
Version: 5.6.13    OS: Any
Assigned to:    CPU Architecture: Any

[15 Aug 2013 13:09] martin fuxa
Description:
After updating the master from 5.6.12 to 5.6.13, replication (Slave_IO_Running) fails on the slave.
The slave is 5.6.13.

Master error log:
2013-08-13 23:16:37 22151 [Note] Shutting down slave threads
2013-08-13 23:16:39 22151 [Note] Binlog end
2013-08-13 23:16:43 22151 [Note] InnoDB: Shutdown completed; log sequence number 159458861125
...
2013-08-13 23:17:02 30572 [Note] InnoDB: 5.6.13 started; log sequence number 159458861125
Version: '5.6.13-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

Slave error log:
2013-08-13 23:16:37 5874 [Note] Slave: received end packet from server, apparent master shutdown:
2013-08-13 23:16:37 5874 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.000023' at position 294663771
2013-08-13 23:16:37 5874 [ERROR] Slave I/O: error reconnecting to master 'repl_isp_pr@192.168.7.107:3306' - retry-time: 60  retries: 1, Error_code: 2003
2013-08-13 23:17:37 5874 [Note] Slave: connected to master 'repl_isp_pr@192.168.7.107:3306',replication resumed in log 'mysql-bin.000023' at position 294663771
2013-08-13 23:17:39 5874 [ERROR] Error reading packet from server: The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires. ( server_errno=1236)
2013-08-13 23:17:39 5874 [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.', Error_code: 1236
2013-08-13 23:17:39 5874 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000023', position 294663771
2013-08-13 23:19:10 5874 [Note] Error reading relay log event: slave SQL thread was killed

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: 192.168.7.107
                  Master_User: xxx
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File:
          Read_Master_Log_Pos: 4
               Relay_Log_File: h7-relay-bin.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File:
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table: piwik.piwik_log_visit,piwik.piwik_log_link_visit_action
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 0
              Relay_Log_Space: 151
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 72
                  Master_UUID: 1be2af64-ce18-11e2-bd6e-00151743f04a
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp: 130813 23:28:32
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set: 1be2af64-ce18-11e2-bd6e-00151743f04a:1-17995684
                Auto_Position: 1
1 row in set (0.00 sec)

But the binary log still exists on the master:
ll /var/lib/mysql/mysql-bin.000023
-rw-r----- 1 mysql mysql 294664590 Aug 13 23:16 /var/lib/mysql/mysql-bin.000023

What's wrong?

It may look like bug #67202, but that one was fixed in 5.6.9.

How to repeat:
Don't know; it's a production system and I have no environment for testing.
[10 Jan 2014 8:11] lalit Choudhary
MySQL 5.6.15 GTID replication: 'Got fatal error 1236 from master when reading data from binary log'

I checked SHOW SLAVE STATUS; the error is "Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'"

I suspect this happens because old binary logs (older than 7 days) were deleted from the master, since we are using the expire_logs_days=7 option, and the slave is still looking for those binary logs for GTIDs it needs in order to replicate from the master.
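
To confirm that suspicion, the master's purged set can be compared with what the slave has executed. A minimal diagnostic sketch (the two GTID sets are placeholders to be copied from each server's output):

-- On the master: GTIDs no longer available in its binary logs, and the purge policy.
SHOW GLOBAL VARIABLES LIKE 'gtid_purged';
SHOW GLOBAL VARIABLES LIKE 'expire_logs_days';

-- On the slave: GTIDs it has already applied.
SHOW GLOBAL VARIABLES LIKE 'gtid_executed';

-- Roughly, error 1236 is raised when the master's gtid_purged set is NOT a
-- subset of what the slave already has. Paste the two values captured above:
SELECT GTID_SUBSET('<master gtid_purged>', '<slave gtid_executed>') AS slave_can_catch_up;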

I tried the solution from the following link:

http://www.mysqlperformanceblog.com/2013/02/08/how-to-createrestore-a-slave-using-gtid-rep...

but it is still not working.

Note: using MySQL 5.6.15.

Now the only option for me is to re-create the slave by taking a mysqldump from the master.

Is there any other way to start replication without re-creating the slave?
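
(For reference, the mysqldump-based rebuild is sketched below with placeholder host and credentials. A dump taken with mysqldump --single-transaction --set-gtid-purged=AUTO records the master's GTID set in the dump itself, so the rebuilt slave knows which transactions it already has.)

-- On the freshly rebuilt slave, after loading the dump. If the slave's
-- gtid_executed was not empty, run RESET MASTER *before* loading the dump,
-- otherwise the dump's SET @@GLOBAL.gtid_purged statement fails.
CHANGE MASTER TO
  MASTER_HOST = 'master_host',   -- placeholder
  MASTER_USER = 'repl_user',     -- placeholder
  MASTER_PASSWORD = '...',       -- placeholder
  MASTER_AUTO_POSITION = 1;
START SLAVE;
SHOW SLAVE STATUS\G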

Thanks in Advance
[1 Feb 2014 12:11] Simon Mudd
I get a similar error after cloning a 5.6.15 slave, upgrading to the MySQL 5.7.3 RPMs, and running mysql_upgrade.

2014-02-01T10:39:29.283069Z 1 [ERROR] Error reading packet from server: The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires. (server_errno=1236)
2014-02-01T10:39:29.283099Z 1 [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.', Error_code: 1236
2014-02-01T10:39:29.283110Z 1 [Note] Slave I/O thread exiting, read up to log 'binlog.000802', position 651090510

The binlogs on the master are there:

[root@mymaster log]# ls -la binlog*
-rw-rw---- 1 mysql mysql 1073742379 Jan 29 11:03 binlog.000678
-rw-rw---- 1 mysql mysql 1073742496 Jan 29 11:33 binlog.000679
...
-rw-rw---- 1 mysql mysql 1073744432 Feb  1 09:06 binlog.000801
-rw-rw---- 1 mysql mysql 1073741935 Feb  1 10:01 binlog.000802
-rw-rw---- 1 mysql mysql 1073742502 Feb  1 10:55 binlog.000803
-rw-rw---- 1 mysql mysql  887088693 Feb  1 11:41 binlog.000804
-rw-rw---- 1 mysql mysql       2667 Feb  1 10:55 binlog.index
[root@mymaster log]#

So this seems related.

Checking gtid info I see on the master:

root@mymaster [(none)]>  show global variables like 'gtid%';
+---------------+---------------------------------------------------+
| Variable_name | Value                                             |
+---------------+---------------------------------------------------+
| gtid_executed | 6a512bbf-6965-11e3-b220-b499bab5f9ba:1-1388084839 |
| gtid_mode     | ON                                                |
| gtid_owned    |                                                   |
| gtid_purged   | 6a512bbf-6965-11e3-b220-b499bab5f9ba:1-1154297832 |
+---------------+---------------------------------------------------+
4 rows in set (0.00 sec)

The 5.6 slave, which I cloned from, works and says:

root@my-5.6-slave [(none)]> show global variables like 'gtid%';
+---------------+---------------------------------------------------------------------------------------------------+
| Variable_name | Value                                                                                             |
+---------------+---------------------------------------------------------------------------------------------------+
| gtid_executed | 6a512bbf-6965-11e3-b220-b499bab5f9ba:1-1384939362,
f3408d4b-6979-11e3-b2a6-68b5997180f0:1-2234136 |
| gtid_mode     | ON                                                                                                |
| gtid_owned    | 6a512bbf-6965-11e3-b220-b499bab5f9ba:1384939363#3                                                 |
| gtid_purged   | 6a512bbf-6965-11e3-b220-b499bab5f9ba:1-1112874654,
f3408d4b-6979-11e3-b2a6-68b5997180f0:1-1994205 |
+---------------+---------------------------------------------------------------------------------------------------+
4 rows in set (0.00 sec)

The 5.7.3 slave has:

root@my-5.7.3-slave [(none)]> show global variables like 'gtid%';
+---------------+--------------------------------------------------------------------------------------------------------+
| Variable_name | Value                                                                                                  |
+---------------+--------------------------------------------------------------------------------------------------------+
| gtid_executed | 34778193-8b2c-11e3-8e61-e839352c66d2:1-774,
6a512bbf-6965-11e3-b220-b499bab5f9ba:1381936142-1381936213 |
| gtid_mode     | ON                                                                                                     |
| gtid_owned    |                                                                                                        |
| gtid_purged   |                                                                                                        |
+---------------+--------------------------------------------------------------------------------------------------------+
4 rows in set (0.00 sec)

It seems to be missing the gtid_owned/gtid_purged data...
This may be due to a bug in 5.7.3. I wasn't sure whether to open a separate bug for this, as the version is different and 5.7 is not yet GA. I hope this report is OK.
[1 Feb 2014 12:14] Simon Mudd
Whatever the cause, it seems worthwhile for the error to report both the GTID information provided by the master and the GTID information known by the slave. That would at least make it clearer what the exact problem is.
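
Until such logging exists, the gap can be computed manually; a minimal sketch, with the two GTID sets as placeholders:

-- A non-empty result is exactly the set of purged transactions the slave
-- still needs, i.e. what triggers error 1236.
SELECT GTID_SUBTRACT('<master gtid_purged>', '<slave gtid_executed>') AS missing_gtids;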
[3 Feb 2014 15:15] Simon Mudd
See bug #71565 for the feature request for better logging.
[21 Jul 2014 12:12] Sveta Smirnova
Thank you for the report.

This is closed as a duplicate of bug #71376, which is fixed in version 5.6.18. Please upgrade.
[19 Aug 2014 8:21] Naheem Munir
Hi,

We are using 5.6.18-enterprise-commercial-advanced-log MySQL Enterprise Server and encountered the exact same problem yesterday.

Slave status:

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: x.x.x.x
                  Master_User: xxxx
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: xxxxxxxx-log-bin.000125
          Read_Master_Log_Pos: 522617108
               Relay_Log_File: xxxxxxxx-relay-bin.000712
                Relay_Log_Pos: 4
        Relay_Master_Log_File: xxxxxxxx-log-bin.000125
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB: blackbox
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 522617108
              Relay_Log_Space: 753
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 5
                  Master_UUID: 6f772a80-0cdb-11e4-9c08-005056837d70
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp: 140818 09:47:14
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set: (list too long)

On the master:

[root@xxxxxxx]# ls -ltra /var/lib/mysql/xxxxx-log-bin.000125
-rw-rw----. 1 mysql mysql 1073752881 Aug 18 05:25 /var/lib/mysql/xxxxx-log-bin.000125

I tried to set global gtid_purged to "25550008" and did CHANGE MASTER, but it is still the same.

I checked the binary log and it does have GTID 25550009 in it.
[11 Mar 2015 14:54] Naheem Munir
Hi, we just hit the same problem again on a production slave.

It was restored from mysqldump backups, and when we tried to run

CHANGE MASTER TO
MASTER_HOST='master server',
MASTER_USER ='replication user',
MASTER_PASSWORD ='password',
MASTER_AUTO_POSITION=1;

we encountered:

 Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
We need urgent help with this problem.

The MySQL version is:

Server version: 5.6.18-enterprise-commercial-advanced-log MySQL Enterprise Server - Advanced Edition (Commercial)

Can we increase the severity to S1, please?

Regards
Naheem
[26 Apr 2016 21:45] Wagner Bianchi
The same problem happened to me, and I noticed that I was not supplying the correct string to SET gtid_purged: it needs to be of the form master-uuid:1-N, where 1-N is the interval of transaction IDs the slave will skip (treat as already applied) when it starts reading from the master.

Something like:

set global gtid_purged='01878c69-0bf8-11e6-a0d4-0050568728a3:1-100';
Query OK, 0 rows affected (0.00 sec)

This works well.
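
For context, the statements around that SET usually look like the sketch below when run on the slave (placeholder host, user, and password; the GTID set is the example value from above). Note that setting gtid_purged tells the slave to treat the listed transactions as already applied, so this is only safe when the slave really has that data, e.g. from a backup or dump.

STOP SLAVE;
RESET MASTER;   -- empties the slave's own gtid_executed; SET gtid_purged
                -- only succeeds while gtid_executed is empty
SET GLOBAL gtid_purged = '01878c69-0bf8-11e6-a0d4-0050568728a3:1-100';
CHANGE MASTER TO
  MASTER_HOST = 'master_host',   -- placeholder
  MASTER_USER = 'repl_user',     -- placeholder
  MASTER_PASSWORD = '...',       -- placeholder
  MASTER_AUTO_POSITION = 1;
START SLAVE;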