MySQL Bugs: #80889: PURGE BINARY LOGS TO is reading the whole binlog file and causing MySql to Stall

Bug #80889	PURGE BINARY LOGS TO is reading the whole binlog file and causing MySql to Stall
Submitted:	29 Mar 2016 15:31	Modified:	5 Aug 2016 10:22
Reporter:	Davy Campano	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Replication	Severity:	S5 (Performance)
Version:	5.7.11	OS:	Linux (4.1.17-22.30.amzn1.x86_64)
Assigned to:		CPU Architecture:	Any

Description:
Calling "PURGE BINARY LOGS TO 'binarylog.00003'" causes the system to read the whole 'binarylog.00003' file.  

I recently upgraded from MySql 5.6.22 to MySql 5.7.10 and when the automatic purge for binary logs occurs (expire_logs_days), the system stalls for 8-12 seconds until it finishes.  I am using 1GB binary log files and I have been able to reproduce this on a fresh install.  I have tested this on the most recent version (5.7.11) and able to reproduce the issue there too.

How to repeat:
1.  Install MySql 5.7.11
2.  Run mysqlslap to generate a large binary log
     mysqlslap --host=localhost  --auto-generate-sql --verbose --concurrency=500 --iterations=100 --number-int-cols=5 --number-char-cols=20
3.  Attach strace to the mysqld process and run "PURGE BINARY LOGS TO"
4.  Observe that strace shows the whole file being read

my.cnf with tested settings

Attachment: my.cnf (application/octet-stream, text), 948 bytes.

strace log when purge is running

Attachment: purge_strace1.log (application/octet-stream, text), 3.33 KiB.

After further testing this is only occurring when GTID_MODE=OFF.  If I enable GTIDs, then the whole file is not read.  This bug is occurring on a fresh install of Amazon Linux 2015.09.  I can test further distributions if needed, but it doesn't appear to be distro related.

Post by Developer
-----------------
Hell Davy Campano, 

Thank you for the bug report. It is verified using your steps.

Regards,
Venkatesh.

Posted by developer:
 
Thank you for your feedback, this has been fixed in upcoming versions and the following was added to the 5.7.14 changelog:
When --gtid-mode=off, issuing a PURGE BINARY LOGS TO file statement causes the system to read the whole file. Depending on the size of the file this could cause the server to stall for a few seconds. This was due to the way binary log files were being searched for the Previous_gtids value. The fix adds an optimization to stop the reading of the binary log file if it is supposed only to read the lost GTIDs and a Previous_gtids event is found.