Bug #83526 Purge Binary Logs freezes database when there are a large number of logs
Submitted: 25 Oct 2016 12:53 Modified: 2 Nov 2016 8:04
Reporter: Tyfanie Wineriter
Status: Duplicate
Category: MySQL Server: Replication
Severity: S3 (Non-critical)
Version: 5.6.23
OS: Red Hat
Assigned to:
CPU Architecture: Any

[25 Oct 2016 12:53] Tyfanie Wineriter
Description:
When purging a very large number of binlogs, either through a manual PURGE BINARY LOGS or by lowering expire_logs_days, the database stops rotating to new binlogs. This leaves it in a "frozen" state, accumulating sessions but not committing the transactions that would release them.

How to repeat:
Create a database with a 100k binlog size, keep binlogs for 14 days, and generate ~256M of redo an hour.

Now change expire_logs_days from 14 days to 7 (alternatively, manually purge more than a day's worth of binlogs).
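
To give a sense of scale, the numbers above can be turned into a rough file count. This is only a back-of-the-envelope sketch: it assumes "100k" means 100 KiB, that "~256M of redo an hour" translates into about 256 MiB of binlog writes per hour, and that every binlog is filled exactly to its size limit. The function name `estimate_binlog_files` is made up for this illustration.

```python
def estimate_binlog_files(write_bytes_per_hour: int, binlog_size_bytes: int, days: int) -> int:
    """Approximate how many binlog files accumulate over `days` days."""
    total_bytes = write_bytes_per_hour * 24 * days
    return total_bytes // binlog_size_bytes


KIB = 1024
MIB = 1024 * KIB

# Assumption: "100k" is 100 KiB and "~256M of redo an hour" is roughly
# 256 MiB of binlog writes per hour. Real files can exceed the size limit
# because a transaction is not split across binlogs.
retained_14_days = estimate_binlog_files(256 * MIB, 100 * KIB, 14)
retained_7_days = estimate_binlog_files(256 * MIB, 100 * KIB, 7)

# Files that become eligible for removal when retention drops from 14 to 7 days.
files_to_purge = retained_14_days - retained_7_days
print(files_to_purge)
```

Under these assumptions, lowering the retention from 14 days to 7 makes several hundred thousand files eligible for removal in a single purge, which is the "very large number of binlogs" case described in this report.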

Note that the last binlog created before starting the purge is the last binlog created at all. The database accumulates sessions and does not turn over to a new binlog, during the purge as well as afterwards. As far as I can tell, the database never resumes turning over binlogs unless you shut it down (with a kill -9, because the database won't shut down while it's trying to flush the sessions out).

Suggested fix:
Figure out how to allow the binlogs to continue rotating even while they are being deleted?
[26 Oct 2016 1:40] zhai weixiang
Duplicate of Bug #71901?
[26 Oct 2016 14:50] Tyfanie Wineriter
Hi Zhai,

It does look like it's a duplicate of that bug.  Do you know if there is a workaround?  I don't see one on that bug page.
[27 Oct 2016 7:15] zhai weixiang
It's not a big problem for us, because we always purge binlog files one by one. Configuring a bigger binlog file size (it's 500MB in our production) also helps reduce the probability of rotating while purging expired logs.
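
As an illustration of the "one by one" approach, the sketch below builds one PURGE BINARY LOGS TO statement per file, so that each statement removes a single binlog instead of thousands at once. It is a minimal sketch, not the commenter's actual tooling: the helper name `one_by_one_purge_statements`, the `keep_last` parameter, and the file names are hypothetical, and the list of names is assumed to come from SHOW BINARY LOGS in index order.

```python
from typing import List


def one_by_one_purge_statements(binlog_names: List[str], keep_last: int) -> List[str]:
    """Build one PURGE statement per binlog, oldest first.

    PURGE BINARY LOGS TO 'x' removes every log listed before 'x', so
    targeting the file right after the oldest remaining one removes
    exactly one file per statement.
    """
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1 (the active binlog)")
    purge_count = max(len(binlog_names) - keep_last, 0)
    # Assumption: binlog names contain no quote characters, so plain
    # string formatting is enough for this sketch.
    return [
        f"PURGE BINARY LOGS TO '{binlog_names[i + 1]}';"
        for i in range(purge_count)
    ]


# Hypothetical file names, as they might appear in SHOW BINARY LOGS.
names = ["mysql-bin.000001", "mysql-bin.000002", "mysql-bin.000003", "mysql-bin.000004"]
for statement in one_by_one_purge_statements(names, keep_last=2):
    print(statement)
```

Running the statements one at a time, possibly with a short pause between them, keeps each purge small. Whether that fully avoids the stall described in this report is not confirmed here; it only mirrors the workaround described above.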
[2 Nov 2016 8:04] MySQL Verification Team
Thank you for the report.
As Zhai rightly pointed out in his initial note, this is most likely a duplicate of Bug #71901.

Thanks,
Umesh