Bug #71901 rotate + purge binlog lead to stall
Submitted: 3 Mar 2014 2:12
Modified: 21 Apr 2014 11:23
Reporter: zhai weixiang (OCA)
Status: Verified
Category: MySQL Utilities: Binlog Events
Severity: S3 (Non-critical)
Version: 5.6.16, 5.6.17
OS: Any
Assigned to:
CPU Architecture: Any

[3 Mar 2014 2:12] zhai weixiang
Description:
Related backtraces:

      1 unlink(libc.so.6),my_delete(my_delete.c:26),inline_mysql_file_delete(mysql_file.h:1281),MYSQL_BIN_LOG::purge_index_entry(mysql_file.h:1281),MYSQL_BIN_LOG::purge_logs(binlog.cc:4194),purge_master_logs(binlog.cc:2063),mysql_execute_command(sql_parse.cc:2682),mysql_parse(sql_parse.cc:6235),dispatch_command(sql_parse.cc:1334),do_handle_one_connection(sql_connect.cc:982),handle_one_connection(sql_connect.cc:898),start_thread(libpthread.so.0),clone(libc.so.6)

----- This thread holds LOCK_index until the purge operation has finished, including the unlink() of each purged file.

      1 __lll_lock_wait(libpthread.so.0),_L_lock_854(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),inline_mysql_mutex_lock(mysql_thread.h:690),MYSQL_BIN_LOG::new_file_impl(binlog.cc:4737),new_file_without_locking(binlog.cc:4691),MYSQL_BIN_LOG::rotate(binlog.cc:4691),MYSQL_BIN_LOG::ordered_commit(binlog.cc:6930),MYSQL_BIN_LOG::commit(binlog.cc:6320),ha_commit_trans(handler.cc:1435),trans_commit_stmt(transaction.cc:434),mysql_execute_command(sql_parse.cc:4997),mysql_parse(sql_parse.cc:6235),dispatch_command(sql_parse.cc:1334),do_handle_one_connection(sql_connect.cc:982),handle_one_connection(sql_connect.cc:898),start_thread(libpthread.so.0),clone(libc.so.6)

----- This thread is waiting for LOCK_index, so the rotation and the commit that triggered it are stalled.

How to repeat:
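Run a binary log purge at the same time as a binary log rotation (in the backtraces above, the rotation is triggered from a commit).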

Suggested fix:

The main logic of MYSQL_BIN_LOG::purge_logs:
1. Lock LOCK_index.
2. Scan the index file and record the binary log files that need to be purged in purge_index_file.
3. Update the index file.
4. Delete the binary logs.
5. Unlock LOCK_index.

I think we can swap steps 4 and 5, and introduce another lock that protects only purge_index_file (or just create an independent IO_CACHE for each purge operation?).
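
The reordering can be sketched in a few lines of standalone C++. This is not MySQL code: BinlogIndex, detach_purgeable and purge_logs_sketch are hypothetical names, a std::mutex stands in for LOCK_index, and a per-call vector stands in for the independent purge list. Crash safety and error reporting, which the real function has to handle, are left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the binlog index; not MySQL's real data structure.
struct BinlogIndex {
    std::mutex lock_index;             // plays the role of LOCK_index
    std::vector<std::string> entries;  // log file names, oldest first
};

// Steps 1, 2, 3 and 5: under the lock, move every entry older than to_log
// out of the index and into a list that is private to this purge call.
std::vector<std::string> detach_purgeable(BinlogIndex &index,
                                          const std::string &to_log) {
    std::lock_guard<std::mutex> guard(index.lock_index);
    auto it = std::find(index.entries.begin(), index.entries.end(), to_log);
    if (it == index.entries.end()) {
        return {};  // unknown target: purge nothing
    }
    std::vector<std::string> to_delete(index.entries.begin(), it);
    index.entries.erase(index.entries.begin(), it);
    return to_delete;
}   // the lock is released here, before any file is deleted

// Step 4, now after the unlock: slow deletions no longer block a thread
// that needs the index lock to rotate. Returns the number of files removed.
std::size_t purge_logs_sketch(BinlogIndex &index, const std::string &to_log) {
    const std::vector<std::string> to_delete = detach_purgeable(index, to_log);
    std::size_t removed = 0;
    for (const std::string &name : to_delete) {
        if (std::remove(name.c_str()) == 0) {
            ++removed;
        }
    }
    return removed;
}
```

Because each call owns its own to_delete list, two concurrent purges do not share purge state in this sketch, which is the role the separate lock or the per-operation IO_CACHE would play in the real code.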
[21 Apr 2014 11:23] Umesh Shastry
Hello Weixiang,

Thank you for the bug report.
Verified as described.
I'm able to reproduce this only 1 out of 15 times, with large binary logs.

Thanks,
Umesh
[2 Nov 2016 8:04] Umesh Shastry
Bug #83526 marked as duplicate of this one.