Bug #68575 need protection from concurrent binlog rotate
Submitted: 5 Mar 2013 14:49 Modified: 28 May 2013 6:22
Reporter: Mark Callaghan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.6.10 OS:Any
Assigned to: CPU Architecture:Any

[5 Mar 2013 14:49] Mark Callaghan
Description:
Replication in MySQL 5.6 rotates binlogs too frequently because it doesn't protect against concurrent rotation attempts when the binlog is full. The attempts after the first end up rotating the new binlog files which create many small ones. 

Thanks to Kristian Nielsen for explaining this. 

These are the binlog files during one run of sysbench. start/stop slave was not done. mysqld restart was not done. There are several small files.

-rw-rw---- 1 mysql mysql        143 Mar  5 06:27 udb12109-bin.000001
-rw-rw---- 1 mysql mysql 1073742737 Mar  5 06:33 udb12109-bin.000002
-rw-rw---- 1 mysql mysql       1798 Mar  5 06:33 udb12109-bin.000003
-rw-rw---- 1 mysql mysql 1073743385 Mar  5 06:37 udb12109-bin.000004
-rw-rw---- 1 root  root  1073745842 Mar  5 06:40 udb12109-bin.000005
-rw-rw---- 1 root  root        3698 Mar  5 06:40 udb12109-bin.000006
-rw-rw---- 1 root  root         170 Mar  5 06:40 udb12109-bin.000007
-rw-rw---- 1 root  root  1073746413 Mar  5 06:43 udb12109-bin.000008
-rw-rw---- 1 root  root        3697 Mar  5 06:43 udb12109-bin.000009
-rw-rw---- 1 root  root         170 Mar  5 06:43 udb12109-bin.000010
-rw-rw---- 1 root  root   956322232 Mar  5 06:45 udb12109-bin.000011

How to repeat:
Run sysbench with concurrent threads. Thread stacks for one example came from http://bugs.mysql.com/bug.php?id=68569

Suggested fix:
don't know
[5 Mar 2013 14:51] Mark Callaghan
Thread stacks from a deadlock in bug 68569 show:
* 1 thread doing rotate
* 2 threads about to do rotate once they get LOCK_log

1 thread doing rotate

Thread 76 (Thread 0x7fb1b87ff700 (LWP 7868)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x000000000081f65d in inline_mysql_cond_wait (mutex=0x1248c48, that=0x1249680) at /s/bld/orig5610/include/mysql/psi/mysql_thread.h:1162
#2  MYSQL_BIN_LOG::new_file_impl (this=0x12487c0, need_lock_log=false, extra_description_event=0x0) at /s/bld/orig5610/sql/binlog.cc:4353
#3  0x000000000081fff0 in MYSQL_BIN_LOG::rotate (this=0x12487c0, force_rotate=<optimized out>, check_purge=0x7fb1b87fbc6f) at /s/bld/orig5610/sql/binlog.cc:4878
#4  0x0000000000820e9d in MYSQL_BIN_LOG::ordered_commit (this=0x12487c0, thd=0x7fb1bbd82000, all=<optimized out>, skip_commit=<optimized out>) at /s/bld/orig5610/sql/binlog.cc:6459
#5  0x0000000000821116 in MYSQL_BIN_LOG::commit (this=0x12487c0, thd=0x7fb1bbd82000, all=false) at /s/bld/orig5610/sql/binlog.cc:5910
#6  0x000000000059591c in ha_commit_trans (thd=0x7fb1bbd82000, all=<optimized out>) at /s/bld/orig5610/sql/handler.cc:1420
#7  0x0000000000722ce7 in trans_commit_stmt (thd=0x7fb1bbd82000) at /s/bld/orig5610/sql/transaction.cc:373
#8  0x00000000006a81fc in mysql_execute_command (thd=0x7fb1bbd82000) at /s/bld/orig5610/sql/sql_parse.cc:4932
#9  0x00000000006ae7c8 in mysql_parse (parser_state=<optimized out>, thd=0x7fb1bbd82000, rawbuf=<optimized out>, length=<optimized out>) at /s/bld/orig5610/sql/sql_parse.cc:6149
#10 mysql_parse (thd=0x7fb1bbd82000, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at /s/bld/orig5610/sql/sql_parse.cc:6022
#11 0x00000000006affa8 in dispatch_command (command=COM_QUERY, thd=0x7fb1bbd82000, packet=0x7fb1b87fe350 "", packet_length=3246800912) at /s/bld/orig5610/sql/sql_parse.cc:1334
#12 0x000000000068273f in do_handle_one_connection (thd_arg=<optimized out>) at /s/bld/orig5610/sql/sql_connect.cc:969
#13 0x00000000006827a9 in handle_one_connection (arg=<optimized out>) at /s/bld/orig5610/sql/sql_connect.cc:885
#14 0x00007fc2fa59bf81 in start_thread (arg=0x7fb1b87ff700) at pthread_create.c:301
#15 0x00007fc2f92db85d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

2 threads about to do a rotate

Thread 75 (Thread 0x7fb1b93be700 (LWP 7869)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fc2fa59e3d1 in _L_lock_885 () from /usr/local/fbcode/gcc-4.6.2-glibc-2.13/lib/libpthread.so.0
#2  0x00007fc2fa59e16c in __pthread_mutex_lock (mutex=0x12487c8) at pthread_mutex_lock.c:61
#3  0x0000000000820e8c in inline_mysql_mutex_lock (that=0x12487c8) at /s/bld/orig5610/include/mysql/psi/mysql_thread.h:688
#4  MYSQL_BIN_LOG::ordered_commit (this=0x12487c0, thd=0x7fb1bbd8b000, all=<optimized out>, skip_commit=<optimized out>) at /s/bld/orig5610/sql/binlog.cc:6458
#5  0x0000000000821116 in MYSQL_BIN_LOG::commit (this=0x12487c0, thd=0x7fb1bbd8b000, all=false) at /s/bld/orig5610/sql/binlog.cc:5910
#6  0x000000000059591c in ha_commit_trans (thd=0x7fb1bbd8b000, all=<optimized out>) at /s/bld/orig5610/sql/handler.cc:1420
#7  0x0000000000722ce7 in trans_commit_stmt (thd=0x7fb1bbd8b000) at /s/bld/orig5610/sql/transaction.cc:373
#8  0x00000000006a81fc in mysql_execute_command (thd=0x7fb1bbd8b000) at /s/bld/orig5610/sql/sql_parse.cc:4932
#9  0x00000000006ae7c8 in mysql_parse (parser_state=<optimized out>, thd=0x7fb1bbd8b000, rawbuf=<optimized out>, length=<optimized out>) at /s/bld/orig5610/sql/sql_parse.cc:6149
#10 mysql_parse (thd=0x7fb1bbd8b000, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at /s/bld/orig5610/sql/sql_parse.cc:6022
#11 0x00000000006affa8 in dispatch_command (command=COM_QUERY, thd=0x7fb1bbd8b000, packet=0x7fb1b93bd350 "", packet_length=3263250448) at /s/bld/orig5610/sql/sql_parse.cc:1334
#12 0x000000000068273f in do_handle_one_connection (thd_arg=<optimized out>) at /s/bld/orig5610/sql/sql_connect.cc:969
#13 0x00000000006827a9 in handle_one_connection (arg=<optimized out>) at /s/bld/orig5610/sql/sql_connect.cc:885
#14 0x00007fc2fa59bf81 in start_thread (arg=0x7fb1b93be700) at pthread_create.c:301
#15 0x00007fc2f92db85d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
[5 Mar 2013 15:04] Mark Callaghan
More examples of small binlog files

-rw-rw---- 1 mysql mysql        143 Mar  5 06:27 udb12109-bin.000001
-rw-rw---- 1 mysql mysql 1073742737 Mar  5 06:33 udb12109-bin.000002
-rw-rw---- 1 mysql mysql       1798 Mar  5 06:33 udb12109-bin.000003
-rw-rw---- 1 mysql mysql 1073743385 Mar  5 06:37 udb12109-bin.000004
-rw-rw---- 1 mysql mysql 1073745842 Mar  5 06:40 udb12109-bin.000005
-rw-rw---- 1 mysql mysql       3698 Mar  5 06:40 udb12109-bin.000006
-rw-rw---- 1 mysql mysql        170 Mar  5 06:40 udb12109-bin.000007
-rw-rw---- 1 mysql mysql 1073746413 Mar  5 06:43 udb12109-bin.000008
-rw-rw---- 1 mysql mysql       3697 Mar  5 06:43 udb12109-bin.000009
-rw-rw---- 1 mysql mysql        170 Mar  5 06:43 udb12109-bin.000010
-rw-rw---- 1 mysql mysql 1073746180 Mar  5 06:45 udb12109-bin.000011
-rw-rw---- 1 mysql mysql        170 Mar  5 06:45 udb12109-bin.000012
-rw-rw---- 1 mysql mysql        170 Mar  5 06:45 udb12109-bin.000013
-rw-rw---- 1 mysql mysql 1073744784 Mar  5 06:48 udb12109-bin.000014
-rw-rw---- 1 mysql mysql 1073745668 Mar  5 06:50 udb12109-bin.000015
-rw-rw---- 1 mysql mysql       8033 Mar  5 06:50 udb12109-bin.000016
-rw-rw---- 1 mysql mysql        170 Mar  5 06:50 udb12109-bin.000017
-rw-rw---- 1 mysql mysql 1073753039 Mar  5 06:52 udb12109-bin.000018
-rw-rw---- 1 mysql mysql       8305 Mar  5 06:52 udb12109-bin.000019
-rw-rw---- 1 mysql mysql 1073748406 Mar  5 06:53 udb12109-bin.000020
-rw-rw---- 1 mysql mysql       8041 Mar  5 06:53 udb12109-bin.000021
-rw-rw---- 1 mysql mysql        170 Mar  5 06:53 udb12109-bin.000022
-rw-rw---- 1 mysql mysql 1073746366 Mar  5 06:55 udb12109-bin.000023
-rw-rw---- 1 mysql mysql       8040 Mar  5 06:55 udb12109-bin.000024
-rw-rw---- 1 mysql mysql        170 Mar  5 06:55 udb12109-bin.000025
-rw-rw---- 1 mysql mysql 1073744581 Mar  5 06:57 udb12109-bin.000026
-rw-rw---- 1 root  root        8311 Mar  5 06:57 udb12109-bin.000027
-rw-rw---- 1 root  root  1073746617 Mar  5 06:59 udb12109-bin.000028
-rw-rw---- 1 root  root        7763 Mar  5 06:59 udb12109-bin.000029
-rw-rw---- 1 root  root         170 Mar  5 06:59 udb12109-bin.000030
-rw-rw---- 1 root  root         170 Mar  5 06:59 udb12109-bin.000031
-rw-rw---- 1 root  root  1073757276 Mar  5 07:00 udb12109-bin.000032
-rw-rw---- 1 root  root         170 Mar  5 07:00 udb12109-bin.000033
-rw-rw---- 1 root  root  1073747446 Mar  5 07:02 udb12109-bin.000034
-rw-rw---- 1 root  root         170 Mar  5 07:02 udb12109-bin.000035
-rw-rw---- 1 root  root   842400425 Mar  5 07:03 udb12109-bin.000036
[5 Mar 2013 22:03] Sveta Smirnova
Thank you for the report.

Do you have options such as max_binlog_size or expire_log_days?
[6 Mar 2013 0:11] Mark Callaghan
I think the code is the problem. This is one example of the my.cnf that I used. I have get this  with both my IO-bound & cached update-only workloads produced by sysbench described at these links. This is easy to reproduce given sufficient concurrency.

http://mysqlha.blogspot.com/2013/02/mysql-56-io-bound-update-only-workloads.html
http://mysqlha.blogspot.com/2013/03/mysql-56-cached-update-only-workload.html

sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES 
innodb_log_file_size=1900M
innodb_max_dirty_pages_pct=80
innodb_file_format=barracuda
innodb_file_per_table
table-definition-cache=1000
table-open-cache=2000
max_connections=2000
key_buffer_size=200M
innodb_io_capacity=1000
innodb_flush_log_at_trx_commit=2
innodb_doublewrite=0
query_cache_size=0
query_cache_type=0
innodb_thread_concurrency=0
metadata_locks_hash_instances=256
innodb_checksum_algorithm=CRC32
log_bin
sync_binlog=1
innodb_flush_method=O_DIRECT_NO_FSYNC
query_cache_size=0
query_cache_type=0
innodb_thread_concurrency=32
innodb_buffer_pool_size=64g
innodb_io_capacity=8192
innodb_buffer_pool_instances=8
adaptive_hash_index=1
table_open_cache_instances=1
innodb_lru_scan_depth=8192
binlog_max_flush_queue_time=0
[6 Mar 2013 9:39] Shane Bester
Easy to repeat.  

Server startup:
------------------
mysqld --no-defaults --max-binlog-size=10M --sync-binlog=0 --skip-gr --skip-na --server-id=1  --innodb-flush-log-at-trx-commit=0 --innodb-support-xa=0 --innodb-checksums=0  --console --log-bin --binlog-format=statement

Test init:
---------
drop table if exists t1
create table t1(id int primary key,a char(255))engine=innodb
set global innodb_flush_log_at_trx_commit=0
set global innodb_lock_wait_timeout=1

In 20 threads:
--------------
insert ignore into t1 values(?,'?');  #replace with fake data

result:
----------
    10 487 529 test-bin.000001
        22 796 test-bin.000002
    10 517 383 test-bin.000003
        26 659 test-bin.000004
    10 521 964 test-bin.000005
           166 test-bin.000006
    10 489 948 test-bin.000007
    10 486 269 test-bin.000008
    10 487 351 test-bin.000009
    10 510 350 test-bin.000010
           166 test-bin.000011
    10 590 988 test-bin.000012
           166 test-bin.000013
    10 488 141 test-bin.000014
    10 491 228 test-bin.000015
    10 486 661 test-bin.000016
    10 510 749 test-bin.000017
           166 test-bin.000018
    10 487 192 test-bin.000019
    10 493 630 test-bin.000020
    10 485 826 test-bin.000021
    10 502 344 test-bin.000022
        49 300 test-bin.000023
    10 519 494 test-bin.000024
           166 test-bin.000025
    10 551 394 test-bin.000026
[6 Mar 2013 9:54] Shane Bester
testcase. you need a fast server. place binlogs/datadir on ramdisk if needed.

Attachment: bug68575.c (text/plain), 7.32 KiB.

[27 Mar 2013 4:08] Venkatesh Duggirala
patch to solve the issue

Attachment: patch.diff (text/x-patch), 471 bytes.

[27 Mar 2013 4:14] Venkatesh Duggirala
Hello Mark, 
Thank you for reporting the issue.
I have attached above a patch which will solve the issue. Could you please check whether it solves your problem or not? Please let us know incase if it does not.
[28 May 2013 6:22] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html
[28 May 2013 6:23] Jon Stephens
Fixed in 5.6+. Documented in the 5.6.12 and 5.7.2 changelogs as follows:

        Extra binary log rotations were performed due to concurrent
        attempts at rotation when the binary log became full, which were
        allowed to succeed. This could lead to the unnecessary creation
        of many small binary log files.

Closed.