MySQL Bugs: #50038: Deadlock on flush logs with concurrent DML and RBR

Bug #50038	Deadlock on flush logs with concurrent DML and RBR
Submitted:	4 Jan 2010 2:43	Modified:	7 Mar 2010 23:58
Reporter:	Elena Stepanova	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Row Based Replication ( RBR )	Severity:	S2 (Serious)
Version:	mysql-5.1-rep+3	OS:	Any
Assigned to:	Alfranio Tavares Correia Junior	CPU Architecture:	Any

Description:
Server deadlocks when FLUSH LOGS is executed concurrently with DML flow.

Stack trace for some threads (full output is attached):

Thread 4 (process 22634):
#0  0x00000033a8c0c888 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
#1  0x00000033a8c088a5 in _L_mutex_lock_107 () from /lib64/libpthread.so.0
#2  0x00000033a8c08333 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x000000000069e002 in MYSQL_BIN_LOG::write (this=0xea16a0, thd=0x0, cache=0x33a8c0f838, commit_event=0xffffffffffffffff,
    incident=false) at log.cc:4839
#4  0x000000000069e34a in binlog_flush_stmt_cache (thd=0x15661bf0, cache_mngr=<value optimized out>) at log.cc:1634
#5  0x000000000069e8e9 in binlog_commit (hton=<value optimized out>, thd=0x15661bf0, all=false) at log.cc:1682
#6  0x00000000006dfb62 in ha_commit_one_phase (thd=0x15661bf0, all=false) at handler.cc:1249
#7  0x00000000006dfdbe in ha_commit_trans (thd=0x15661bf0, all=false) at handler.cc:1207
#8  0x00000000006e0028 in ha_autocommit_or_rollback (thd=0x15661bf0, error=0) at handler.cc:1384
#9  0x000000000064e257 in close_thread_tables (thd=0x15661bf0) at sql_base.cc:1287
#10 0x000000000073d28a in sp_lex_keeper::reset_lex_and_exec_core (this=0x1569fd70, thd=0x15661bf0, nextp=0x452f11a8,
    open_tables=<value optimized out>, instr=0x1569fd30) at sp_head.cc:2740
#11 0x00000000007425f8 in sp_instr_stmt::execute (this=0x1569fd30, thd=0x15661bf0, nextp=0x452f11a8) at sp_head.cc:2847
#12 0x0000000000740b01 in sp_head::execute (this=0x1509aaa0, thd=0x15661bf0) at sp_head.cc:1253
#13 0x0000000000741261 in sp_head::execute_procedure (this=0x1509aaa0, thd=0x15661bf0, args=0x15663ef0) at sp_head.cc:1983
#14 0x0000000000614361 in mysql_execute_command (thd=0x15661bf0) at sql_parse.cc:4233
#15 0x0000000000618b6a in mysql_parse (thd=0x15661bf0, inBuf=0x1564a3b0 "CALL ins_tb7_eng1(8,'yVuZ')", length=27,
    found_semicolon=0x452f2e50) at sql_parse.cc:5813
#16 0x000000000061924b in dispatch_command (command=COM_QUERY, thd=0x15661bf0, packet=<value optimized out>,
    packet_length=<value optimized out>) at sql_parse.cc:1130
#17 0x0000000000619e54 in do_command (thd=0x15661bf0) at sql_parse.cc:805
#18 0x000000000060d6f7 in handle_one_connection (arg=0x15661bf0) at sql_connect.cc:1127
#19 0x00000033a8c062f7 in start_thread () from /lib64/libpthread.so.0
#20 0x00000033a80d1b6d in clone () from /lib64/libc.so.6

Thread 3 (process 22635):
#0  0x00000033a8c0a496 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000069d5fb in MYSQL_BIN_LOG::new_file_impl (this=0xea16a0, need_lock=false) at log.cc:3790
#2  0x000000000069d7ac in MYSQL_BIN_LOG::rotate_and_purge (this=0xea16a0, flags=2) at log.cc:4594
#3  0x000000000060ea54 in reload_acl_and_cache (thd=0x1564c3c0, options=16130, tables=0x0, write_to_binlog=0x4533314f)
    at sql_parse.cc:6701
#4  0x00000000006136c7 in mysql_execute_command (thd=0x1564c3c0) at sql_parse.cc:3823
#5  0x0000000000618b6a in mysql_parse (thd=0x1564c3c0, inBuf=0x15656cc0 "flush logs", length=10, found_semicolon=0x45333e50)
    at sql_parse.cc:5813
#6  0x000000000061924b in dispatch_command (command=COM_QUERY, thd=0x1564c3c0, packet=<value optimized out>,
    packet_length=<value optimized out>) at sql_parse.cc:1130
#7  0x0000000000619e54 in do_command (thd=0x1564c3c0) at sql_parse.cc:805
#8  0x000000000060d6f7 in handle_one_connection (arg=0x1564c3c0) at sql_connect.cc:1127
#9  0x00000033a8c062f7 in start_thread () from /lib64/libpthread.so.0
#10 0x00000033a80d1b6d in clone () from /lib64/libc.so.6

Thread 2 (process 22640):
#0  0x00000033a8c0c888 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
#1  0x00000033a8c088a5 in _L_mutex_lock_107 () from /lib64/libpthread.so.0
#2  0x00000033a8c08333 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00000000006b9b23 in Log_event::read_log_event (file=0x45374100, packet=0x15659150, log_lock=0xea16a8) at log_event.cc:978
#4  0x000000000072492d in mysql_binlog_send (thd=0x15658cd0, log_ident=<value optimized out>, pos=4, flags=0) at sql_repl.cc:663
#5  0x000000000061977e in dispatch_command (command=COM_BINLOG_DUMP, thd=0x15658cd0, packet=<value optimized out>,
    packet_length=<value optimized out>) at sql_parse.cc:1322
#6  0x0000000000619e54 in do_command (thd=0x15658cd0) at sql_parse.cc:805
#7  0x000000000060d6f7 in handle_one_connection (arg=0x15658cd0) at sql_connect.cc:1127
#8  0x00000033a8c062f7 in start_thread () from /lib64/libpthread.so.0
#9  0x00000033a80d1b6d in clone () from /lib64/libc.so.6

The problem was initially observed while checking fix for bug#49186 and mentioned there in the comment of 5 Dec 2009.

Could not reproduce with 5.1.42 or 5.5.0.

How to repeat:
Test case and data file are attached.
Put tb1.txt into <mysql-basedir>/mysql-test folder, test file into a test folder, and run the test as

perl ./mysql-test-run.pl --mysqld=--innodb --mysqld=--binlog_format=row --testcase-timeout=3 rpl_deadlock 

If there is no deadlock, the test takes roughly about a minute, and fails with simple mismatch. If the deadlock occurs, the test aborts upon timeout.

If you don't observe the problem at once, try to increase $attempts value at the top of the test file.

If you are running the test on a slow system, you might want to increase testcase-timeout to avoid false negatives.

Additional monitoring:

The test performs a loop of <$attempts> steps doing modest INSERTs and concurrent FLUSH LOGS calls, which means that the process list should not show any hanging DML statements, and new binary logs should appear in the data folder in rapid fashion. If creation of new binary logs stops before <$attempts> number of logs is reached, it most likely indicates that the problem has been hit. For some time it is still possible to connect to the server and see the process list, which shows the same DML statement with increasing 'Time' value, and 'Finished reading one binlog; switching to next binlog' for the 'Binlog Dump' thread. After a while, SHOW PROCESSLIST command starts hanging, as well as new connections to the server.

Test file

Attachment: rpl_deadlock.test (application/octet-stream, text), 1.94 KiB.

Data file to put into <mysql-basedir>/mysql-test

Attachment: tb1.txt (text/plain), 186.96 KiB.

All threads bt

Attachment: all_threads.out (application/octet-stream, text), 10.52 KiB.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/95917

3130 Alfranio Correia	2010-01-05
      BUG#50038 Deadlock on flush logs with concurrent DML and RBR
      
      In auto-commit mode, updating both trx and non-trx tables (i.e. issuing a mixed statement)
      causes the following sequence of events:
      
      1 - Flush trx changes (MYSQL_BIN_LOG::write):
        1.1 - mutex_lock (&LOCK_log)
        1.2 - mutex_lock (&LOCK_prep_xids)
        1.3 - increase prepared_xids
        1.4 - mutex_unlock (&LOCK_prep_xids)
        1.5 - mutex_unlock (&LOCK_log)
      
      2 - Flush non-trx changes (MYSQL_BIN_LOG::write):
        2.1 - mutex_lock (&LOCK_log)
        2.2 - mutex_unlock (&LOCK_log)
      
      If a FLUSH logs happens between the two flushes, a deadlock may arise due to the following
      sequence of events while processing the command:
      
      3 - FLUSH logs command (MYSQL_BIN_LOG::new_file_impl):
        3.1 - mutex_lock (&LOCK_log)
        3.2 - mutex_lock (&LOCK_prep_xids)
        3.3 - while (prepared_xids)  pthread_cond_wait(..., &LOCK_prep_xids);
        3.4 - mutex_unlock (&LOCK_prep_xids)
        3.5 - mutex_unlock (&LOCK_log)
      
      The execution will pass the while (item 3.3), when the prepared_xids is decreased what
      will happen after the non-trx changes are flushed. However, the transaction cannot
      progress because the mutex LOCK_log is grabbed on behalf of the FLUSH command.
      
      To fix the problem, we ensure that the non-trx changes are always flushed before the
      trx changes.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/95954

3130 Alfranio Correia	2010-01-05
      BUG#50038 Deadlock on flush logs with concurrent DML and RBR
      
      In auto-commit mode, updating both trx and non-trx tables (i.e. issuing a mixed
      statement) causes the following sequence of events:
      
      1 - "Flush trx changes" (MYSQL_BIN_LOG::write) - T1:
        1.1 - mutex_lock (&LOCK_log)
        1.2 - mutex_lock (&LOCK_prep_xids)
        1.3 - increase prepared_xids
        1.4 - mutex_unlock (&LOCK_prep_xids)
        1.5 - mutex_unlock (&LOCK_log)
      
      2 - "Flush non-trx changes" (MYSQL_BIN_LOG::write) - T1:
        2.1 - mutex_lock (&LOCK_log)
        2.2 - mutex_unlock (&LOCK_log)
      
      3. "unlog" - T1
        3.1 - mutex_lock (&LOCK_prep_xids)
        3.2 - decrease prepared xids
        3.3 - pthread_cond_signal(&COND_prep_xids);
        3.4 - mutex_unlock (&LOCK_prep_xids)
      
      The "FLUSH logs" command produces the following sequence of events:
      
      1 - "FLUSH logs" command (MYSQL_BIN_LOG::new_file_impl) - user thread:
        1.1 - mutex_lock (&LOCK_log)
        1.2 - mutex_lock (&LOCK_prep_xids)
        1.3 - while (prepared_xids)  pthread_cond_wait(..., &LOCK_prep_xids);
        1.4 - mutex_unlock (&LOCK_prep_xids)
        1.5 - mutex_unlock (&LOCK_log)
      
      A deadlock will arise if T1 flushes the trx changes and thus increases
      prepared_xids but before it is able to continue the execution and flush the
      non-trx changes, an user thread calls the "FLUSH logs" command and wait that
      the prepared_xids is decreased and gets to zero. However, T1 cannot proceed
      with the call to "Flush non-trx changes" because it will block in the mutex
      "LOCK_log" and by consequence cannot complete the execution and call the
      unlog to decrease the prepared_xids.
      
      To fix the problem, we ensure that the non-trx changes are always flushed
      before the trx changes.
      
      Note that if you call "Flush non-trx changes" and a concurrent "FLUSH logs" is
      issued, the "Flush non-trx changes" may block, but a deadlock will never happen
      because the prepared_xids will eventually get to zero. Bottom line, there will
      not be any transaction able to increase the prepared_xids because they will
      block in the mutex "LOCK_log" (MYSQL_BIN_LOG::write) and those that increased
      the prepared_xids will eventually commit and decrease the prepared_xids.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/96003

3130 Alfranio Correia	2010-01-05
      BUG#50038 Deadlock on flush logs with concurrent DML and RBR
      
      In auto-commit mode, updating both trx and non-trx tables (i.e. issuing a mixed
      statement) causes the following sequence of events:
      
      1 - "Flush trx changes" (MYSQL_BIN_LOG::write) - T1:
        1.1 - mutex_lock (&LOCK_log)
        1.2 - mutex_lock (&LOCK_prep_xids)
        1.3 - increase prepared_xids
        1.4 - mutex_unlock (&LOCK_prep_xids)
        1.5 - mutex_unlock (&LOCK_log)
      
      2 - "Flush non-trx changes" (MYSQL_BIN_LOG::write) - T1:
        2.1 - mutex_lock (&LOCK_log)
        2.2 - mutex_unlock (&LOCK_log)
      
      3. "unlog" - T1
        3.1 - mutex_lock (&LOCK_prep_xids)
        3.2 - decrease prepared xids
        3.3 - pthread_cond_signal(&COND_prep_xids);
        3.4 - mutex_unlock (&LOCK_prep_xids)
      
      The "FLUSH logs" command produces the following sequence of events:
      
      1 - "FLUSH logs" command (MYSQL_BIN_LOG::new_file_impl) - user thread:
        1.1 - mutex_lock (&LOCK_log)
        1.2 - mutex_lock (&LOCK_prep_xids)
        1.3 - while (prepared_xids)  pthread_cond_wait(..., &LOCK_prep_xids);
        1.4 - mutex_unlock (&LOCK_prep_xids)
        1.5 - mutex_unlock (&LOCK_log)
      
      A deadlock will arise if T1 flushes the trx changes and thus increases
      prepared_xids but before it is able to continue the execution and flush the
      non-trx changes, an user thread calls the "FLUSH logs" command and wait that
      the prepared_xids is decreased and gets to zero. However, T1 cannot proceed
      with the call to "Flush non-trx changes" because it will block in the mutex
      "LOCK_log" and by consequence cannot complete the execution and call the
      unlog to decrease the prepared_xids.
      
      To fix the problem, we ensure that the non-trx changes are always flushed
      before the trx changes.
      
      Note that if you call "Flush non-trx changes" and a concurrent "FLUSH logs" is
      issued, the "Flush non-trx changes" may block, but a deadlock will never happen
      because the prepared_xids will eventually get to zero. Bottom line, there will
      not be any transaction able to increase the prepared_xids because they will
      block in the mutex "LOCK_log" (MYSQL_BIN_LOG::write) and those that increased
      the prepared_xids will eventually commit and decrease the prepared_xids.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/96007

3130 Alfranio Correia	2010-01-05
      BUG#50038 Deadlock on flush logs with concurrent DML and RBR
      
      In auto-commit mode, updating both trx and non-trx tables (i.e. issuing a mixed
      statement) causes the following sequence of events:
      
      1 - "Flush trx changes" (MYSQL_BIN_LOG::write) - T1:
        1.1 - mutex_lock (&LOCK_log)
        1.2 - mutex_lock (&LOCK_prep_xids)
        1.3 - increase prepared_xids
        1.4 - mutex_unlock (&LOCK_prep_xids)
        1.5 - mutex_unlock (&LOCK_log)
      
      2 - "Flush non-trx changes" (MYSQL_BIN_LOG::write) - T1:
        2.1 - mutex_lock (&LOCK_log)
        2.2 - mutex_unlock (&LOCK_log)
      
      3. "unlog" - T1
        3.1 - mutex_lock (&LOCK_prep_xids)
        3.2 - decrease prepared xids
        3.3 - pthread_cond_signal(&COND_prep_xids);
        3.4 - mutex_unlock (&LOCK_prep_xids)
      
      The "FLUSH logs" command produces the following sequence of events:
      
      1 - "FLUSH logs" command (MYSQL_BIN_LOG::new_file_impl) - user thread:
        1.1 - mutex_lock (&LOCK_log)
        1.2 - mutex_lock (&LOCK_prep_xids)
        1.3 - while (prepared_xids)  pthread_cond_wait(..., &LOCK_prep_xids);
        1.4 - mutex_unlock (&LOCK_prep_xids)
        1.5 - mutex_unlock (&LOCK_log)
      
      A deadlock will arise if T1 flushes the trx changes and thus increases
      prepared_xids but before it is able to continue the execution and flush the
      non-trx changes, an user thread calls the "FLUSH logs" command and wait that
      the prepared_xids is decreased and gets to zero. However, T1 cannot proceed
      with the call to "Flush non-trx changes" because it will block in the mutex
      "LOCK_log" and by consequence cannot complete the execution and call the
      unlog to decrease the prepared_xids.
      
      To fix the problem, we ensure that the non-trx changes are always flushed
      before the trx changes.
      
      Note that if you call "Flush non-trx changes" and a concurrent "FLUSH logs" is
      issued, the "Flush non-trx changes" may block, but a deadlock will never happen
      because the prepared_xids will eventually get to zero. Bottom line, there will
      not be any transaction able to increase the prepared_xids because they will
      block in the mutex "LOCK_log" (MYSQL_BIN_LOG::write) and those that increased
      the prepared_xids will eventually commit and decrease the prepared_xids.

Pushed into 6.0.14-alpha (revid:alik@ibmvm-20100118120357-hnzhgadkpzqfnvsc) (version source revid:alik@ibmvm-20100118115413-kd3klpine09yyktw) (merge vers: 6.0.14-alpha) (pib:16)

Pushed into mysql-next-mr (revid:alik@ibmvm-20100118120111-73dulkgc893it4r9) (version source revid:alik@ibmvm-20100118115335-0stecyzftqm7bqx6) (pib:16)

Documented bugfix in the 5.6.0 and 6.0.14 changelogs as follows:

        The server could deadlock when FLUSH LOGS was executed
        concurrently with DML statements. To fix this problem,
        non-transactional changes are now always flushed before
        transactional changes.

Set NDI state, waiting for 5.1 merge.

Closed per IRC discussion with Alfranio, no further merges expected.

Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@linux-rbsx-20100118220048-5vnyqi5ghsbgmdsd) (merge vers: 5.5.99-m3) (pib:16)

Moved 5.6.0 changelog entry to 5.5.3.