Bug #56715 Concurrent transactions + FLUSH result in sporadical unwarranted deadlock errors
Submitted: 10 Sep 2010 14:32
Modified: 20 Nov 2010 22:51
Reporter: Dmitry Lenev
Status: Closed
Category: MySQL Server: Locking
Severity: S3 (Non-critical)
Version: 5.5.6-m3
OS: Any
Assigned to: Dmitry Lenev
CPU Architecture: Any

[10 Sep 2010 14:32] Dmitry Lenev
Description:
After Bug #56405 "Deadlock in the MDL deadlock detector" was fixed, transactions that run concurrently with a FLUSH TABLES statement may sporadically return an ER_LOCK_DEADLOCK error even though no deadlock exists and the deadlock search depth was not exceeded.

This is a negative consequence of the fix for bug #56405.

The issue occurs when two transactions simultaneously perform deadlock detection for MDL locks/flushes and simultaneously try to acquire the LOCK_open mutex. The losing thread is declared to have caused a deadlock, and its statement is aborted with an unwarranted ER_LOCK_DEADLOCK.
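
To illustrate the mechanism, here is a minimal Python sketch of a try-lock style workaround. It assumes, as the commit message later in this report describes, that the detector refuses to wait for LOCK_open when another thread already owns it and the search depth is greater than 0. The names `lock_open`, `inspect_flush_subgraph` and `demo` are hypothetical stand-ins, not server code.

```python
import threading

# Hypothetical stand-in for the server's LOCK_open mutex.
lock_open = threading.Lock()


def inspect_flush_subgraph(search_depth):
    """Model of the workaround: do not wait for LOCK_open in a nested search.

    Returns "ER_LOCK_DEADLOCK" when the mutex is busy and search_depth > 0,
    otherwise "no deadlock" (the sub-graph is assumed to contain no cycle).
    """
    if not lock_open.acquire(blocking=False):
        if search_depth > 0:
            # The loser gives up here, although no cycle was ever found.
            return "ER_LOCK_DEADLOCK"
        lock_open.acquire()  # a top-level search may still wait
    try:
        return "no deadlock"
    finally:
        lock_open.release()


def demo():
    uncontended = inspect_flush_subgraph(search_depth=1)
    with lock_open:  # another detector currently owns the mutex
        contended = inspect_flush_subgraph(search_depth=1)
    return uncontended, contended
```

`demo()` returns one result for the uncontended call and one for the call made while the mutex is held. In the second case the error is produced by contention alone, which matches the unwarranted ER_LOCK_DEADLOCK described above.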

How to repeat:
The following script for the MTR environment occasionally aborted with an unexpected ER_LOCK_DEADLOCK error for me.

###############################################

connect (con1,localhost,root,,test,,);
connect (con2,localhost,root,,test,,);
connect (con3,localhost,root,,test,,);
connect (con4,localhost,root,,test,,);
connect (con5,localhost,root,,test,,);

connection default;
create table t1 (i int);
create table t2 (i int);

connection con1;
lock table t1 read;

connection con2;
--send flush tables

connection con3;
let $wait_condition=
  select count(*) = 1 from information_schema.processlist
  where state = "Waiting for table flush" and
        info = "flush tables";
--source include/wait_condition.inc
--send alter table t1 add column j int;

--disable_query_log
--disable_result_log
let $wait_counter= 300;

while ($wait_counter)
{
  connection con4;
  begin;
  select * from t2;
  set @@lock_wait_timeout=1;

  connection con5;
  begin;
  select * from t2;
  set @@lock_wait_timeout=1;

  connection con4;
  --send insert into t1 values ()

  connection con5;
  --send insert into t1 values ()

  connection con4;
  --error ER_LOCK_WAIT_TIMEOUT
  --reap
  set @@lock_wait_timeout=default;
  commit;

  connection con5;
  --error ER_LOCK_WAIT_TIMEOUT
  --reap
  commit;
  set @@lock_wait_timeout=default;
  dec $wait_counter;
}

--enable_result_log
--enable_query_log

connection con1;
unlock tables;

connection con2;
--reap

connection con3;
--reap

connection default;
drop tables t1, t2;

#############################################################

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009

==============================================================================

TEST                                      RESULT   TIME (ms)
------------------------------------------------------------

create table t1 (i int);
create table t2 (i int);
lock table t1 read;
flush tables;
alter table t1 add column j int;;
main.1                                   [ fail ]
        Test ended at 2010-09-10 18:05:04

CURRENT_TEST: main.1
mysqltest: At line 61: query 'reap' failed with wrong errno 1213: 'Deadlock found when trying to get lock; try restarting transaction', instead of 1205...
[10 Sep 2010 22:40] Sveta Smirnova
Thank you for the report.

Verified as described.
[23 Sep 2010 20:04] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118974

3142 Dmitry Lenev	2010-09-23
      A better fix for bug #56405 "Deadlock in the MDL deadlock
      detector", which doesn't introduce bug #56715 "Concurrent
      transactions + FLUSH result in sporadical unwarranted
      deadlock errors".
      
      Deadlock could have occurred when workload containing mix
      of DML, DDL and FLUSH TABLES statements affecting same
      set of tables was executed in heavily concurrent environment.
      
      This deadlock occurred when several connections tried to
      perform deadlock detection in metadata locking subsystem.
      The first connection started traversing wait-for graph,
      encountered sub-graph representing wait for flush, acquired
      LOCK_open and dived into sub-graph inspection. Then it
      encountered a sub-graph corresponding to a wait for a metadata
      lock and blocked while trying to acquire a rd-lock on
      MDL_lock::m_rwlock protecting this sub-graph, since some
      other thread had a wr-lock on it. When this wr-lock was released
      it could have happened (if there was another pending wr-lock
      against this rwlock) that the rd-lock from the first connection
      was left unsatisfied but at the same time a new rd-lock request
      from the second connection sneaked in and was satisfied (for
      this to be possible the second rd-request should come exactly
      after the wr-lock is released but before the pending wr-lock
      manages to grab the rwlock, which is possible both on Linux and
      in our own rwlock implementation). If this second connection
      continued traversing wait-for graph and encountered sub-graph
      representing wait for flush it tried to acquire LOCK_open
      and thus deadlock was created.
      
      The previous patch tried to work around this problem by not
      allowing the deadlock detector to lock the LOCK_open mutex if
      some other thread doing deadlock detection already owns it and
      the current search depth is greater than 0. Instead, a deadlock
      was reported. As a result it introduced bug #56715.
      
      This patch solves this problem in a different way.
      It introduces a new rw_pr_lock_t implementation to be used
      by MDL subsystem instead of one based on Linux rwlocks or
      our own rwlock implementation. This new implementation
      never allows situation in which rwlock is rd-locked and
      there is a blocked pending rd-lock. Thus situation which
      has caused this bug becomes impossible with it.
      
      Because this implementation is optimized for the
      wr-lock/unlock scenario, which is the most common one in the
      MDL subsystem, it doesn't introduce noticeable performance
      regressions in sysbench tests. Moreover, it significantly
      improves the situation for the POINT_SELECT test when many
      connections are used.
      
      No test case is provided as this bug is very hard to repeat
      in MTR environment but is repeatable with the help of RQG
      tests.
      This patch also doesn't include test for bug #56715
      "Concurrent transactions + FLUSH result in sporadical
      unwarranted deadlock errors" as it takes too much time to
      be run as part of normal test-suite runs.
      
      QQ: Should we also remove support for preferring readers
          from my_rw_lock_t implementation?
     @ config.h.cmake
        We no longer need to check for presence of
        pthread_rwlockattr_setkind_np as we no longer
        use Linux-specific implementation of rw_pr_lock_t
        which uses this function.
     @ configure.cmake
        We no longer need to check for presence of
        pthread_rwlockattr_setkind_np as we no longer
        use Linux-specific implementation of rw_pr_lock_t
        which uses this function.
     @ configure.in
        We no longer need to check for presence of
        pthread_rwlockattr_setkind_np as we no longer
        use Linux-specific implementation of rw_pr_lock_t
        which uses this function.
     @ include/my_pthread.h
        Introduced new implementation of rw_pr_lock_t.
        Since it never allows situation in which rwlock is rd-locked
        and there is a blocked pending rd-lock it is not affected by
        bug #56405 "Deadlock in the MDL deadlock detector".
        This implementation is also optimized for the wr-lock/unlock
        scenario, which is the most common one in the MDL subsystem,
        so it doesn't introduce noticeable performance regressions in
        sysbench tests (compared to the old Linux-specific
        implementation). Moreover, it significantly improves the
        situation for the POINT_SELECT test when many connections are
        used.
     @ mysys/thr_rwlock.c
        Introduced new implementation of rw_pr_lock_t.
        Since it never allows situation in which rwlock is rd-locked
        and there is a blocked pending rd-lock it is not affected by
        bug #56405 "Deadlock in the MDL deadlock detector".
        This implementation is also optimized for the wr-lock/unlock
        scenario, which is the most common one in the MDL subsystem,
        so it doesn't introduce noticeable performance regressions in
        sysbench tests (compared to the old Linux-specific
        implementation). Moreover, it significantly improves the
        situation for the POINT_SELECT test when many connections are
        used.
[27 Sep 2010 7:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119140

3142 Dmitry Lenev	2010-09-27
      A better fix for bug #56405 "Deadlock in the MDL deadlock
      detector", which doesn't introduce bug #56715 "Concurrent
      transactions + FLUSH result in sporadical unwarranted
      deadlock errors".
      
      Deadlock could have occurred when workload containing mix
      of DML, DDL and FLUSH TABLES statements affecting same
      set of tables was executed in heavily concurrent environment.
      
      This deadlock occurred when several connections tried to
      perform deadlock detection in metadata locking subsystem.
      The first connection started traversing wait-for graph,
      encountered sub-graph representing wait for flush, acquired
      LOCK_open and dived into sub-graph inspection. Then it
      encountered a sub-graph corresponding to a wait for a metadata
      lock and blocked while trying to acquire a rd-lock on
      MDL_lock::m_rwlock protecting this sub-graph, since some
      other thread had a wr-lock on it. When this wr-lock was released
      it could have happened (if there was another pending wr-lock
      against this rwlock) that the rd-lock from the first connection
      was left unsatisfied but at the same time a new rd-lock request
      from the second connection sneaked in and was satisfied (for
      this to be possible the second rd-request should come exactly
      after the wr-lock is released but before the pending wr-lock
      manages to grab the rwlock, which is possible both on Linux and
      in our own rwlock implementation). If this second connection
      continued traversing wait-for graph and encountered sub-graph
      representing wait for flush it tried to acquire LOCK_open
      and thus deadlock was created.
      
      The previous patch tried to work around this problem by not
      allowing the deadlock detector to lock the LOCK_open mutex if
      some other thread doing deadlock detection already owns it and
      the current search depth is greater than 0. Instead, a deadlock
      was reported. As a result it introduced bug #56715.
      
      This patch solves this problem in a different way.
      It introduces a new rw_pr_lock_t implementation to be used
      by MDL subsystem instead of one based on Linux rwlocks or
      our own rwlock implementation. This new implementation
      never allows situation in which rwlock is rd-locked and
      there is a blocked pending rd-lock. Thus situation which
      has caused this bug becomes impossible with it.
      
      Because this implementation is optimized for the
      wr-lock/unlock scenario, which is the most common one in the
      MDL subsystem, it doesn't introduce noticeable performance
      regressions in sysbench tests. Moreover, it significantly
      improves the situation for the POINT_SELECT test when many
      connections are used.
      
      No test case is provided as this bug is very hard to repeat
      in MTR environment but is repeatable with the help of RQG
      tests.
      This patch also doesn't include test for bug #56715
      "Concurrent transactions + FLUSH result in sporadical
      unwarranted deadlock errors" as it takes too much time to
      be run as part of normal test-suite runs.
      
      QQ: Should we also remove support for preferring readers
          from my_rw_lock_t implementation?
     @ config.h.cmake
        We no longer need to check for presence of
        pthread_rwlockattr_setkind_np as we no longer
        use Linux-specific implementation of rw_pr_lock_t
        which uses this function.
     @ configure.cmake
        We no longer need to check for presence of
        pthread_rwlockattr_setkind_np as we no longer
        use Linux-specific implementation of rw_pr_lock_t
        which uses this function.
     @ configure.in
        We no longer need to check for presence of
        pthread_rwlockattr_setkind_np as we no longer
        use Linux-specific implementation of rw_pr_lock_t
        which uses this function.
     @ include/my_pthread.h
        Introduced new implementation of rw_pr_lock_t.
        Since it never allows situation in which rwlock is rd-locked
        and there is a blocked pending rd-lock it is not affected by
        bug #56405 "Deadlock in the MDL deadlock detector".
        This implementation is also optimized for the wr-lock/unlock
        scenario, which is the most common one in the MDL subsystem,
        so it doesn't introduce noticeable performance regressions in
        sysbench tests (compared to the old Linux-specific
        implementation). Moreover, it significantly improves the
        situation for the POINT_SELECT test when many connections are
        used.
        As part of this change removed try-lock part of API for
        this type of lock. It is not used in our code and it would
        be hard to implement correctly within constraints of new
        implementation.
     @ include/mysql/psi/mysql_thread.h
        Removed try-lock part of prlock API.
        It is not used in our code and it would be hard
        to implement correctly within constraints of new
        prlock implementation.
     @ mysys/thr_rwlock.c
        Introduced new implementation of rw_pr_lock_t.
        Since it never allows situation in which rwlock is rd-locked
        and there is a blocked pending rd-lock it is not affected by
        bug #56405 "Deadlock in the MDL deadlock detector".
        This implementation is also optimized for the wr-lock/unlock
        scenario, which is the most common one in the MDL subsystem,
        so it doesn't introduce noticeable performance regressions in
        sysbench tests (compared to the old Linux-specific
        implementation). Moreover, it significantly improves the
        situation for the POINT_SELECT test when many connections are
        used.
[27 Sep 2010 9:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119154

3142 Dmitry Lenev	2010-09-27
      A better fix for bug #56405 "Deadlock in the MDL deadlock
      detector", which doesn't introduce bug #56715 "Concurrent
      transactions + FLUSH result in sporadical unwarranted
      deadlock errors".
      
      Deadlock could have occurred when workload containing mix
      of DML, DDL and FLUSH TABLES statements affecting same
      set of tables was executed in heavily concurrent environment.
      
      This deadlock occurred when several connections tried to
      perform deadlock detection in metadata locking subsystem.
      The first connection started traversing wait-for graph,
      encountered sub-graph representing wait for flush, acquired
      LOCK_open and dived into sub-graph inspection. Then it
      encountered a sub-graph corresponding to a wait for a metadata
      lock and blocked while trying to acquire a rd-lock on
      MDL_lock::m_rwlock protecting this sub-graph, since some
      other thread had a wr-lock on it. When this wr-lock was released
      it could have happened (if there was another pending wr-lock
      against this rwlock) that the rd-lock from the first connection
      was left unsatisfied but at the same time a new rd-lock request
      from the second connection sneaked in and was satisfied (for
      this to be possible the second rd-request should come exactly
      after the wr-lock is released but before the pending wr-lock
      manages to grab the rwlock, which is possible both on Linux and
      in our own rwlock implementation). If this second connection
      continued traversing wait-for graph and encountered sub-graph
      representing wait for flush it tried to acquire LOCK_open
      and thus deadlock was created.
      
      The previous patch tried to work around this problem by not
      allowing the deadlock detector to lock the LOCK_open mutex if
      some other thread doing deadlock detection already owns it and
      the current search depth is greater than 0. Instead, a deadlock
      was reported. As a result it introduced bug #56715.
      
      This patch solves this problem in a different way.
      It introduces a new rw_pr_lock_t implementation to be used
      by MDL subsystem instead of one based on Linux rwlocks or
      our own rwlock implementation. This new implementation
      never allows situation in which rwlock is rd-locked and
      there is a blocked pending rd-lock. Thus situation which
      has caused this bug becomes impossible with it.
      
      Because this implementation is optimized for the
      wr-lock/unlock scenario, which is the most common one in the
      MDL subsystem, it doesn't introduce noticeable performance
      regressions in sysbench tests. Moreover, it significantly
      improves the situation for the POINT_SELECT test when many
      connections are used.
      
      No test case is provided as this bug is very hard to repeat
      in MTR environment but is repeatable with the help of RQG
      tests.
      This patch also doesn't include test for bug #56715
      "Concurrent transactions + FLUSH result in sporadical
      unwarranted deadlock errors" as it takes too much time to
      be run as part of normal test-suite runs.
     @ config.h.cmake
        We no longer need to check for presence of
        pthread_rwlockattr_setkind_np as we no longer
        use Linux-specific implementation of rw_pr_lock_t
        which uses this function.
     @ configure.cmake
        We no longer need to check for presence of
        pthread_rwlockattr_setkind_np as we no longer
        use Linux-specific implementation of rw_pr_lock_t
        which uses this function.
     @ configure.in
        We no longer need to check for presence of
        pthread_rwlockattr_setkind_np as we no longer
        use Linux-specific implementation of rw_pr_lock_t
        which uses this function.
     @ include/my_pthread.h
        Introduced new implementation of rw_pr_lock_t.
        Since it never allows situation in which rwlock is rd-locked
        and there is a blocked pending rd-lock it is not affected by
        bug #56405 "Deadlock in the MDL deadlock detector".
        This implementation is also optimized for the wr-lock/unlock
        scenario, which is the most common one in the MDL subsystem,
        so it doesn't introduce noticeable performance regressions in
        sysbench tests (compared to the old Linux-specific
        implementation). Moreover, it significantly improves the
        situation for the POINT_SELECT test when many connections are
        used.
        As part of this change removed try-lock part of API for
        this type of lock. It is not used in our code and it would
        be hard to implement correctly within constraints of new
        implementation.
        Finally, removed support of preferring readers from
        my_rw_lock_t implementation as the only user of this
        feature was old rw_pr_lock_t implementation.
     @ include/mysql/psi/mysql_thread.h
        Removed try-lock part of prlock API.
        It is not used in our code and it would be hard
        to implement correctly within constraints of new
        prlock implementation.
     @ mysys/thr_rwlock.c
        Introduced new implementation of rw_pr_lock_t.
        Since it never allows situation in which rwlock is rd-locked
        and there is a blocked pending rd-lock it is not affected by
        bug #56405 "Deadlock in the MDL deadlock detector".
        This implementation is also optimized for the wr-lock/unlock
        scenario, which is the most common one in the MDL subsystem,
        so it doesn't introduce noticeable performance regressions in
        sysbench tests (compared to the old Linux-specific
        implementation). Moreover, it significantly improves the
        situation for the POINT_SELECT test when many connections are
        used.
        Also removed support of preferring readers from
        my_rw_lock_t implementation as the only user of this
        feature was old rw_pr_lock_t implementation.
[29 Sep 2010 12:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119406

3148 Dmitry Lenev	2010-09-29
      A better fix for bug #56405 "Deadlock in the MDL deadlock
      detector" that doesn't introduce bug #56715 "Concurrent
      transactions + FLUSH result in sporadical unwarranted
      deadlock errors".
      
      Deadlock could have occurred when workload containing a mix
      of DML, DDL and FLUSH TABLES statements affecting the same
      set of tables was executed in a heavily concurrent environment.
      
      This deadlock occurred when several connections tried to
      perform deadlock detection in the metadata locking subsystem.
      The first connection started traversing wait-for graph,
      encountered a sub-graph representing a wait for flush, acquired
      LOCK_open and dived into sub-graph inspection. Then it
      encountered a sub-graph corresponding to a wait for a metadata
      lock and blocked while trying to acquire a rd-lock on
      MDL_lock::m_rwlock, since some other thread had a wr-lock on it.
      When this wr-lock was released it could have happened (if there
      was another pending wr-lock against this rwlock) that the rd-lock
      from the first connection was left unsatisfied but at the same
      time a new rd-lock request from the second connection sneaked
      in and was satisfied (for this to be possible the second
      rd-request should come exactly after the wr-lock is released but
      before the pending wr-lock manages to grab the rwlock, which is
      possible both on Linux and in our own rwlock implementation).
      If this second connection continued traversing the wait-for graph
      and encountered a sub-graph representing a wait for flush it tried
      to acquire LOCK_open and thus the deadlock was created.
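
The circular wait described in this paragraph can be written down as a small wait-for graph. The sketch below is only an illustration in Python: the node labels and the `find_cycle` helper are hypothetical, and the depth-first search is a generic cycle check, not the MDL deadlock detector itself.

```python
def find_cycle(wait_for, start):
    """Depth-first search for a cycle reachable from `start`.

    `wait_for` maps each waiter to the parties it is waiting for.
    Returns the cycle as a list of nodes, or None if there is none.
    """
    path = []
    on_path = set()
    visited = set()

    def visit(node):
        if node in on_path:
            # Back edge: the cycle is the tail of the current path.
            return path[path.index(node):] + [node]
        if node in visited:
            return None
        visited.add(node)
        path.append(node)
        on_path.add(node)
        for nxt in wait_for.get(node, ()):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        on_path.remove(node)
        return None

    return visit(start)


# Old behaviour (labels are illustrative assumptions):
old_graph = {
    # holds LOCK_open, its rd-lock request is queued behind the pending wr-lock
    "detector 1": ["pending writer"],
    # the pending wr-lock waits for the rd-lock that sneaked in
    "pending writer": ["detector 2"],
    # holds the rd-lock on m_rwlock, waits for LOCK_open
    "detector 2": ["detector 1"],
}

# With a lock that always admits readers while it is rd-locked,
# detector 1 no longer waits for the pending writer.
new_graph = {
    "pending writer": ["detector 1", "detector 2"],
    "detector 2": ["detector 1"],
}

old_cycle = find_cycle(old_graph, "detector 1")
new_cycle = find_cycle(new_graph, "detector 2")
```

In the first graph the search finds the three-party cycle; in the second graph the first detector has no outgoing edge, so no cycle is reachable.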
      
      The previous patch tried to work around this problem by not
      allowing the deadlock detector to lock the LOCK_open mutex if
      some other thread doing deadlock detection already owns it and
      the current search depth is greater than 0. Instead, a deadlock
      was reported. As a result it introduced bug #56715.
      
      This patch solves this problem in a different way.
      It introduces a new rw_pr_lock_t implementation to be used
      by MDL subsystem instead of one based on Linux rwlocks or
      our own rwlock implementation. This new implementation
      never allows situation in which an rwlock is rd-locked and
      there is a blocked pending rd-lock. Thus the situation which
      has caused this bug becomes impossible with this implementation.
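
The property stated above (a read request never blocks while the lock is rd-locked, even if a writer is already waiting) can be sketched as follows. This is a minimal Python model under stated assumptions, not the code from mysys/thr_rwlock.c: the class name, method names and the `demo` helper are hypothetical, and a mutex plus a condition variable is just one plausible way to get this behaviour.

```python
import threading


class ReaderPreferringLock:
    """Hypothetical sketch of a reader-preferring read/write lock.

    Invariant: a read request never stays blocked while the lock is
    read-locked, even when a writer is already waiting.
    """

    def __init__(self):
        self._mutex = threading.Lock()
        self._no_readers = threading.Condition(self._mutex)
        self._active_readers = 0

    def rdlock(self):
        # A reader only bumps the counter. It waits for the mutex either
        # briefly (another thread updates the counter) or for as long as
        # a writer holds the lock.
        with self._mutex:
            self._active_readers += 1

    def rdunlock(self):
        with self._mutex:
            self._active_readers -= 1
            if self._active_readers == 0:
                self._no_readers.notify_all()  # wake waiting writers

    def wrlock(self):
        # The writer keeps the mutex for the whole write-locked period.
        self._mutex.acquire()
        while self._active_readers:
            self._no_readers.wait()  # releases the mutex while waiting

    def wrunlock(self):
        self._mutex.release()


def demo():
    lock = ReaderPreferringLock()
    events = []

    def writer():
        lock.wrlock()
        events.append("writer")
        lock.wrunlock()

    lock.rdlock()                 # first reader holds the lock
    t = threading.Thread(target=writer)
    t.start()                     # writer becomes pending
    lock.rdlock()                 # second reader is still admitted
    events.append("reader2")
    lock.rdunlock()
    lock.rdunlock()               # last reader leaves, writer proceeds
    t.join()
    return events
```

In `demo()` the second read request is granted although a writer may already be waiting, and the writer only proceeds after both readers have left. The uncontended write path takes and releases a single mutex, which is in line with the note that the wr-lock/unlock scenario is the one being optimized.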
      
      Because this implementation is optimized for the
      wr-lock/unlock scenario, which is the most common one in the
      MDL subsystem, it doesn't introduce noticeable performance
      regressions in sysbench tests. Moreover, it significantly
      improves the situation for the POINT_SELECT test when many
      connections are used.
      
      No test case is provided as this bug is very hard to repeat
      in MTR environment but is repeatable with the help of RQG
      tests.
      This patch also doesn't include a test for bug #56715
      "Concurrent transactions + FLUSH result in sporadical
      unwarranted deadlock errors" as it takes too much time to
      be run as part of normal test-suite runs.
     @ config.h.cmake
        We no longer need to check for presence of
        pthread_rwlockattr_setkind_np as we no longer
        use Linux-specific implementation of rw_pr_lock_t
        which uses this function.
     @ configure.cmake
        We no longer need to check for presence of
        pthread_rwlockattr_setkind_np as we no longer
        use Linux-specific implementation of rw_pr_lock_t
        which uses this function.
     @ configure.in
        We no longer need to check for presence of
        pthread_rwlockattr_setkind_np as we no longer
        use Linux-specific implementation of rw_pr_lock_t
        which uses this function.
     @ include/my_pthread.h
        Introduced new implementation of rw_pr_lock_t.
        Since it never allows situation in which rwlock is rd-locked
        and there is a blocked pending rd-lock it is not affected by
        bug #56405 "Deadlock in the MDL deadlock detector".
        This implementation is also optimized for the wr-lock/unlock
        scenario, which is the most common one in the MDL subsystem,
        so it doesn't introduce noticeable performance regressions in
        sysbench tests (compared to the old Linux-specific
        implementation). Moreover, it significantly improves the
        situation for the POINT_SELECT test when many connections are
        used.
        As part of this change removed try-lock part of API for
        this type of lock. It is not used in our code and it would
        be hard to implement correctly within constraints of new
        implementation.
        Finally, removed support of preferring readers from
        my_rw_lock_t implementation as the only user of this
        feature was old rw_pr_lock_t implementation.
     @ include/mysql/psi/mysql_thread.h
        Removed try-lock part of prlock API.
        It is not used in our code and it would be hard
        to implement correctly within constraints of new
        prlock implementation.
     @ mysys/thr_rwlock.c
        Introduced new implementation of rw_pr_lock_t.
        Since it never allows situation in which rwlock is rd-locked
        and there is a blocked pending rd-lock it is not affected by
        bug #56405 "Deadlock in the MDL deadlock detector".
        This implementation is also optimized for the wr-lock/unlock
        scenario, which is the most common one in the MDL subsystem,
        so it doesn't introduce noticeable performance regressions in
        sysbench tests (compared to the old Linux-specific
        implementation). Moreover, it significantly improves the
        situation for the POINT_SELECT test when many connections are
        used.
        Also removed support of preferring readers from
        my_rw_lock_t implementation as the only user of this
        feature was old rw_pr_lock_t implementation.
[29 Sep 2010 12:16] Dmitry Lenev
Fix for this bug was queued into mysql-5.5-runtime tree.
[9 Nov 2010 19:48] Bugs System
Pushed into mysql-5.5 5.5.7-rc (revid:sunanda.menon@sun.com-20101109182959-otkxq8vo2dcd13la) (version source revid:marko.makela@oracle.com-20100824081003-v4ecy0tga99cpxw2) (merge vers: 5.1.50) (pib:21)
[12 Nov 2010 0:34] Paul DuBois
Noted in 5.5.7 changelog.

Deadlock could occur for a workload consisting of a mix of DML, DDL,
and FLUSH TABLES statements affecting the same set of tables in a
heavily concurrent environment.
[13 Nov 2010 16:08] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:marko.makela@oracle.com-20100824081003-v4ecy0tga99cpxw2) (merge vers: 5.1.50) (pib:21)
[13 Nov 2010 16:39] Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:marko.makela@oracle.com-20100824081003-v4ecy0tga99cpxw2) (pib:21)