Bug #49536 Mutex deadlock when expire_logs_days is used
Submitted: 8 Dec 2009 16:23 Modified: 15 Mar 2010 10:52
Reporter: Andrew Hutchings
Status: Closed
Category:MySQL Cluster: Replication Severity:S1 (Critical)
Version: OS:Any
Assigned to: Jonas Oreland CPU Architecture:Any

[8 Dec 2009 16:23] Andrew Hutchings
Description:
When expire_logs_days is set, the thread doing the expiry can cause a mutex deadlock that stops all binlog operations. This only happens with a cluster mysqld, because the lock gets stuck with ndb_binlog_index.

Backtrace to be attached shortly.

How to repeat:
1. Set expire_logs_days=1
2. Run FLUSH LOGS
3. Set the system date forward by one day
4. Run FLUSH LOGS again

* Hangs here *
[8 Dec 2009 16:24] Andrew Hutchings
Backtrace of thread causing issue (upon execution of FLUSH LOGS)

Attachment: bt.txt (text/plain), 3.15 KiB.

[9 Dec 2009 10:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/93294

3189 Jonas Oreland	2009-12-09
      bug#49536 - deadlock on rotate_and_purge when using expire_logs_days
        The problem is that the purge_logs implementation in ndb (ndbcluster_binlog_index_purge_file)
          calls mysql_parse (with (thd->options & OPTION_BIN_LOG) == 0),
          but MYSQL_BIN_LOG first takes LOCK_log and only then checks thd->options.
      
        The solution in this patch changes rotate_and_purge so that it does not hold
          LOCK_log when calling purge_logs_before_date. I think this is safe,
          as other "purge" functions are called without holding LOCK_log, e.g. purge_master_logs.
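
The lock ordering described in this commit message can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not the server code: ModelMutex and BinLogModel are hypothetical names, and the method names merely mirror the functions mentioned above. The model mutex reports failure in the place where a real non-recursive mutex would block forever.

```cpp
// Single-threaded model of a non-recursive mutex (hypothetical helper).
// try_lock() returns false where a real mutex would block indefinitely
// because the same thread already holds it.
struct ModelMutex {
    bool held = false;
    bool try_lock() {
        if (held) return false;
        held = true;
        return true;
    }
    void unlock() { held = false; }
};

// Hypothetical stand-in for the binary log object.
struct BinLogModel {
    ModelMutex lock_log;  // stands in for LOCK_log

    // Models the binlog write path: LOCK_log is taken first, and only then
    // is the session's OPTION_BIN_LOG flag checked. The lock is therefore
    // needed even when nothing is going to be logged.
    bool write(bool option_bin_log) {
        if (!lock_log.try_lock()) return false;  // real server: hangs here
        if (option_bin_log) {
            // an event would be appended to the log here
        }
        lock_log.unlock();
        return true;
    }

    // Models the NDB purge hook, which runs a statement with binary
    // logging switched off and so ends up in write(false).
    bool purge_logs_before_date() { return write(false); }

    // Before the patch: the purge runs while LOCK_log is still held.
    bool rotate_and_purge_old() {
        lock_log.try_lock();
        // ... rotate the log ...
        bool ok = purge_logs_before_date();  // fails: lock already held
        lock_log.unlock();
        return ok;
    }

    // After the patch: LOCK_log is released before the purge is called.
    bool rotate_and_purge_new() {
        lock_log.try_lock();
        // ... rotate the log ...
        lock_log.unlock();
        return purge_logs_before_date();
    }
};
```

In this model, rotate_and_purge_old() returns false (the point where the real server would hang), while rotate_and_purge_new() returns true.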
[9 Dec 2009 10:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/93295

3189 Jonas Oreland	2009-12-09
      bug#49536 - deadlock on rotate_and_purge when using expire_logs_days
        The problem is that the purge_logs implementation in ndb (ndbcluster_binlog_index_purge_file)
          calls mysql_parse (with (thd->options & OPTION_BIN_LOG) == 0),
          but MYSQL_BIN_LOG first takes LOCK_log and only then checks thd->options.
      
        The solution in this patch changes rotate_and_purge so that it does not hold
          LOCK_log when calling purge_logs_before_date. I think this is safe,
          as other "purge" functions are called without holding LOCK_log, e.g. purge_master_logs.
[9 Dec 2009 20:28] Andrew Hutchings
The build fails with the patch when building the embedded mysqld library: HAVE_REPLICATION is not defined there, so bool check_purge is not declared, but it is used a few lines below.
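
A minimal sketch of this kind of build failure and one way to avoid it, assuming the declaration sits inside an #ifdef HAVE_REPLICATION block. The function rotate_model and its parameters are hypothetical and are not taken from the patch; the point is only that the declaration and every use of check_purge have to be guarded by the same macro.

```cpp
// Hypothetical model: returns true if a purge should follow the rotation.
// In builds without HAVE_REPLICATION (such as the embedded library), the
// declaration and all uses of check_purge are compiled out together, so
// the variable is never referenced where it does not exist.
bool rotate_model(bool rotated, long expire_logs_days) {
#ifdef HAVE_REPLICATION
    bool check_purge = false;
#endif
    if (rotated) {
#ifdef HAVE_REPLICATION
        check_purge = true;  // guarded, like the declaration
#endif
    }
#ifdef HAVE_REPLICATION
    // An unguarded use here is what breaks a build without the macro.
    if (check_purge && expire_logs_days > 0) return true;
#else
    (void)expire_logs_days;  // unused when replication is compiled out
#endif
    return false;
}
```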
[14 Dec 2009 8:42] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/93851

3047 Jonas Oreland	2009-12-14
      bug#49536 - deadlock on rotate_and_purge when using expire_logs_days
        The problem is that the purge_logs implementation in ndb 
          (ndbcluster_binlog_index_purge_file) calls mysql_parse
          (with (thd->options & OPTION_BIN_LOG) == 0), but MYSQL_BIN_LOG first 
          takes LOCK_log and only then checks thd->options.
            
        The solution in this patch changes rotate_and_purge so that it does not hold
          LOCK_log when calling purge_logs_before_date.
[14 Dec 2009 8:52] Jonas Oreland
pushed to 6.2.19, 6.3.29 and 7.0.10
(not pushed to mainline yet)
[14 Dec 2009 19:58] Jon Stephens
Documented bugfix in the NDB 6.2.19, 6.3.29, and 7.0.10 changelogs, as follows:

      When expire_logs_days was set, the thread performing the purge of the
      log files could deadlock, causing all binary log operations to stop.

Closed.
[15 Dec 2009 6:49] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/94035

3272 He Zhenxing	2009-12-15
      bug#49536 - deadlock on rotate_and_purge when using expire_logs_days
      The problem is that the purge_logs implementation in ndb (ndbcluster_binlog_index_purge_file)
      calls mysql_parse (with (thd->options & OPTION_BIN_LOG) == 0),
      but MYSQL_BIN_LOG first takes LOCK_log and only then checks thd->options.
      
      The solution in this patch changes rotate_and_purge so that it does not hold
      LOCK_log when calling purge_logs_before_date. I think this is safe,
      as other "purge" functions are called without holding LOCK_log, e.g. purge_master_logs.
[15 Dec 2009 22:36] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/94310

3052 Martin Skold	2009-12-15 [merge]
      Merge
      modified:
        libmysql/libmysql.c
        sql/log.cc
        sql/log_event.cc
[19 Dec 2009 8:26] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091219082307-f3i4fn0tm8trb3c0) (version source revid:alik@sun.com-20091216180721-eoa754i79j4ssd3m) (merge vers: 6.0.14-alpha) (pib:15)
[19 Dec 2009 8:30] Bugs System
Pushed into 5.5.1-m2 (revid:alik@sun.com-20091219082021-f34nq4jytwamozz0) (version source revid:alexey.kopytov@sun.com-20091216134707-o96eqw0u2ynvo9gm) (merge vers: 5.5.0-beta) (pib:15)
[19 Dec 2009 8:33] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20091219082213-nhjjgmphote4ntxj) (version source revid:alik@sun.com-20091216180221-a5ps59gajad3pip9) (pib:15)
[19 Dec 2009 11:46] Jon Stephens
Also noted in the 5.5.1 and 6.0.14 changelogs.

Still waiting for merge to 5.1 tree -> NDI.
[15 Jan 2010 8:58] Bugs System
Pushed into 5.1.43 (revid:joro@sun.com-20100115085139-qkh0i0fpohd9u9p5) (version source revid:zhenxing.he@sun.com-20091215064828-es8ucbh59n91n79a) (merge vers: 5.1.42) (pib:16)
[18 Jan 2010 12:16] Jon Stephens
Also documented in the 5.1.43 changelog. Closed.
[12 Mar 2010 14:05] Bugs System
Pushed into 5.1.44-ndb-7.0.14 (revid:jonas@mysql.com-20100312135944-t0z8s1da2orvl66x) (version source revid:jonas@mysql.com-20100312115609-woou0te4a6s4ae9y) (merge vers: 5.1.44-ndb-7.0.14) (pib:16)
[12 Mar 2010 14:21] Bugs System
Pushed into 5.1.44-ndb-6.2.19 (revid:jonas@mysql.com-20100312134846-tuqhd9w3tv4xgl3d) (version source revid:jonas@mysql.com-20100312060623-mx6407w2vx76h3by) (merge vers: 5.1.44-ndb-6.2.19) (pib:16)
[12 Mar 2010 14:35] Bugs System
Pushed into 5.1.44-ndb-6.3.33 (revid:jonas@mysql.com-20100312135724-xcw8vw2lu3mijrhn) (version source revid:jonas@mysql.com-20100312103652-snkltsd197l7q2yg) (merge vers: 5.1.44-ndb-6.3.33) (pib:16)
[15 Mar 2010 10:52] Jon Stephens
No new changelog entries required. Closed.