Bug #42101 Race condition in innodb_commit_concurrency
Submitted: 14 Jan 2009 9:27 Modified: 20 Jun 2010 1:02
Reporter: Marko Mäkelä Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.1 OS:Any
Assigned to: Satya B CPU Architecture:Any

[14 Jan 2009 9:27] Marko Mäkelä
Description:
If the global variable innodb_commit_concurrency is set to 0 while innobase_commit() is being executed, some threads may remain stuck in pthread_cond_wait(&commit_cond, &commit_cond_m).

Furthermore, the symbol srv_commit_concurrency could be declared static in ha_innodb.cc, as it is not referenced anywhere else. And the documentation of the variable innodb_commit_concurrency is lacking. It should say something like "The maximum number of concurrent commits inside InnoDB (0 means unlimited)" in addition to the current comment "Helps in performance tuning in heavily concurrent environments."

How to repeat:
Set innodb_commit_concurrency=1, start a commit-intensive workload with many threads, and set innodb_commit_concurrency=0.

Suggested fix:
Assign the result to a local variable in innobase_commit() or write an update callback for innodb_commit_concurrency that will pthread_signal(&commit_cond), or do both.
[14 May 2009 9:09] Marko Mäkelä
Simple fix #1: Remove the option innodb_commit_concurrency, or make it a no-op.

Simple fix #2: Remove the checks for commit_concurrency > 0 in innobase_commit() in ha_innodb.cc. Penalty: innobase_commit() will acquire and release commit_cond_m twice, likely reducing performance.

Complex fix #1: Remove the checks for commit_concurrency > 0, but try replacing the mutex operations with an atomic increment or decrement. This is tricky because of the pthread_cond_signal().

Complex fix #2: Write an innodb_commit_concurrency update callback that will do something special when the value is changed to or from 0, such as update innobase_hton->commit to one of two versions of innobase_commit().
[14 May 2009 9:46] Marko Mäkelä
Simple fix #3: Write an update callback function that prevents updating innodb_commit_concurrency from zero to nonzero or vice versa. Such updates would require a server restart.
[14 May 2009 11:38] Marko Mäkelä
Simple fix #3 is now in review.
[19 May 2009 8:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/74460

2891 Satya B	2009-05-19
      Applying InnoDB snashot 5.1-ss5024,part 3. Fixes BUG#42101
      
      BUG#42101 - Race condition in innodb_commit_concurrency
      
      Detailed revision comments:
      
      r4994 | marko | 2009-05-14 15:04:55 +0300 (Thu, 14 May 2009) | 18 lines
      branches/5.1: Prevent a race condition in innobase_commit() by ensuring
      that innodb_commit_concurrency>0 remains constant at run time. (Bug #42101)
      
      srv_commit_concurrency: Make this a static variable in ha_innodb.cc.
      
      innobase_commit_concurrency_validate(): Check that innodb_commit_concurrency
      is not changed from or to 0 at run time.  This is needed, because
      innobase_commit() assumes that innodb_commit_concurrency>0 remains constant.
      Without this limitation, the checks for innodb_commit_concurrency>0
      in innobase_commit() should be removed and that function would have to
      acquire and release commit_cond_m at least twice per invocation.
      Normally, innodb_commit_concurrency=0, and introducing the mutex operations
      would mean significant overhead.
      
      innodb_bug42101.test, innodb_bug42101-nonzero.test: Test cases.
      
      rb://123 approved by Heikki Tuuri
      added:
        mysql-test/r/innodb_bug42101-nonzero.result
        mysql-test/r/innodb_bug42101.result
        mysql-test/t/innodb_bug42101-nonzero-master.opt
        mysql-test/t/innodb_bug42101-nonzero.test
        mysql-test/t/innodb_bug42101.test
      modified:
        storage/innobase/handler/ha_innodb.cc
        storage/innobase/include/srv0srv.h
        storage/innobase/srv/srv0srv.c
[19 May 2009 8:46] Satya B
Notes to Docs team:
Patch queued to 5.1-bugteam.  NOT yet in 6.0 (5.1-snapshot-ss5024 is null merged
into 6.0). Please return to "Patch approved" after documenting until 6.0
snapshot is available.
[28 May 2009 8:16] Bugs System
Pushed into 5.1.36 (revid:joro@sun.com-20090528073639-yohsb4q1jzg7ycws) (version source revid:satya.bn@sun.com-20090519082028-7fsv4ynezif2io7b) (merge vers: 5.1.36) (pib:6)
[2 Jun 2009 6:18] Marko Mäkelä
The fix prevents SET GLOBAL innodb_commit_concurrency when it would change the "nonzero" status of innodb_commit_concurrency. You can still change the parameter from nonzero to another nonzero value.

Before the fix, changing the parameter from nonzero to zero or from zero to nonzero is likely to trigger race conditions or deadlocks when there are any ongoing InnoDB transactions.
[2 Jun 2009 16:38] Paul DuBois
Noted in 5.1.36 changelog.

There was a race condition when changing innodb_commit_concurrency at
runtime from zero to non-zero or from non-zero to zero. Now this
variable cannot be changed at runtime from zero to non-zero or vice 
versa. The value can still be changed from one non-zero value to
another. 

Setting report to Patch approved pending push into 6.0.x.
[17 Jun 2009 19:23] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090616183122-chjzbaa30qopdra9) (version source revid:satya.bn@sun.com-20090519083733-iksol2m6mcevrsgy) (merge vers: 6.0.12-alpha) (pib:11)
[25 Jun 2009 12:12] Marko Mäkelä
See also the follow-up Bug #45749.
[13 Jul 2009 17:49] Bugs System
Pushed into 5.1.37 (revid:joro@sun.com-20090713174543-cd2x7q1gi1hzoand) (version source revid:staale.smedseng@sun.com-20090710151930-6e6kq5tp7ux1rtbh) (merge vers: 5.1.37) (pib:11)
[29 Jul 2009 21:12] Davi Arnaut
Setting to Documenting at it seems this needs to be documented in 5.4 and closed.
[3 Aug 2009 18:31] Paul DuBois
Noted in 5.4.4 changelog.
[4 Aug 2009 5:31] Satya B
The patch is not really pushed to 5.4. I just did a NULL merge.

I am not sure why this message was generated  at all

[17 Jun 21:23] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090616183122-chjzbaa30qopdra9) (version
source revid:satya.bn@sun.com-20090519083733-iksol2m6mcevrsgy) (merge vers: 6.0.12-alpha)
(pib:11)

This is misleading and is there a way to find out why this was generated and 
the changeset related to this bug?
[4 Aug 2009 19:51] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090804194615-h40sa098mx4z49qg) (version source revid:satya.bn@sun.com-20090710114411-lthb9gmm9l2ui9ig) (merge vers: 5.4.4-alpha) (pib:11)
[12 Aug 2009 22:14] Paul DuBois
Removed 5.4.x changelog entry.
[26 Aug 2009 13:46] Bugs System
Pushed into 5.1.37-ndb-7.0.8 (revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[26 Aug 2009 13:46] Bugs System
Pushed into 5.1.37-ndb-6.3.27 (revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (version source revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (merge vers: 5.1.37-ndb-6.3.27) (pib:11)
[26 Aug 2009 13:48] Bugs System
Pushed into 5.1.37-ndb-6.2.19 (revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (version source revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (merge vers: 5.1.37-ndb-6.2.19) (pib:11)
[27 Aug 2009 16:32] Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:magnus.blaudd@sun.com-20090827163030-6o3kk6r2oua159hr) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[5 May 2010 15:11] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 16:58] Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[28 May 2010 6:01] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:30] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:58] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 23:08] Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[15 Jun 2010 8:13] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100615080459-smuswd9ooeywcxuc) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (merge vers: 5.1.47) (pib:16)
[15 Jun 2010 8:28] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100615080558-cw01bzdqr1bdmmec) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (pib:16)
[17 Jun 2010 12:05] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:49] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:32] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)