Bug #42101 Race condition in innodb_commit_concurrency
Submitted: 14 Jan 10:27 Modified: 3 Aug 20:31
Reporter: Marko Mäkelä
Status: Closed
Category:Server: InnoDB Severity:S3 (Non-critical)
Version:5.1 OS:Any
Assigned to: Satya B Target Version:5.1+
Triage: Triaged: D2 (Serious) / R1 (None/Negligible) / E2 (Low)

[14 Jan 10:27] Marko Mäkelä
Description:
If the global variable innodb_commit_concurrency is set to 0 while innobase_commit() is
being executed, some threads may remain stuck in pthread_cond_wait(&commit_cond,
&commit_cond_m).

Furthermore, the symbol srv_commit_concurrency could be declared static in ha_innodb.cc,
as it is not referenced anywhere else. And the documentation of the variable
innodb_commit_concurrency is lacking. It should say something like "The maximum number of
concurrent commits inside InnoDB (0 means unlimited)" in addition to the current comment
"Helps in performance tuning in heavily concurrent environments."

How to repeat:
Set innodb_commit_concurrency=1, start a commit-intensive workload with many threads, and
set innodb_commit_concurrency=0.

Suggested fix:
Assign the result to a local variable in innobase_commit() or write an update callback
for innodb_commit_concurrency that will pthread_signal(&commit_cond), or do both.
[14 May 11:09] Marko Mäkelä
Simple fix #1: Remove the option innodb_commit_concurrency, or make it a no-op.

Simple fix #2: Remove the checks for commit_concurrency > 0 in innobase_commit() in
ha_innodb.cc. Penalty: innobase_commit() will acquire and release commit_cond_m twice,
likely reducing performance.

Complex fix #1: Remove the checks for commit_concurrency > 0, but try replacing the mutex
operations with an atomic increment or decrement. This is tricky because of the
pthread_cond_signal().

Complex fix #2: Write an innodb_commit_concurrency update callback that will do something
special when the value is changed to or from 0, such as update innobase_hton->commit to
one of two versions of innobase_commit().
[14 May 11:46] Marko Mäkelä
Simple fix #3: Write an update callback function that prevents updating
innodb_commit_concurrency from zero to nonzero or vice versa. Such updates would require
a server restart.
[14 May 13:38] Marko Mäkelä
Simple fix #3 is now in review.
[19 May 10:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/74460

2891 Satya B	2009-05-19
      Applying InnoDB snashot 5.1-ss5024,part 3. Fixes BUG#42101
      
      BUG#42101 - Race condition in innodb_commit_concurrency
      
      Detailed revision comments:
      
      r4994 | marko | 2009-05-14 15:04:55 +0300 (Thu, 14 May 2009) | 18 lines
      branches/5.1: Prevent a race condition in innobase_commit() by ensuring
      that innodb_commit_concurrency>0 remains constant at run time. (Bug #42101)
      
      srv_commit_concurrency: Make this a static variable in ha_innodb.cc.
      
      innobase_commit_concurrency_validate(): Check that innodb_commit_concurrency
      is not changed from or to 0 at run time.  This is needed, because
      innobase_commit() assumes that innodb_commit_concurrency>0 remains constant.
      Without this limitation, the checks for innodb_commit_concurrency>0
      in innobase_commit() should be removed and that function would have to
      acquire and release commit_cond_m at least twice per invocation.
      Normally, innodb_commit_concurrency=0, and introducing the mutex operations
      would mean significant overhead.
      
      innodb_bug42101.test, innodb_bug42101-nonzero.test: Test cases.
      
      rb://123 approved by Heikki Tuuri
      added:
        mysql-test/r/innodb_bug42101-nonzero.result
        mysql-test/r/innodb_bug42101.result
        mysql-test/t/innodb_bug42101-nonzero-master.opt
        mysql-test/t/innodb_bug42101-nonzero.test
        mysql-test/t/innodb_bug42101.test
      modified:
        storage/innobase/handler/ha_innodb.cc
        storage/innobase/include/srv0srv.h
        storage/innobase/srv/srv0srv.c
[19 May 10:46] Satya B
Notes to Docs team:
Patch queued to 5.1-bugteam.  NOT yet in 6.0 (5.1-snapshot-ss5024 is null merged
into 6.0). Please return to "Patch approved" after documenting until 6.0
snapshot is available.
[28 May 10:16] Bugs System
Pushed into 5.1.36 (revid:joro@sun.com-20090528073639-yohsb4q1jzg7ycws) (version source
revid:satya.bn@sun.com-20090519082028-7fsv4ynezif2io7b) (merge vers: 5.1.36) (pib:6)
[2 Jun 8:18] Marko Mäkelä
The fix prevents SET GLOBAL innodb_commit_concurrency when it would change the "nonzero"
status of innodb_commit_concurrency. You can still change the parameter from nonzero to
another nonzero value.

Before the fix, changing the parameter from nonzero to zero or from zero to nonzero is
likely to trigger race conditions or deadlocks when there are any ongoing InnoDB
transactions.
[2 Jun 18:38] Paul DuBois
Noted in 5.1.36 changelog.

There was a race condition when changing innodb_commit_concurrency at
runtime from zero to non-zero or from non-zero to zero. Now this
variable cannot be changed at runtime from zero to non-zero or vice 
versa. The value can still be changed from one non-zero value to
another. 

Setting report to Patch approved pending push into 6.0.x.
[17 Jun 21:23] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090616183122-chjzbaa30qopdra9) (version
source revid:satya.bn@sun.com-20090519083733-iksol2m6mcevrsgy) (merge vers: 6.0.12-alpha)
(pib:11)
[25 Jun 14:12] Marko Mäkelä
See also the follow-up Bug #45749.
[13 Jul 19:49] Bugs System
Pushed into 5.1.37 (revid:joro@sun.com-20090713174543-cd2x7q1gi1hzoand) (version source
revid:staale.smedseng@sun.com-20090710151930-6e6kq5tp7ux1rtbh) (merge vers: 5.1.37)
(pib:11)
[29 Jul 23:12] Davi Arnaut
Setting to Documenting at it seems this needs to be documented in 5.4 and closed.
[3 Aug 20:31] Paul DuBois
Noted in 5.4.4 changelog.
[4 Aug 7:31] Satya B
The patch is not really pushed to 5.4. I just did a NULL merge.

I am not sure why this message was generated  at all

[17 Jun 21:23] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090616183122-chjzbaa30qopdra9) (version
source revid:satya.bn@sun.com-20090519083733-iksol2m6mcevrsgy) (merge vers:
6.0.12-alpha)
(pib:11)

This is misleading and is there a way to find out why this was generated and 
the changeset related to this bug?
[4 Aug 21:51] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090804194615-h40sa098mx4z49qg) (version
source revid:satya.bn@sun.com-20090710114411-lthb9gmm9l2ui9ig) (merge vers: 5.4.4-alpha)
(pib:11)
[13 Aug 0:14] Paul DuBois
Removed 5.4.x changelog entry.
[26 Aug 15:46] Bugs System
Pushed into 5.1.37-ndb-7.0.8 (revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l)
(version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers:
5.1.37-ndb-7.0.8) (pib:11)
[26 Aug 15:46] Bugs System
Pushed into 5.1.37-ndb-6.3.27 (revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc)
(version source revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (merge vers:
5.1.37-ndb-6.3.27) (pib:11)
[26 Aug 15:48] Bugs System
Pushed into 5.1.37-ndb-6.2.19 (revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4)
(version source revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (merge vers:
5.1.37-ndb-6.2.19) (pib:11)
[27 Aug 18:32] Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:magnus.blaudd@sun.com-20090827163030-6o3kk6r2oua159hr)
(version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers:
5.1.37-ndb-7.0.8) (pib:11)