Bug #38941 fast mutexes in MySQL 5.1 have mutex contention when calling random()
Submitted: 21 Aug 2008 15:38 Modified: 13 Nov 2008 3:32
Reporter: Mark Callaghan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S5 (Performance)
Version:5.1, 6.0 BZR OS:Any
Assigned to: Davi Arnaut
Tags: contention, mutex, random, with-fast-mutexes
Triage: Triaged: D3 (Medium) / R2 (Low) / E2 (Low)

[21 Aug 2008 15:38] Mark Callaghan
Description:
Fast mutexes in MySQL 5.1 call random() to determine the spin wait time. random() is thread safe and uses one mutex in glibc on Linux. 4 threads each calling random() in a loop on my 8 core server is 40X slower than 4 processes calling it because of mutex contention.

How to repeat:
Read the code:

int my_pthread_fastmutex_lock(my_pthread_fastmutex_t *mp)
{
  int   res;
  uint  i;
  uint  maxdelay= MY_PTHREAD_FASTMUTEX_DELAY;

  for (i= 0; i < mp->spins; i++)
  {
    res= pthread_mutex_trylock(&mp->mutex);

    if (res == 0)
      return 0;

    if (res != EBUSY)
      return res;

    mutex_delay(maxdelay);
    maxdelay += ((double) random() / (double) RAND_MAX) * 
	        MY_PTHREAD_FASTMUTEX_DELAY + 1;
  }

Run my testcase on an SMP server:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

void *runit(void *arg) {
  int loops = (int) arg;
  int x;
  int r=0;
  for (x = 0; x < loops; ++x) {
    r += random();
  }
  return (void*) r;
}

int main(int argc, char **argv) {
  int num = atoi(argv[1]);
  int loops = atoi(argv[2]);
  fprintf(stderr, "num %d, loops %d\n", num, loops);
  int x;
  pthread_t threads[1000];
  for (x = 0; x < num; x++) {
    pthread_create(&threads[x], NULL, runit, (void*) loops);
  }
  for (x = 0; x < num; x++) {
    void *ret;
    pthread_join(threads[x], &ret);
  }
  return 0;
}

Suggested fix:
Replace random() with something that does not get a mutex. Check out ut_rnd_interval(). InnoDB uses this to get a random spin wait time for its mutex.
[28 Aug 2008 9:39] Sveta Smirnova
Thank you for the report.

Verified as described. Machine with 2 CPUs, 1 core each. 4 processes run about 17 times faster than 4 threads.

For comparison used next test program:

#include <stdio.h>
#include <stdlib.h>

void *runit(void *arg) {
  int loops = (int) arg;
  int x;
  int r=0;
  for (x = 0; x < loops; ++x) {
    r += random();
  }
  return (void*) r;
}

int main(int argc, char **argv) {
  int num = atoi(argv[1]);
  int loops = atoi(argv[2]);
  fprintf(stderr, "num %d, loops %d\n", num, loops);
  int x;
  pid_t childpid[1000];
  int status;
  for (x = 0; x < num; x++) {
    childpid[x] = fork();
    if (childpid[x] == 0) {
      runit( (void*) loops);
      exit(0);
    }
  }
  for (x = 0; x < num; x++) {
    wait(&status);
    printf("Fork: %d, status: %d\n", x, status);
  }
  return 0;
}
[29 Sep 2008 19:27] Davi Arnaut
Note: there is a document from Intel titled "Using Spin-Loops on Intel .." that gives some highlights on how to properly spin on locks:

http://www.intel.com/cd/ids/developer/asmo-na/eng/17689.htm
[9 Oct 2008 0:17] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/55844

2770 Davi Arnaut	2008-10-08
      Bug#38941: fast mutexes in MySQL 5.1 have mutex contention when calling random()
      
      The problem is that MySQL's 'fast' mutex implementation uses the
      random() routine to determine the spin delay. Unfortunately, the
      routine interface is not thead-safe and some implementations (eg:
      glibc) might use a internal lock to protect the RNG state, causing
      excessive locking contention if lots of threads are spinning on
      a MySQL's 'fast' mutex.
      
      The solution is to use the quite simple Park–Miller random number
      generator with a initial seed set to 1 as the the previously used
      generator wasn't being seeded -- the initial seed is 1 if srandom()
      is not called.
      
      Futhermore, the 'fast' mutex implementation has several shortcomings
      and provides no measurable performance benefit. Since nowadays most
      platforms offer support for adaptive locks, MySQL's 'fast' mutex is
      going to be removed in MySQL 6.0.
[15 Oct 2008 12:55] Sergei Golubchik
see also wl#4601
[15 Oct 2008 22:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56306

2669 Davi Arnaut	2008-10-15
      Bug#38941: fast mutexes in MySQL 5.1 have mutex contention when calling random()
      
      The problem is that MySQL's 'fast' mutex implementation uses the
      random() routine to determine the spin delay. Unfortunately, the
      routine interface is not thead-safe and some implementations (eg:
      glibc) might use a internal lock to protect the RNG state, causing
      excessive locking contention if lots of threads are spinning on
      a MySQL's 'fast' mutex. The code was also misusing the value
      of the RAND_MAX macro, this macro represents the largest value
      that can be returned from the rand() function, not random().
      
      The solution is to use the quite simple Park-Miller random number
      generator. The initial seed set to 1 because the the previously used
      generator wasn't being seeded -- the initial seed is 1 if srandom()
      is not called.
      
      Futhermore, the 'fast' mutex implementation has several shortcomings
      and provides no measurable performance benefit. Therefore, its use is
      not recommended unless it provides directly measurable results.
[15 Oct 2008 22:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56307

2669 Davi Arnaut	2008-10-15
      Bug#38941: fast mutexes in MySQL 5.1 have mutex contention when calling random()
      
      The problem is that MySQL's 'fast' mutex implementation uses the
      random() routine to determine the spin delay. Unfortunately, the
      routine interface is not thead-safe and some implementations (eg:
      glibc) might use a internal lock to protect the RNG state, causing
      excessive locking contention if lots of threads are spinning on
      a MySQL's 'fast' mutex. The code was also misusing the value
      of the RAND_MAX macro, this macro represents the largest value
      that can be returned from the rand() function, not random().
      
      The solution is to use the quite simple Park-Miller random number
      generator. The initial seed is set to 1 because the previously used
      generator wasn't being seeded -- the initial seed is 1 if srandom()
      is not called.
      
      Futhermore, the 'fast' mutex implementation has several shortcomings
      and provides no measurable performance benefit. Therefore, its use is
      not recommended unless it provides directly measurable results.
[15 Oct 2008 22:23] Davi Arnaut
Queued to 5.1-bugteam
[10 Nov 2008 10:51] Bugs System
Pushed into 6.0.8-alpha  (revid:davi.arnaut@sun.com-20081015222100-xhhmvjcaa25xkcyh) (version source revid:davi.arnaut@sun.com-20081016021316-p7etwjgausmhe08d) (pib:5)
[10 Nov 2008 11:35] Bugs System
Pushed into 5.1.30  (revid:davi.arnaut@sun.com-20081015222100-xhhmvjcaa25xkcyh) (version source revid:davi.arnaut@sun.com-20081015222100-xhhmvjcaa25xkcyh) (pib:5)
[11 Nov 2008 16:06] Paul Dubois
The versions are actually 5.1.31, 6.0.9.
[13 Nov 2008 3:32] Paul Dubois
Noted in 5.1.31, 6.0.9 changelogs.

The fast mutex implementation was subject to excessive lock contention.
[19 Jan 2009 11:24] Bugs System
Pushed into 5.1.31-ndb-6.2.17 (revid:tomas.ulin@sun.com-20090119095303-uwwvxiibtr38djii) (version source revid:tomas.ulin@sun.com-20090108105244-8opp3i85jw0uj5ib) (merge vers: 5.1.31-ndb-6.2.17) (pib:6)
[19 Jan 2009 13:02] Bugs System
Pushed into 5.1.31-ndb-6.3.21 (revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (version source revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (merge vers: 5.1.31-ndb-6.3.21) (pib:6)
[19 Jan 2009 16:08] Bugs System
Pushed into 5.1.31-ndb-6.4.1 (revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (version source revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (merge vers: 5.1.31-ndb-6.4.1) (pib:6)