Bug #18054 Severe performance degradation when Hyperthreading is on
Submitted: 8 Mar 2006 2:04 Modified: 16 Jan 2007 16:08
Reporter: Tetsuro Ikeda
Status: Closed
Category: MySQL Server: InnoDB storage engine  Severity: S1 (Critical)
Version: 5.0.18  OS: Miracle Linux ver4 (kernel 2.6)
Assigned to: Heikki Tuuri  CPU Architecture: Any

[8 Mar 2006 2:04] Tetsuro Ikeda
Description:
Hi !

This report originally comes from IPA. You can learn more about us here.
- http://www.ipa.go.jp/software/open/forum/index-e.html
- http://www.ipa.go.jp/software/open/forum/north_asia/wg1-e.html

While evaluating the performance of MySQL and PostgreSQL, we encountered severe performance degradation in InnoDB when Hyperthreading is ON.

The benchmark suite we used is here.
- http://www.ipa.go.jp/software/open/forum/development/download/051115/dbt1-v2.1-MySQL-ODBC-...

This may be a critical weakness for MySQL/InnoDB. In our performance tests, PostgreSQL scaled well when Hyperthreading is ON. The same issue may also affect dual-core CPUs.

Here are the details of this issue.
---

We found severe performance degradation when Hyperthreading is ON
and innodb_thread_concurrency=20.

We are using OSDL DBT-1 as the benchmark. In the normal case, with
HT OFF, we get about 200 to 250 BT (bogotransactions per second),
but only 30 to 50 BT with HT ON.

HT/OFF DBT-1 EU=1900 BT=258.9
HT/ON  DBT-1 EU=1900 BT=55.6

innodb_thread_concurrency=20

So we profiled the server (using the oprofile tool) and got the
profiling data shown below. My impression is that something goes wrong
in mutex_spin_wait (and ut_delay) when HT is ON. (A spin-wait loop is
too expensive under Hyperthreading.)

I added the following code, but it did not help.

$ diff -pu ut0ut.c.orig ut0ut.c
--- ut0ut.c.orig        2005-10-17 10:27:43.000000000 +0900
+++ ut0ut.c     2006-02-28 11:59:16.777840496 +0900
@@ -290,6 +290,13 @@ ut_delay(
        j = 0;

        for (i = 0; i < delay * 50; i++) {
+               /* When executing a spin-wait loop on the Hyper-Threading
+                  processor, the processor can suffer a severe performance
+                   penalty. The pause instruction provides a hint to the
+                   processor. Please refer IA-32 Intel Architecture
+                   Software Developers Manual, Vol 3.                   */
+               __asm__ __volatile__(
+               "pause; \n");
                j += i;
        }

What do you think? Are there any hints?
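
For anyone who wants to try the idea outside the server, the patched loop can be approximated as a small standalone C function. This is only a sketch under assumptions: spin_delay and SPIN_PAUSE are made-up names, the body mirrors the loop shown in the diff rather than the full ut_delay source, and the pause instruction is emitted only when building with GCC-style inline assembly on x86.

```c
/* Hypothetical macro: issue the x86 "pause" hint where available,
   otherwise fall back to a compiler barrier or a no-op. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define SPIN_PAUSE() __asm__ __volatile__("pause" ::: "memory")
#elif defined(__GNUC__)
#define SPIN_PAUSE() __asm__ __volatile__("" ::: "memory")
#else
#define SPIN_PAUSE() ((void) 0)
#endif

/* Hypothetical stand-in for the loop in the diff: runs delay * 50
   iterations, issuing one pause hint per iteration. The accumulated
   sum is returned so the loop has an observable result. */
unsigned long spin_delay(unsigned long delay)
{
    unsigned long i;
    unsigned long j = 0;

    for (i = 0; i < delay * 50; i++) {
        SPIN_PAUSE();
        j += i;
    }
    return j;
}
```

Timing spin_delay with HT ON and HT OFF, with and without the pause hint, would show whether the hint alone changes the cost of the loop on a given machine.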

HT is OFF
CPU: P4 / Xeon, speed 2793.26 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not
stopped) with a unit mask of 0x01 (mandatory)
count 100000
samples   %        image name           app name             symbol name
13159082  8.8445   libc-2.3.4.so        libc-2.3.4.so        memcpy
12565549  8.4456   libpthread-2.3.4.so  libpthread-2.3.4.so  pthread_mutex_trylock
11387363  7.6537   mysqld               mysqld               rec_get_offsets_func
9631916   6.4738   libpthread-2.3.4.so  libpthread-2.3.4.so  pthread_mutex_unlock
8794484   5.9110   mysqld               mysqld               btr_search_guess_on_hash
4949248   3.3265   mysqld               mysqld               row_search_for_mysql
4022481   2.7036   mysqld               mysqld               ut_delay
3754265   2.5233   mysqld               mysqld               cmp_dtuple_rec_with_match
2535190   1.7040   mysqld               mysqld               row_sel_store_mysql_rec
2520957   1.6944   mysqld               mysqld               btr_cur_search_to_nth_level

HT is ON
CPU: P4 / Xeon with 2 hyper-threads, speed 2793.26 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not
stopped) with a unit mask of 0x01 (mandatory)
count 100000
samples   %        image name           app name             symbol name
53221317  21.4225  libpthread-2.3.4.so  libpthread-2.3.4.so  pthread_mutex_lock
25743323  10.3621  mysqld               mysqld               ut_delay
12345146  4.9691   vmlinux              vmlinux              do_futex
12066038  4.8568   mysqld               mysqld               mutex_spin_wait
10395391  4.1843   vmlinux              vmlinux              LKST_ETYPE_PROCESS_SCHED_ENTER_HEADER_hook
9247281   3.7222   libpthread-2.3.4.so  libpthread-2.3.4.so  pthread_mutex_unlock
7407229   2.9815   vmlinux              vmlinux              futex_requeue
5921454   2.3835   libpthread-2.3.4.so  libpthread-2.3.4.so  pthread_mutex_trylock
5484279   2.2075   vmlinux              vmlinux              LKST_ETYPE_PROCESS_WAKEUP_HEADER_hook
4846067   1.9506   vmlinux              vmlinux              __switch_to

How to repeat:
Run DBT-1, or any benchmark program that exercises the function "ut_delay" in innobase/ut/ut0ut.c.

You can get DBT-1 from here.
http://www.ipa.go.jp/software/open/forum/development/download/051115/dbt1-v2.1-MySQL-ODBC-...

Suggested fix:
The function ut_delay may need to be improved.
[8 Mar 2006 4:09] Vadim Tkachenko
Tetsuro,

Could you try with HT/ON
and innodb_sync_spin_loops=0 or 1 or 2?

Thanks,
Vadim.
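
For readers unfamiliar with the setting: innodb_sync_spin_loops limits how many times a thread spins on a mutex before it is suspended. The general spin-then-give-up pattern can be sketched in C as below. This is an illustration under assumptions, not InnoDB's mutex_spin_wait: demo_lock, demo_lock_acquire and demo_lock_release are hypothetical names, and sched_yield() stands in for suspending the thread.

```c
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical lock type used only for this illustration. */
typedef struct {
    atomic_flag held;
} demo_lock;

/* Try to take the lock. The first spin_loops failed attempts are
   busy-wait rounds; after that, every failed attempt gives up the CPU
   instead of spinning. Returns the number of busy-wait rounds spent. */
unsigned demo_lock_acquire(demo_lock *l, unsigned spin_loops)
{
    unsigned spins = 0;

    while (atomic_flag_test_and_set_explicit(&l->held,
                                             memory_order_acquire)) {
        if (spins < spin_loops) {
            spins++;        /* busy-wait round */
        } else {
            sched_yield();  /* stand-in for suspending the thread */
        }
    }
    return spins;
}

/* Release the lock so another thread can take it. */
void demo_lock_release(demo_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

With spin_loops set to 0, a contended thread in this sketch never busy-waits, which is the kind of behavior the small values suggested above are meant to test.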
[8 Mar 2006 10:35] Heikki Tuuri
Tetsuro,

this is probably a duplicate of http://bugs.mysql.com/bug.php?id=15815

MySQL-5.1.8 should contain Osku's improvements to sync0arr.c, and the performance problem would be less severe there.

Regards,

Heikki
[8 Apr 2006 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[16 Jan 2007 16:08] Heikki Tuuri
This may be fixed in 5.0.30. See bugs #15815 and #22868.