Description:
Hi!
This report originates from IPA. You can learn more about us here:
- http://www.ipa.go.jp/software/open/forum/index-e.html
- http://www.ipa.go.jp/software/open/forum/north_asia/wg1-e.html
While evaluating the performance of MySQL and PostgreSQL, we observed severe performance degradation in InnoDB when Hyper-Threading is ON.
The benchmark suite we used is available here:
- http://www.ipa.go.jp/software/open/forum/development/download/051115/dbt1-v2.1-MySQL-ODBC-...
This may be a critical weakness for MySQL/InnoDB. In our performance tests, PostgreSQL scaled well with Hyper-Threading ON. The same issue may also affect dual-core CPUs.
Here are the details of this issue.
---
We found severe performance degradation when Hyper-Threading is on
and thread_concurrency=20.
We are using OSDL DBT-1 as the benchmark. In the normal case, with HT OFF,
we get about 200 to 250 BT (bogotransactions per second), but with HT ON
we get only 30 to 50 BT.
HT/OFF DBT-1 EU=1900 BT=258.9
HT/ON DBT-1 EU=1900 BT=55.6
innodb_thread_concurrency=20
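To put the two measurements above side by side, the small helper below computes the relative drop and the slowdown factor. It is only arithmetic on the reported BT values (258.9 and 55.6); the function names are made up for this illustration. With those inputs the result is roughly a 4.7x slowdown, or a loss of about 79% of throughput.

```c
/* Fraction of throughput lost when going from baseline to degraded. */
double throughput_drop(double baseline_bt, double degraded_bt)
{
    return 1.0 - degraded_bt / baseline_bt;
}

/* How many times slower the degraded run is than the baseline. */
double slowdown_factor(double baseline_bt, double degraded_bt)
{
    return baseline_bt / degraded_bt;
}
```

For example, throughput_drop(258.9, 55.6) and slowdown_factor(258.9, 55.6) give the figures for the EU=1900 runs.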
So we profiled the server (using the oprofile tool) and got the following profiling
data. My impression is that something goes wrong in mutex_spin_wait (and ut_delay)
when HT is ON. (A spin-wait loop is too expensive with
Hyper-Threading.)
I added the following code, but it does not help.
$ diff -pu ut0ut.c.orig ut0ut.c
--- ut0ut.c.orig 2005-10-17 10:27:43.000000000 +0900
+++ ut0ut.c 2006-02-28 11:59:16.777840496 +0900
@@ -290,6 +290,13 @@ ut_delay(
         j = 0;
         for (i = 0; i < delay * 50; i++) {
+                /* When executing a spin-wait loop on the Hyper-Threading
+                processor, the processor can suffer a severe performance
+                penalty. The pause instruction provides a hint to the
+                processor. Please refer IA-32 Intel Architecture
+                Software Developers Manual, Vol 3. */
+                __asm__ __volatile__(
+                        "pause; \n");
                 j += i;
         }
What do you think? Are there any hints?
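For reference, the experiment in the patch above can also be written as a standalone function. This is a minimal sketch, not the InnoDB source: spin_delay and cpu_relax are hypothetical names, GCC/Clang inline-assembly syntax is assumed, and the pause hint is only emitted on x86 targets. As noted above, adding pause did not help in our measurements, so this shows the shape of the change rather than a fix.

```c
#include <stdint.h>

/* Emit a spin-wait hint where the target has one; elsewhere fall back to a
   plain compiler barrier. Assumes GCC/Clang inline-assembly syntax. */
static inline void cpu_relax(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("pause" ::: "memory");
#else
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Hypothetical stand-in for ut_delay(): busy-wait for delay * 50 iterations.
   The accumulated counter is returned so the loop has an observable result;
   the volatile asm keeps the compiler from deleting the loop. */
uint64_t spin_delay(uint64_t delay)
{
    uint64_t i;
    uint64_t j = 0;

    for (i = 0; i < delay * 50; i++) {
        cpu_relax();
        j += i;
    }
    return j;
}
```

Calling spin_delay(1) runs 50 iterations and spin_delay(2) runs 100, matching the delay * 50 bound in the patched loop.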
HT is OFF
CPU: P4 / Xeon, speed 2793.26 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
samples   %        image name           app name             symbol name
13159082  8.8445   libc-2.3.4.so        libc-2.3.4.so        memcpy
12565549  8.4456   libpthread-2.3.4.so  libpthread-2.3.4.so  pthread_mutex_trylock
11387363  7.6537   mysqld               mysqld               rec_get_offsets_func
9631916   6.4738   libpthread-2.3.4.so  libpthread-2.3.4.so  pthread_mutex_unlock
8794484   5.9110   mysqld               mysqld               btr_search_guess_on_hash
4949248   3.3265   mysqld               mysqld               row_search_for_mysql
4022481   2.7036   mysqld               mysqld               ut_delay
3754265   2.5233   mysqld               mysqld               cmp_dtuple_rec_with_match
2535190   1.7040   mysqld               mysqld               row_sel_store_mysql_rec
2520957   1.6944   mysqld               mysqld               btr_cur_search_to_nth_level
HT is ON
CPU: P4 / Xeon with 2 hyper-threads, speed 2793.26 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
samples   %        image name           app name             symbol name
53221317  21.4225  libpthread-2.3.4.so  libpthread-2.3.4.so  pthread_mutex_lock
25743323  10.3621  mysqld               mysqld               ut_delay
12345146  4.9691   vmlinux              vmlinux              do_futex
12066038  4.8568   mysqld               mysqld               mutex_spin_wait
10395391  4.1843   vmlinux              vmlinux              LKST_ETYPE_PROCESS_SCHED_ENTER_HEADER_hook
9247281   3.7222   libpthread-2.3.4.so  libpthread-2.3.4.so  pthread_mutex_unlock
7407229   2.9815   vmlinux              vmlinux              futex_requeue
5921454   2.3835   libpthread-2.3.4.so  libpthread-2.3.4.so  pthread_mutex_trylock
5484279   2.2075   vmlinux              vmlinux              LKST_ETYPE_PROCESS_WAKEUP_HEADER_hook
4846067   1.9506   vmlinux              vmlinux              __switch_to
How to repeat:
Run DBT-1, or any benchmark program that exercises the function "ut_delay" in innobase/ut/ut0ut.c.
You can get DBT-1 from here:
http://www.ipa.go.jp/software/open/forum/development/download/051115/dbt1-v2.1-MySQL-ODBC-...
Suggested fix:
The function ut_delay may need to be revised.
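One possible direction, offered only as a sketch, is to bound the spinning and hand control back to the caller so that it can block or yield instead of burning cycles that the sibling hyper-thread could use. The code below is not InnoDB's mutex_spin_wait; demo_lock_t, demo_lock_try_spin and demo_lock_release are hypothetical names, the lock is a plain C11 atomic flag, and GCC/Clang inline assembly is assumed for the pause hint. Whether this actually helps in the HT ON case has not been measured.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration only. */
typedef struct {
    atomic_bool held;
} demo_lock_t;

/* Spin-wait hint on x86, compiler barrier elsewhere (GCC/Clang syntax). */
static inline void cpu_relax(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("pause" ::: "memory");
#else
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Try to take the lock, retrying at most max_rounds times after the first
   attempt. Returns true if the lock was acquired. On false, the caller is
   expected to block or yield rather than keep spinning. */
bool demo_lock_try_spin(demo_lock_t *l, unsigned max_rounds)
{
    unsigned round;

    for (round = 0; round <= max_rounds; round++) {
        /* Read before writing so waiters do not keep dirtying the cache
           line while the lock is held. */
        if (!atomic_load_explicit(&l->held, memory_order_relaxed) &&
            !atomic_exchange_explicit(&l->held, true, memory_order_acquire)) {
            return true;
        }
        cpu_relax();
    }
    return false;
}

/* Release the lock taken by demo_lock_try_spin(). */
void demo_lock_release(demo_lock_t *l)
{
    atomic_store_explicit(&l->held, false, memory_order_release);
}
```

A caller would initialize held to false, call demo_lock_try_spin() with a small round limit, and fall back to a blocking wait when it returns false.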