Bug #27501 5.0 significantly more sys ("kernel") time than 4.1 due to getrusage() calls
Submitted: 28 Mar 2007 15:33 Modified: 25 Aug 2007 16:15
Reporter: Nils Goroll Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: General Severity:S3 (Non-critical)
Version:5.0.37 OS:Solaris (Solaris 9)
Assigned to: Chad MILLER CPU Architecture:Any
Tags: getrusage, performance, profiling

[28 Mar 2007 15:33] Nils Goroll
Description:
We have migrated a database with quite a high throughput
(6000..12000 questions/sec) from mySQL 4.1 to MySQL 5
on a Sun Fire V480 16GB / 4x US-III 1.2GHz).

With MySQL 5 running, we observed significantly higher system (sys)
time usage ("kernel time") than with MySQL 4.1, so the system was
over-saturated at roughly one third (1/3) of its usual load.

We tracked the issue down to a high amount of getrusage() calls.
here's a truss -c snapshot:

syscall               seconds   calls  errors
read                    3.198    6850    402
write                   2.395    3755
open                     .022      41
close                    .247     239
time                    3.983    9773
lseek                    .033      67     17
getpid                   .128     290
alarm                    .018      31
access                  1.170    1599      8
fdsync                   .012      10
fcntl                    .311    1102
lwp_park                3.723    9769
lwp_unpark              4.075    9494
poll                    1.476    1121
readlink                 .000       8      8
sigprocmask             1.000    3552
sigtimedwait             .021      31
yield                    .115      79
lwp_kill                 .016      31
lwp_mutex_wakeup         .000       1
lwp_mutex_lock           .000       1
pread                    .014      20
pwrite                   .021      18
getrusage              32.029    9588		<<<===
resolvepath              .004      16      8
accept                   .070      50
shutdown                 .173     215
getpeername              .015      72
getsockname              .016      50
setsockopt               .116     171
                     --------  ------   ----
sys totals:            54.413   58044    443
usr time:               2.240
elapsed:               35.500

Though this issue could also be Solaris 9 spefic, we'd guess that
this issue could be related to the new profiling features.

Profiling is off:

mysql> show variables like "prof%";
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| profiling              | OFF   |
| profiling_history_size | 15    |
+------------------------+-------+
2 rows in set (0.01 sec)

We have worked around this issue by LD_PRELOADing a very simple
wrapper which simply avoids the getrusage syscall
(int getrusage(...) { return 0; } and resource consumption was
immediately back to what we know from MySQL4.1.

Our suggestion / RFE is to eleminate getrusage() calls wherever
possible if profiling is off.

How to repeat:
compare mysql 4.1 performance with 5.0 on a machine with a high
amount of questions/sec (approx 10.000 qps)

Suggested fix:
Eleminate getrusage() calls wherever possible if profiling is off.
[28 Mar 2007 15:47] Nils Goroll
Edited Synopsis: "slower" was not an adequate description of the issue
[29 Mar 2007 1:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/23226

ChangeSet@1.2423, 2007-03-28 21:21:34-04:00, cmiller@zippy.cornsilk.net +2 -0
  Bug#27501: 5.0 significantly more sys ("kernel") time than 4.1 \
  	due to getrusage() calls
  
  Even if profiling is turned off, the parser makes calls to reset 
  the state at the beginning of each query.  That would eventually 
  instantiate a PROFILE_ENTRY, which does indeed capture resource 
  usage.
  
  Instead, now check that profiling is active before progressing
  far into the storage/expiration of old entries in the history.
  This has the pleasant side-effect that queries to toggle profiling
  are not recorded in the history.
[29 Mar 2007 7:31] Nils Goroll
I am seriously impressed by the speed this bug report has been handled.
We will wait for the next version and try it then.

THANK YOU !
[9 Aug 2007 11:44] Chad MILLER
Oops.  This was pushed to a 5.0-community release long ago.  

Nothing to document.
[9 Aug 2007 12:32] Chad MILLER
That was sent to the release that was cloned around 2007-04-03, which I calculate must have been v5.0.41.
[25 Aug 2007 16:15] Paul DuBois
Noted in 5.0.41 changelog.

Profiling overhead was incurred even with profiling disabled.