Bug #62535 os_event_create() has inefficient memory allocation
Submitted: 25 Sep 2011 14:46 Modified: 30 Nov 2011 19:45
Reporter: Domas Mituzas
Status: Verified
Category: MySQL Server: InnoDB storage engine Severity: S3 (Non-critical)
Version: 5.5, 5.1, etc. OS: Linux
Assigned to: Assigned Account CPU Architecture: Any

[25 Sep 2011 14:46] Domas Mituzas
InnoDB dynamically allocates event structures (struct os_event_struct) for rw-locks and mutexes.

Every buffer pool control block includes two rw-locks and one mutex, so there are three dynamic allocations per buffer pool page.

This has multiple problems:

a) Space inefficiency: a single allocation would have 0.01% malloc storage overhead, whereas multiple allocations have ~17% malloc overhead. This results in ~280MB of waste on a machine with a 64G buffer pool.

b) The CPU cost of allocating 12582912 memory structures (three per 16K page of a 64G buffer pool), and of freeing them on shutdown, is slightly higher than that of allocating 3 or 1.

c) All this malloc overhead ends up being stale, because no reallocations are done during the lifetime of the process, so any sane operating system will swap those pages out. This makes shutdowns much slower, because the allocations have to be paged back in serially.
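The allocation count in (b) can be checked with a little arithmetic. This is only a sketch: the 16K page size is the InnoDB default and is an assumption here, "MB" is read as MiB, and the helper names are hypothetical, not InnoDB functions.

```c
#include <stdint.h>

/* Number of separately allocated event structures for a buffer pool:
   one per rw-lock or mutex in every page's control block.
   (Hypothetical helper, not part of InnoDB.) */
static uint64_t event_allocations(uint64_t pool_bytes, uint64_t page_bytes,
                                  uint64_t events_per_page)
{
    return pool_bytes / page_bytes * events_per_page;
}

/* Average waste per allocation implied by a total waste figure.
   (Hypothetical helper; the total is taken from the report, not measured.) */
static double waste_per_allocation(double total_waste_bytes, uint64_t n_allocs)
{
    return total_waste_bytes / (double) n_allocs;
}
```

event_allocations(64ULL << 30, 16ULL << 10, 3) gives 12582912, the figure used in (b). Spreading ~280MB over that many allocations with waste_per_allocation() works out to roughly 23 bytes of overhead each.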

How to repeat:
Run MySQL with innodb_buffer_pool_size=64G and analyze it with a heap profiler.

Suggested fix:
As os_event_create() is called in every mutex and rw-lock initializer anyway, the event can be part of the rw-lock and mutex structures rather than a dynamically allocated pointer. It could then be allocated as part of one large batch.
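A rough sketch of the idea, not the actual InnoDB code: struct event below is a simplified, hypothetical stand-in for struct os_event_struct, and the lock structures and function names are made up for illustration. With the event embedded, n locks and their events come from one allocation instead of n extra ones.

```c
#include <pthread.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-in for struct os_event_struct. */
struct event {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             is_set;
};

/* Layout described in the report: the lock only holds a pointer,
   so every lock needs a separate malloc() for its event. */
struct lock_with_pointer {
    struct event  *event;
    unsigned long  lock_word;
};

/* Suggested layout: the event lives inside the lock structure. */
struct lock_embedded {
    struct event   event;
    unsigned long  lock_word;
};

static void event_init(struct event *e)
{
    pthread_mutex_init(&e->mutex, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->is_set = 0;
}

static void event_destroy(struct event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->mutex);
}

/* One calloc() covers n locks together with their embedded events. */
static struct lock_embedded *locks_create_batch(size_t n)
{
    struct lock_embedded *locks = calloc(n, sizeof *locks);
    if (locks == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        event_init(&locks[i].event);
    }
    return locks;
}

/* Shutdown walks one contiguous block and does a single free(). */
static void locks_destroy_batch(struct lock_embedded *locks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        event_destroy(&locks[i].event);
    }
    free(locks);
}
```

The per-allocation malloc header is paid once for the whole array, and teardown touches memory sequentially rather than chasing one pointer per lock.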

[25 Sep 2011 14:58] Domas Mituzas
An alternative fix, suggested by an esteemed expert in the MySQL community, is not to call os_event_create() for some of those rw-locks/mutexes at all.