Bug #75534 | Solve buffer pool mutex contention by splitting it | ||
---|---|---|---|
Submitted: | 16 Jan 2015 19:28 | Modified: | 27 Feb 2017 13:36 |
Reporter: | Laurynas Biveinis (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S5 (Performance) |
Version: | | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
Tags: | buffer pool, contention, innodb, mutex, scalability |
[16 Jan 2015 19:28]
Laurynas Biveinis
[16 Jan 2015 19:33]
Laurynas Biveinis
Bug 75534 patch for 5.7.5 (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: bug75534.patch (application/octet-stream, text), 199.66 KiB.
[16 Jan 2015 19:36]
Laurynas Biveinis
This is the XtraDB buffer pool mutex split patch, included in versions 5.0 to 5.6. This version for 5.7 has been further improved. The patch was originally developed by Yasufumi Kinoshita and later maintained by me.

- Removes the buffer pool mutex. Introduces several new list- and hash-protecting mutexes, and allows access to several variables without any mutex. For those, atomic variables or os_rmb/os_wmb are used where deemed appropriate; volatile is not used. The new mutexes are:
  - LRU_list_mutex for the LRU_list;
  - zip_free mutex for the zip_free arrays;
  - zip_hash mutex for the zip_hash hash and the in_zip_hash flag;
  - free_list_mutex for the free_list and the withdraw list. If desired, a separate withdraw_list_mutex may easily be split off in the future.

  buf_pool->watch[] and all bpage protection have been moved to page_hash.

  The following variables were switched from buffer pool mutex protection to atomic operations and/or os_rmb/os_wmb. The uses of the latter in particular, while I tried to make them correct, might be very debatable:
  - srv_buf_pool_old_size, srv_buf_pool_size, srv_buf_pool_curr_size, srv_buf_pool_base_size
  - buf_pool->buddy_stat[i].used
  - buf_pool->curr_size, n_chunks_new
- Reduces the critical section length, or removes the critical section completely, for buf_block_buf_fix_inc/dec calls.
- Exploits the fact that freed pages must have no pointers to them from the buffer pool or from any thread other than the freeing one, which allows redundant locking to be removed. The same applies to freshly allocated pages before any pointers to them are published. This, however, necessitates removing some of the debug checks that scan buffer pool chunks directly (buf_block_align), as they have no way to freeze such blocks.
- Related to the above, adds more consistency asserts to buf_page_set_state, and some scalability asserts (!mutex_own) too.
- buf_buddy_alloc has been rewritten so that it no longer requires the buffer pool mutex to be held on entry (a mutex it might then release, propagating that fact to the caller so that the caller could decide whether to re-check things). It is now called with mutexes unlocked, and the caller buf_page_init_for_read algorithm has been simplified. All its allocations now happen with mutexes unlocked.
- buf_flush_LRU_list_batch uses mutex_enter_nowait to skip over any currently locked blocks.
- Removes some outdated buf0buf.cc comments.

Bugs fixed fully or partially, besides the current one:
- http://bugs.mysql.com/bug.php?id=64344: fixed buf_page_init_for_read holding mutexes while allocating memory. It should also be easier to fix buf_LRU_free_page now.
- http://bugs.mysql.com/bug.php?id=75503
- http://bugs.mysql.com/bug.php?id=75504
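
For readers unfamiliar with the approach, the two central ideas in the description (one mutex per list instead of a single pool-wide mutex, and a try-lock that skips busy blocks during an LRU scan) can be sketched roughly as follows. This is an illustration only and not code from the patch: Block, BufferPool, take_free_block and flush_lru_batch are hypothetical names, and std::mutex with std::try_to_lock merely stands in for InnoDB's own mutexes and mutex_enter_nowait.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <mutex>

// Hypothetical, heavily simplified stand-ins; not InnoDB's real structures.
struct Block {
    std::mutex mutex;    // per-block mutex
    bool dirty = false;  // true if the block still needs flushing
};

struct BufferPool {
    std::mutex lru_list_mutex;   // protects only `lru`
    std::mutex free_list_mutex;  // protects only `free_list`
    std::list<Block*> lru;
    std::list<Block*> free_list;
};

// Takes a block from the free list. Only free_list_mutex is held, so
// threads working on the LRU list are not blocked by this call.
Block* take_free_block(BufferPool& pool) {
    std::lock_guard<std::mutex> guard(pool.free_list_mutex);
    if (pool.free_list.empty()) {
        return nullptr;
    }
    Block* block = pool.free_list.front();
    pool.free_list.pop_front();
    return block;
}

// Scans the LRU list from its tail and "flushes" up to `max` dirty blocks.
// A block whose mutex is held by another thread is skipped, not waited for.
std::size_t flush_lru_batch(BufferPool& pool, std::size_t max) {
    std::size_t flushed = 0;
    std::lock_guard<std::mutex> guard(pool.lru_list_mutex);
    for (auto it = pool.lru.rbegin();
         it != pool.lru.rend() && flushed < max; ++it) {
        Block* block = *it;
        assert(block != nullptr);
        std::unique_lock<std::mutex> block_lock(block->mutex,
                                                std::try_to_lock);
        if (!block_lock.owns_lock()) {
            continue;  // busy elsewhere: skip over it
        }
        if (block->dirty) {
            block->dirty = false;  // stands in for the real flush
            ++flushed;
        }
    }
    return flushed;
}
```

In this sketch a thread allocating from the free list and a thread running a flush batch never contend on the same mutex, which is the kind of contention the split is meant to address.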
[19 Jan 2015 16:23]
MySQL Verification Team
Fully verified.
[22 Jan 2015 5:32]
Laurynas Biveinis
The patch was produced against a 5.7.5 tree with some other small InnoDB fixes applied, most notably one for bug 71411. It might therefore apply with some fuzz on a clean 5.7.5, but it is orthogonal to those other fixes.
[4 Feb 2015 8:07]
Laurynas Biveinis
Bug 75534 patch for 5.7.5, v2 (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: bug75534-2.patch (application/octet-stream, text), 205.82 KiB.
[4 Feb 2015 8:08]
Laurynas Biveinis
Updated patch for 5.7.5. Passes MTR: regular, ASAN, and Valgrind. Changes from the previous submission:
- Removed a spurious debugging fprintf(stderr).
- Fixed a debug build assertion reporting a lock order violation in buffer pool resize in the case of multiple instances; added a testcase innodb_buffer_pool_resize_multiple_pools. Details at https://bugs.launchpad.net/percona-server/+bug/1414257.
- Removed the "fix" of a non-bug 75503.
- Fixed a typo and a missing dirty page check condition in innodb_buffer_pool_evict_uncompressed; added a testcase innodb_buffer_pool_debug.
- Added an old XtraDB regression testcase (https://bugs.launchpad.net/percona-xtradb/+bug/317074) as innodb_zip/innodb-buffer-pool. It might be of limited value now; nevertheless, it's here for consideration.
- Fixed a Valgrind annotation race condition in buf_LRU_block_free_non_file_page, where a frame would be marked as unallocated after putting the block back on the free list and releasing its mutex. Another thread might have allocated the same block in the meantime and then have its frame declared unallocated, resulting in spurious Valgrind errors. While at it, do not bother marking the frame as undefined right before marking it as unallocated.
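
The ordering issue behind the Valgrind annotation fix can be illustrated with a small sketch. It is not the patch's code: FrameNote, Block, FreeList, release_block and allocate_block are hypothetical names, and the `note` field only stands in for the state a real Valgrind annotation would set. The point it tries to show is that the annotation is applied while the freeing thread is still the only one able to reach the block, and only then is the block published on the free list.

```cpp
#include <cassert>
#include <list>
#include <mutex>

// Hypothetical stand-ins; not InnoDB code.
enum class FrameNote { Allocated, Unallocated };

struct Block {
    // Stands in for the Valgrind annotation state of the block's frame.
    FrameNote note = FrameNote::Allocated;
};

struct FreeList {
    std::mutex mutex;          // protects `blocks`
    std::list<Block*> blocks;
};

// Annotate first, then publish. Once the block is on the free list another
// thread may allocate it at any moment, so annotating after publishing
// could wrongly mark a frame that is already in use again.
void release_block(FreeList& free_list, Block* block) {
    assert(block != nullptr);
    block->note = FrameNote::Unallocated;  // no other thread can see it yet
    std::lock_guard<std::mutex> guard(free_list.mutex);
    free_list.blocks.push_back(block);     // now visible to allocators
}

// Takes a block from the free list and marks its frame as allocated.
Block* allocate_block(FreeList& free_list) {
    std::lock_guard<std::mutex> guard(free_list.mutex);
    if (free_list.blocks.empty()) {
        return nullptr;
    }
    Block* block = free_list.blocks.front();
    free_list.blocks.pop_front();
    block->note = FrameNote::Allocated;
    return block;
}
```

With the two statements in release_block swapped, a concurrent allocate_block could run between them, and the late annotation would then mark a live frame as unallocated, which corresponds to the spurious errors described above.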
[1 Apr 2016 11:02]
Daniel Price
Posted by developer: Fixed as of the upcoming 5.8.0 release, and here's the changelog entry: To address contention that could occur under some workloads, the buffer pool mutex was removed and replaced by several list and hash protecting mutexes. Also, several buffer pool related variables no longer require buffer pool mutex protection. Thanks to Yasufumi Kinoshita and Laurynas Biveinis for the patch.