Bug #81376: Single page flushing should be removed
Submitted: 11 May 2016 6:48    Modified: 14 May 2025 12:56
Reporter: Laurynas Biveinis (OCA)
Status: Verified
Category: MySQL Server: InnoDB storage engine    Severity: S5 (Performance)
Version: 5.7    OS: Any
Assigned to:    CPU Architecture: Any
Tags: flushing, innodb, performance

[11 May 2016 6:48] Laurynas Biveinis
Description:
If the doublewrite buffer is enabled, there are only eight slots for single-page flushes. When there are no free pages, the doublewrite buffer immediately becomes the bottleneck on disk-bound OLTP RW workloads, due to the low number of slots and the general design:

660 pthread_cond_wait,enter(ib0mutex.h:850),buf_dblwr_write_single_page(ib0mutex.h:850),buf_flush_write_block_low(buf0flu.cc:1096),buf_flush_page(buf0flu.cc:1096),buf_flush_single_page_from_LRU(buf0flu.cc:2217),buf_LRU_get_free_block(buf0lru.cc:1401),...
 
631 pthread_cond_wait,buf_dblwr_write_single_page(buf0dblwr.cc:1213),buf_flush_write_block_low(buf0flu.cc:10 ...
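
For illustration, here is a minimal self-contained model of this bottleneck - a fixed pool of eight slots guarded by one mutex and condition variable, which is roughly what the single-page-flush path through the doublewrite buffer amounts to. This is hypothetical sketch code, not the actual InnoDB implementation:

#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

constexpr int kSlots = 8;  // single-page-flush slots in the doublewrite buffer

std::mutex mtx;
std::condition_variable cv;
int free_slots = kSlots;

void single_page_flush() {
  std::unique_lock<std::mutex> lk(mtx);
  // With hundreds of user threads out of free pages, most of them park
  // here - the pthread_cond_wait seen in the stack traces above.
  cv.wait(lk, [] { return free_slots > 0; });
  --free_slots;
  lk.unlock();

  // Simulate the synchronous page write done while holding a slot.
  std::this_thread::sleep_for(std::chrono::milliseconds(5));

  lk.lock();
  ++free_slots;
  cv.notify_one();
}

int main() {
  std::vector<std::thread> user_threads;
  for (int i = 0; i < 600; ++i)  // ~600 waiters, as in the traces above
    user_threads.emplace_back(single_page_flush);
  for (auto &t : user_threads) t.join();
  std::printf("all flushes done; throughput was capped by %d slots\n", kSlots);
}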

If the doublewrite buffer is disabled, the single-page flushes show that the cleaner threads are not coping with the workload. If the reason for that is hot buffer pool mutexes, a single-page flush only makes things worse, since it must take the same mutexes for its LRU list scan. Otherwise it is an indication of suboptimal cleaner tuning or heuristics, or perhaps I/O is already the bottleneck and issuing more of it will not help.

How to repeat:
Performance and code analysis. The above reasoning is explained in a bit more detail at

https://www.percona.com/blog/2016/05/03/mysql-5-7-initial-flushing-analysis-and-why-perfor...
https://www.percona.com/blog/2016/05/05/percona-server-5-7-multi-threaded-lru-flushing/
https://www.percona.com/blog/2016/05/09/percona-server-5-7-parallel-doublewrite/

Suggested fix:
- remove single-page flushing and any associated code
- when the buffer pool free list is empty, have the user thread wait for the cleaner threads to produce a free page (a sketch of this follows the list)
- improve the cleaner thread design so that the empty-free-list situation only arises when it is unavoidable (i.e. when flushing is already saturating I/O)
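
To make the second point concrete, here is a rough sketch of a user thread waiting on the free list instead of flushing a page itself. All names and structure here are hypothetical, not actual InnoDB code:

#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

struct Page {};

struct BufferPoolInstance {
  std::mutex free_list_mutex;
  std::condition_variable free_list_nonempty;
  std::deque<Page *> free_list;

  // What a user thread would call instead of falling back to
  // buf_flush_single_page_from_LRU(): no flushing here, just wait for
  // a page cleaner to deliver a free page.
  Page *get_free_page() {
    std::unique_lock<std::mutex> lk(free_list_mutex);
    free_list_nonempty.wait(lk, [this] { return !free_list.empty(); });
    Page *page = free_list.front();
    free_list.pop_front();
    return page;
  }

  // Called by a page cleaner after an LRU flush batch completes.
  void add_free_pages(std::deque<Page *> &&pages) {
    {
      std::lock_guard<std::mutex> lk(free_list_mutex);
      for (Page *p : pages) free_list.push_back(p);
    }
    free_list_nonempty.notify_all();
  }
};

int main() {
  BufferPoolInstance bp;
  std::thread cleaner([&bp] {
    std::deque<Page *> batch;
    for (int i = 0; i < 4; ++i) batch.push_back(new Page);
    bp.add_free_pages(std::move(batch));
  });
  for (int i = 0; i < 4; ++i) delete bp.get_free_page();  // blocks as needed
  cleaner.join();
}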
[11 May 2016 14:55] MySQL Verification Team
Hi Laurynas,

I have some additional questions that were not covered in the articles you posted.

What was the hardware that you were using? 24 cores? 64 cores? 128 cores?

What was the mysqld configuration?

Were there any additional plugin settings being used?

What other information might help in reproducing the bug?

Maybe you can update the bug with this additional information.

Thanks in advance.
[11 May 2016 18:36] Mark Callaghan
I prefer this remain. That makes it easier to show how much faster MyRocks is compared to InnoDB for write-heavy workloads.
[12 May 2016 5:49] Laurynas Biveinis
The configuration is the same as in *, and you can find the server settings there:

innodb_log_file_size=10G
innodb_doublewrite=1
innodb_flush_log_at_trx_commit=1
innodb_buffer_pool_instances=8
innodb_change_buffering=none
innodb_adaptive_hash_index=OFF
innodb_flush_method=O_DIRECT
innodb_flush_neighbors=0
innodb_read_io_threads=8
innodb_write_io_threads=8
innodb_lru_scan_depth=8192
innodb_io_capacity=15000
innodb_io_capacity_max=25000
loose-innodb-page-cleaners=4
table_open_cache_instances=64
table_open_cache=5000
loose-innodb-log_checksum-algorithm=crc32
loose-innodb-checksum-algorithm=strict_crc32
max_connections=50000
skip_name_resolve=ON
loose-performance_schema=ON
loose-performance-schema-instrument='wait/synch/%=ON',

I think it was 48 cores, but I am not sure; I will ask Alexey S to confirm.
[12 May 2016 5:50] Laurynas Biveinis
*as in https://www.percona.com/blog/2016/03/17/percona-server-5-7-performance-improvements/
[12 May 2016 6:12] Alexey Stroganov
Re: reproducibility - the workload is IO-bound sysbench OLTP_RW, and the symptoms are clearly visible on any type of hardware (16 cores+HT, 28 cores+HT, etc.). There is a metric for single-page flushing, so it is very easy to see when it starts to affect InnoDB performance.
[12 May 2016 14:20] MySQL Verification Team
MarkC,

Thank you so much for your feedback!!!

I had never heard of MyRocks, but your words inspired me to push on this subject.

We shall be hearing more on this subject yet ...
[12 May 2016 14:21] MySQL Verification Team
Laurynas,

Thank you for your feedback.

Verified.
[16 Jul 2018 8:12] Roel Van de Paar
https://www.percona.com/blog/2018/07/13/on-mysql-and-intel-optane-performance/
[14 May 2025 8:36] Jakub Lopuszanski
Posted by developer:
 
Hello from 2025! This is an interesting proposal. I've tried a dirty hack to "just remove it":
origin/mysql-8.4-8.4.6-jlopusza-remove_single_page_flushing-no-hook
Dimitri has tried it, and it deadlocked after a while:

1. A thread is waiting for a free page (actually all of them are waiting for a free page), but it is in the middle of an mtr and so holds a latch on some page(s). It waits for this free page to be delivered by the page cleaners.
2. The page cleaner coordinator waits for the previous batch to finish, i.e. for the last lagging page cleaner, before starting the next batch (which always starts with LRU flushing).
3. But one of the page cleaners can't make progress, because inside FLUSH_LIST flushing it waits for an sx-latch on a page held by ... one of the user threads waiting for a free page.

An obvious solution is to have dedicated threads for LRU flushing, which do not have to wait for anyone.
There could be one such thread for each BP instance, constantly trying to ensure that the free list for that BP instance has the right length.
(It might want to wait a bit, to batch together more writes in one go if we use doublewrite buffering, so that they all get written out together as a single doublewrite batch.)
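
A rough sketch of that design (hypothetical code, nothing here is actual server source):

#include <atomic>
#include <chrono>
#include <cstddef>
#include <thread>

struct BufferPoolInstance {
  std::atomic<bool> shutting_down{false};
  std::atomic<std::size_t> free_pages{0};

  // Stand-in for "flush up to n pages from the LRU tail and move them to
  // the free list"; the real work would be page writes (batched through
  // the doublewrite buffer) followed by waking blocked user threads.
  void lru_flush_and_free(std::size_t n) { free_pages += n; }
};

// One such thread per buffer pool instance. It does LRU flushing only,
// so it never joins a flush-list batch and can never end up waiting on
// a page latch held by a user thread that is waiting for a free page,
// which breaks the wait cycle described above.
void lru_flusher_thread(BufferPoolInstance *bp, std::size_t target_free_len) {
  while (!bp->shutting_down.load()) {
    std::size_t cur = bp->free_pages.load();
    if (cur < target_free_len)
      bp->lru_flush_and_free(target_free_len - cur);  // refill the shortfall
    else
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
}

int main() {
  BufferPoolInstance bp;
  std::thread flusher(lru_flusher_thread, &bp, 1024);
  std::this_thread::sleep_for(std::chrono::milliseconds(10));
  bp.shutting_down = true;
  flusher.join();
}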
[14 May 2025 12:56] Laurynas Biveinis
Dedicated LRU flusher threads are indeed what we did in Percona Server 5.7 - it also simplifies the thread heuristics, because there is no need to consider both flush list and LRU list flushing needs (which might conflict) in a single thread.

I believe we had the option to disable the single page flushing in 5.6 already, but I can't recall whether we had dedicated LRU flushers back then, or whether we had to address the deadlock you describe in any way.