Bug #74637 make dirty page flushing more adaptive
Submitted: 30 Oct 2014 14:39 Modified: 30 Oct 2014 16:54
Reporter: Inaam Rana (OCA)
Status: Verified
Category: MySQL Server: InnoDB storage engine  Severity: S5 (Performance)
Version: 5.6  OS: Any
Assigned to:  CPU Architecture: Any

[30 Oct 2014 14:39] Inaam Rana
Description:
Currently, flushing works as follows: the page_cleaner thread wakes up every second and:
* flushes (or evicts) pages from the LRU list, based on innodb_lru_scan_depth
* flushes pages from the flush_list (the dirty page list), if:
  * we are constrained by redo log space (capped by innodb_io_capacity), or
  * we are nearing or have crossed max_dirty_page_percent (capped by innodb_io_capacity)
One thing we can improve is to make the decision about which dirty pages to flush smarter. As mentioned above, the flushing done to limit the number of dirty pages currently happens from the flush_list, where pages are ordered by LSN. This is fine in most cases, as it moves the oldest LSN forward and keeps checkpoints smooth.
But if the system is not constrained by redo log space and the dirty page percentage is high, it makes more sense to do dirty page flushing from the LRU tail as well. This can help avoid single page flushing.
To clarify, take the example of a system that is under no redo space pressure and has already crossed the dirty page percentage threshold. Assume the system is perfectly tuned to give the page_cleaner exactly one second's worth of work every second: for example, we flush/evict 8K pages from the LRU (innodb_lru_scan_depth = 1024 and 8 buffer pool instances, i.e. the default values) and we flush 10K pages from the flush_list (innodb_io_capacity = 10000). Assume LRU flushing takes 0.4 seconds and dirty page flushing takes 0.6 seconds. If we are reading in pages at, say, 15000 per second, then LRU flushing provides us with 8K free pages, but for the remaining 0.6 seconds the page_cleaner flushes from the flush_list, which is in LSN order, and the threads trying to read in pages have to resort to single page flushing.

How to repeat:
see code

Suggested fix:
Try to make dirty page flushing more adaptive: if we are under LRU pressure, flush from the LRU list; if we are under redo space pressure, flush from the flush_list.
[30 Oct 2014 16:54] Sinisa Milivojevic
This is a feature request that would improve performance of dirty page flushing.

I find it smart; it is almost a cost-based optimization.
[31 Oct 2014 2:16] zhai weixiang
For LRU list flushing, I think Percona Server has a smarter strategy: a separate LRU flushing thread whose flushing frequency is based on the length of the free list.

For flush_list flushing, I think it is better to take the current redo age into consideration when deciding the page cleaner's sleep time. For example, here is a simple function I added to calculate the sleep time:

static
ulint
page_cleaner_adapt_flush_sleep_time(void)
/*=====================================*/
{
        lint sleep_time = 1000; /* default: sleep one second */

        if (srv_pc_adaptive_sleep) {
                lint pct = 0;
                /* redo age: how far the oldest dirty page lags the LSN */
                lsn_t age = log_get_lsn() - buf_pool_get_oldest_modification();

                if (age > log_sys->max_modified_age_sync / 2) {
                        if (age > log_sys->max_modified_age_sync) {
                                /* past the sync point: do not sleep at all */
                                sleep_time = 0;
                        } else {
                                /* quadratic ramp-down from 1000 ms toward 0
                                as age approaches max_modified_age_sync */
                                pct = (age * 100) / log_sys->max_modified_age_sync;
                                sleep_time = 1000 - (pct * pct) / (10 * srv_pc_sleep_factor);

                                if (sleep_time < 0)
                                        sleep_time = 0;
                        }
                }
        }

        return sleep_time;
}
[3 Nov 2014 14:15] Laurynas Biveinis
For the record I fully agree with Zhai :)

Page cleaner has no business mixing LRU and flush list flushing heuristics, and their serialisation there is artificial. This is not solved by the current 5.7 multi-threaded flushing either. As for the single-page flushes, remove their code altogether: if there are no free pages, wait until the cleaner produces some (which it will do rather quickly if it has proper heuristics and is not hindered by the query threads attempting single page flushes).

Related bug 70500
[22 Jan 2016 16:45] Sasha Pachev
Proposed patch to address the issues pointed out by Inaam, plus moving LRU flushing into a separate thread.

Attachment: flush.patch (text/x-patch), 19.66 KiB.

[22 Jan 2016 16:49] Sasha Pachev
The patch above attempts to correct the dirty page flushing schedule and adds several innodb_metrics variables to keep track of its work. A basic test case is included.
[25 Jan 2016 8:55] Laurynas Biveinis
Sasha, your patch appears not to move LRU flushing to a separate thread, but rather to create a separate LRU flushing thread *and* make the page cleaner thread do LRU flushing too when there is no log free space pressure. I assume this is on purpose; do you know how much better this is than separate-thread LRU flushing only?

Other comments:
- The LRU thread performs LRU flushing only on an "active" server. Have you checked bug 70500?
- If the LRU thread also works while srv_shutdown_state == SRV_SHUTDOWN_CLEANUP, it will provide free pages for purge etc. during shutdown too; we found this to have a beneficial effect.
- If the LRU thread is created regardless of srv_read_only_mode, it will also evict old pages efficiently in I/O-bound read-only workloads.
- The buf_flush_lru_is_active variable is write-only.