Bug #70500 | Page cleaner should perform LRU flushing regardless of server activity | ||
---|---|---|---|
Submitted: | 3 Oct 2013 6:29 | Modified: | 12 Sep 2018 12:22 |
Reporter: | Laurynas Biveinis (OCA) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S3 (Non-critical) |
Version: | 5.6.23 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
Tags: | cleaner, Contribution, flushing |
[3 Oct 2013 6:29]
Laurynas Biveinis
[11 Oct 2013 10:17]
Sveta Smirnova
Thank you for the report. Please provide an example of the instability that the current behavior causes.
[11 Nov 2013 13:47]
James Day
Bug #68481 is a related bug. It has some description of how this breaks things, and a blog post discussing it. James Day, MySQL Senior Principal Support Engineer, Oracle
[12 Nov 2013 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[12 Nov 2013 8:46]
Laurynas Biveinis
I have asked Alexey Stroganov to provide feedback on this bug. Until he does, maybe the following will be of some (small) help. If the analysis is correct, then purge on an otherwise idle server that needs to access cold data should work purely by single page flushes. Thus I'd try the following:
1) Set up a smallish buffer pool and monitor buffer_LRU_single_flush_* in INNODB_METRICS.
2) Open a transaction to force history length growth.
3) Run any RW workload for long enough; Sysbench should be fine.
4) Force the buffer pool contents out by running a RO workload on another table, or similar.
5) Stop all the workload and close the open transaction. If the analysis is correct, single page LRU flush counters should grow as purge progresses.
[14 Jan 2014 19:05]
Inaam Rana
I think (not sure though) I have seen this recently. Make a cold start. Run CHECK TABLE on a data set large enough that it does not fit in memory. CHECK TABLE, IIRC, won't increment the activity count, and once we run out of pages, single page flushing will kick in. I believe the suggested solution, to decouple LRU flushing from the activity count, is a good idea.
Having said this, I don't think this should be an issue with sysbench, as any DML should increment the activity count. Also, even when the count is somehow not incremented, the page_cleaner ends up doing a full capacity flush list batch. This should help keep the number of dirty pages fairly low, if not zero. What this means is that single page flushing will likely end up just evicting a clean page from the LRU and not doing any actual flushing.
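To make the scenario above concrete, here is a toy C++ model of it. This is not InnoDB code: ToyServer, ToyCleaner, read_cold_page and simulate are hypothetical names, and the batch size and read rate are arbitrary. It only illustrates the claim that a reader which does not bump the activity count falls back to single page flushes when the cleaner's LRU batch is gated on activity.

```cpp
#include <cassert>
#include <cstddef>

// Toy model only; every name below is hypothetical, not an InnoDB symbol.
struct ToyServer {
  std::size_t activity_count = 0;       // bumped by DML-like work only
  std::size_t free_pages = 0;           // pages available without eviction
  std::size_t single_page_flushes = 0;  // evictions done by the reader itself
};

struct ToyCleaner {
  bool lru_needs_activity = true;  // true models the reported behaviour
  std::size_t last_activity = 0;
  std::size_t lru_batch = 4;       // pages freed by one LRU batch

  // One iteration of the cleaner loop.
  void tick(ToyServer &s) {
    const bool active = s.activity_count != last_activity;
    last_activity = s.activity_count;
    // LRU batch runs only on activity, unless the gating is removed.
    if (active || !lru_needs_activity) s.free_pages += lru_batch;
  }
};

// A CHECK TABLE-like reader: needs a page, never touches activity_count.
inline void read_cold_page(ToyServer &s) {
  if (s.free_pages == 0) {
    ++s.single_page_flushes;  // no free page: the reader evicts one itself
  } else {
    --s.free_pages;
  }
}

// Runs the model and returns how many single page flushes the reader did.
inline std::size_t simulate(bool lru_needs_activity, int iterations) {
  ToyServer s;
  ToyCleaner c;
  c.lru_needs_activity = lru_needs_activity;
  for (int i = 0; i < iterations; ++i) {
    c.tick(s);
    for (int r = 0; r < 3; ++r) read_cold_page(s);
  }
  return s.single_page_flushes;
}
```

In this model simulate(true, 10) leaves every page request to the reader, while simulate(false, 10) never needs a single page flush because the cleaner keeps the free list stocked. The model does not say whether the evicted pages are dirty, which is the point made in the second half of the comment above.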
[27 Jan 2014 16:48]
Inaam Rana
Simple patch to do LRU flushing regardless of server activity
Attachment: 305_v1.patch (application/octet-stream, text), 1.42 KiB.
[10 Mar 2014 12:52]
Laurynas Biveinis
Related bug 71988.
[4 Nov 2014 6:34]
Laurynas Biveinis
Related bug 74637
[4 Nov 2014 11:30]
Laurynas Biveinis
Bug 70500 5.7 fix attempt (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: bug70500.patch (application/octet-stream, text), 4.89 KiB.
[4 Nov 2014 11:35]
Laurynas Biveinis
I have uploaded a 5.7 fix attempt for this. This patch is more for discussion than anything else. I have tested that it does not crash the server, but nothing else. Some thoughts:
- Trying to fix this for 5.7 shows that the current multithreaded flushing design is not particularly good for splitting LRU and flush list flushing heuristics. One cannot queue LRU flush requests and later queue flush list flushing requests on top of them. First the LRU requests are issued and waited on until they complete, and only then are the flush list flushing requests issued. This serialisation is needless, but not that easy to avoid with the current code. The proper fix would be to fix bug 74637 to split LRU/flush list flushing completely.
- It would be great if the public testsuite contained at least one MTR test that runs with multiple page cleaners. I tested this using --mysqld=--innodb-page-cleaners=X --mysqld=--innodb-buffer-pool-instances=Y --mysqld=--innodb-buffer-pool-size=1G args, but the testsuite run is not fully ready for these options.
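The cost of that serialisation can be illustrated with a toy C++ cost model. It is a sketch under stated assumptions, not a description of the 5.7 code: InstanceCost, serialized_cost and pipelined_cost are hypothetical names, each buffer pool instance is assumed to have its own worker, and the costs are arbitrary time units.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-instance flushing costs, in arbitrary time units.
struct InstanceCost {
  int lru;         // time for this instance's LRU batch
  int flush_list;  // time for this instance's flush list batch
};

// All LRU requests must complete before any flush list request is issued,
// so the slowest LRU batch and the slowest flush list batch add up.
inline int serialized_cost(const std::vector<InstanceCost> &v) {
  int max_lru = 0;
  int max_flush_list = 0;
  for (const InstanceCost &c : v) {
    max_lru = std::max(max_lru, c.lru);
    max_flush_list = std::max(max_flush_list, c.flush_list);
  }
  return max_lru + max_flush_list;
}

// Assumption: one worker per instance, and each worker moves on to its
// flush list batch as soon as its own LRU batch is done (no global barrier).
inline int pipelined_cost(const std::vector<InstanceCost> &v) {
  int worst = 0;
  for (const InstanceCost &c : v) {
    worst = std::max(worst, c.lru + c.flush_list);
  }
  return worst;
}
```

With two instances costing {5, 1} and {1, 5}, the serialised schedule takes 10 units and the barrier-free one takes 6. Splitting LRU and flush list flushing into separate threads, as bug 74637 proposes, would be a different schedule again and is not modelled here.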
[4 Nov 2014 11:36]
Laurynas Biveinis
The patch was made on top of my contributed patch for bug 71411.
[5 Dec 2014 20:09]
Sveta Smirnova
Thank you for the feedback and contributions. Verified as described using CHECK TABLE test case.
[12 Feb 2018 15:34]
Laurynas Biveinis
MT LRU flusher, main patch for 8.0.4
Attachment: bug70500-8.0.4-1.patch (application/octet-stream, text), 37.15 KiB.
[12 Feb 2018 15:35]
Laurynas Biveinis
MT LRU flusher for 8.0.4, testsuite updates
Attachment: bug70500-8.0.4-2.patch (application/octet-stream, text), 2.10 KiB.
[15 Mar 2018 4:46]
Laurynas Biveinis
Bug 70500 fix for 8.0.4, missed shutdown synchronization bit
Attachment: bug70500-8.0.4-3.patch (application/octet-stream, text), 990 bytes.
[13 Jun 2018 12:45]
Laurynas Biveinis
MT LRU flusher for 8.0.11
Attachment: bug70500-8.0.11.patch (application/octet-stream, text), 46.52 KiB.
[16 Jul 2018 8:13]
Roel Van de Paar
https://www.percona.com/blog/2018/07/13/on-mysql-and-intel-optane-performance/
[12 Sep 2018 12:22]
Laurynas Biveinis
The contributed patch has a bad merge error in that it only starts a single LRU thread regardless of instance count. The tail of buf_flush_page_init should look along the lines of (not tested):
    for (decltype(srv_buf_pool_instances) i = 0; i < srv_buf_pool_instances; i++)
      os_thread_create(buf_lru_manager_thread_key, buf_lru_manager_thread);
    /* Make sure page cleaner and LRU managers are active. */
    while (!buf_page_cleaner_is_active ||
           buf_lru_manager_running_threads.load(std::memory_order_relaxed) <
               srv_buf_pool_instances) {
      os_thread_sleep(10000);
    }
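The same start-and-wait pattern can be tried outside the server with standard C++ threads. The following is a sketch, not the patch: running_lru_managers, shutdown_requested, lru_manager_thread and start_and_stop_lru_managers are hypothetical stand-ins, and std::thread replaces os_thread_create.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the server globals used in the snippet above.
static std::atomic<std::size_t> running_lru_managers{0};
static std::atomic<bool> shutdown_requested{false};

// Each manager registers itself, idles until shutdown, then deregisters.
static void lru_manager_thread() {
  running_lru_managers.fetch_add(1, std::memory_order_relaxed);
  while (!shutdown_requested.load(std::memory_order_relaxed)) {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
  running_lru_managers.fetch_sub(1, std::memory_order_relaxed);
}

// Starts one manager per instance, waits until all of them have registered,
// then shuts them down. Returns the number of managers seen running.
inline std::size_t start_and_stop_lru_managers(std::size_t instances) {
  std::vector<std::thread> threads;
  for (std::size_t i = 0; i < instances; i++) {
    threads.emplace_back(lru_manager_thread);
  }

  // Startup barrier: poll until every manager is active.
  while (running_lru_managers.load(std::memory_order_relaxed) < instances) {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
  const std::size_t observed =
      running_lru_managers.load(std::memory_order_relaxed);

  shutdown_requested.store(true, std::memory_order_relaxed);
  for (std::thread &t : threads) t.join();
  shutdown_requested.store(false, std::memory_order_relaxed);  // allow reuse
  return observed;
}
```

Calling start_and_stop_lru_managers(4) should report four running managers. If only one thread were created, as in the bad merge, the polling loop would never finish for any instance count above one.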