Bug #62534 off by one error in innodb_max_dirty_pages_pct logic
Submitted: 25 Sep 2011 12:46
Modified: 13 Mar 2014 14:27
Reporter: Domas Mituzas
Status: Closed
Category: MySQL Server: InnoDB storage engine
Severity: S3 (Non-critical)
Version: 5.1-head, 5.5-head
OS: Any
Assigned to: Marko Mäkelä
CPU Architecture: Any
Tags: Contribution
Triage: Needs Triage: D3 (Medium)

[25 Sep 2011 12:46] Domas Mituzas
Description:
buf_get_modified_ratio_pct() returns int
srv_max_buf_pool_modified_pct is unsigned int

current max dirty pages enforcement logic is:

if (buf_get_modified_ratio_pct() > srv_max_buf_pool_modified_pct) {

that means that buf_get_modified_ratio_pct() has to be at least 1% for page flushing to kick in, so it is not possible to drain large buffer pools using innodb_max_dirty_pages_pct=0

1% at 200G buffer pool is 2G of dirty pages, which can often be beyond log capacity ... ;-)
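
The effect of the strict comparison can be sketched in isolation. This is a minimal illustration, not the InnoDB source: modified_ratio_pct(), should_flush_gt() and should_flush_ge() are hypothetical helpers, and the whole-percent integer division is an assumption based on the integer types noted above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for buf_get_modified_ratio_pct(): dirty pages as a
   whole-number percentage of the pool. Integer division truncates, so
   anything below 1% comes back as 0. Assumes total_pages > 0. */
static unsigned long modified_ratio_pct(unsigned long dirty_pages,
					unsigned long total_pages)
{
	return (100 * dirty_pages) / total_pages;
}

/* Current check: strictly greater than the limit. */
static bool should_flush_gt(unsigned long pct, unsigned long max_pct)
{
	return pct > max_pct;
}

/* Proposed check: greater than or equal to the limit. */
static bool should_flush_ge(unsigned long pct, unsigned long max_pct)
{
	return pct >= max_pct;
}
```

With a pool of 1,000,000 pages and 9,999 dirty ones, modified_ratio_pct() yields 0, so should_flush_gt(0, 0) is false while should_flush_ge(0, 0) is true.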

How to repeat:
set innodb_max_dirty_pages_pct=0, observe that up to 1% of pages stay dirty

Suggested fix:
--- storage/innobase/srv/srv0srv.c	2011-07-19 14:54:59 +0000
+++ storage/innobase/srv/srv0srv.c	2011-09-25 12:45:09 +0000
@@ -2780,7 +2780,7 @@
 		}
 
 		if (UNIV_UNLIKELY(buf_get_modified_ratio_pct()
-				  > srv_max_buf_pool_modified_pct)) {
+				  >= srv_max_buf_pool_modified_pct)) {
 
 			/* Try to keep the number of modified pages in the
 			buffer pool under the limit wished by the user */
@@ -3006,7 +3006,7 @@
 
 	log_checkpoint(TRUE, FALSE);
 
-	if (buf_get_modified_ratio_pct() > srv_max_buf_pool_modified_pct) {
+	if (buf_get_modified_ratio_pct() >= srv_max_buf_pool_modified_pct) {
 
 		/* Try to keep the number of modified pages in the
 		buffer pool under the limit wished by the user */
[25 Sep 2011 17:18] Valeriy Kravchuk
Thank you for the problem report and code contributed.

IBM Informix ended up with decimal values for a similar variable (LRU_MAX_DIRTY), so they can start flushing pages from their "buffer pool" when, say, 0.1% of them are dirty...
[25 Sep 2011 21:58] James Day
There's merit to having at least a single decimal place in this calculation. I've had to choose between low single-digit values of innodb_max_dirty_pages_pct above 0 for customer tuning and wasn't able to get as close as I wanted to my target. Maybe values of 1000 and higher => (value - 1000) / 10, so 1001 => 0.1% dirty. That's beyond the minimal fix for this bug in 5.1 though.
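
A rough sketch of how such an encoded setting might be decoded, taking the 1001 => 0.1% example as the target, which implies a divisor of 10 (an assumption of this sketch). decode_max_dirty_pct() is a hypothetical name and nothing like it is claimed to exist in the server.

```c
/* Hypothetical decoder for the encoding floated above. Values below 1000
   are taken as whole percentages; values of 1000 and above carry tenths of
   a percent. The divisor of 10 is an assumption chosen so that 1001 maps
   to 0.1. */
static double decode_max_dirty_pct(unsigned long value)
{
	if (value >= 1000) {
		return (double) (value - 1000) / 10.0;
	}

	return (double) value;
}
```

Under these assumptions 75 still means 75%, 1000 means 0%, and 1005 means 0.5%.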
[26 Sep 2011 10:28] Domas Mituzas
for what it's worth, it can be a float ;-)
[27 Sep 2011 0:15] Mark Callaghan
patch for MySQL 5.1.52 and the FB patch

Attachment: b62534.patch (application/octet-stream, text), 18.55 KiB.

[27 Sep 2011 0:16] Mark Callaghan
I attached a patch that applies cleanly to MySQL 5.1.52 after the FB patch for mysql has been applied. Anyone want to review it for me?
[13 Mar 2014 14:27] Daniel Price
Fixed as of 5.7.5, and here's the changelog entry:

Setting "innodb_max_dirty_pages_pct=0" would leave up to 1% of pages dirty
and unflushed. Buffer pool flushing is initiated when the percentage of dirty
pages is greater than "innodb_max_dirty_pages_pct". The internal variables
that store the "innodb_max_dirty_pages_pct" value and the percentage of dirty
pages ("srv_max_buf_pool_modified_pct" and "buf_get_modified_ratio_pct",
respectively) were defined as unsigned integer data types, which meant that a
"innodb_max_dirty_pages_pct" value of 0 required a dirty pages percentage
of 1 or greater to initiate buffer pool flushing.

To address this problem, the "buf_get_modified_ratio_pct" and
"srv_max_buf_pool_modified_pct" internal variables are redefined as double
data types, which changes the range value for "innodb_max_dirty_pages_pct"
and "innodb_max_dirty_pages_pct_lwm" from "0 .. 99" to "0 .. 99.99".
Additionally, buffer pool flushing is now initiated when the percentage of
dirty pages is "greater than or equal to" "innodb_max_dirty_pages_pct".

Thank you for the bug report.
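
What the entry describes (double-typed values compared with "greater than or equal to") can be sketched as follows. This is only an illustration under stated assumptions: modified_ratio_pct_dbl() and should_flush_dbl() are hypothetical names, not the functions in the server source.

```c
#include <stdbool.h>

/* Hypothetical sketch: dirty-page percentage kept as a double, so
   fractions of a percent survive. Assumes total_pages > 0. */
static double modified_ratio_pct_dbl(unsigned long dirty_pages,
				     unsigned long total_pages)
{
	return 100.0 * (double) dirty_pages / (double) total_pages;
}

/* Flush when the dirty percentage is greater than or equal to the limit. */
static bool should_flush_dbl(double pct, double max_pct)
{
	return pct >= max_pct;
}
```

With 6,000 dirty pages out of 1,000,000 (about 0.6%), a limit of 0.5 triggers flushing in this sketch, while 4,000 dirty pages (about 0.4%) do not.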
[2 Apr 2014 13:37] Daniel Price
Posted by developer:
 
In addition to 5.7.5, this bug is now fixed in 5.6.18. The changelog entry is now documented in both the 5.6.18 and 5.7.5 release notes.
[2 Jun 2014 14:03] Laurynas Biveinis
$ bzr log -r 5878
------------------------------------------------------------
revno: 5878
committer: Aditya A <aditya.a@oracle.com>
branch nick: mysql-5.6
timestamp: Wed 2014-04-02 10:50:50 +0530
message:
  Bug #13029450	OFF BY ONE ERROR IN INNODB_MAX_DIRTY_PAGES_PCT
  		LOGIC
  
  If the percentage of dirty pages in the buffer pool
  exceeds innodb_max_dirty_pages_pct (set by the user)
  then we flush the pages. If user sets
  innodb_max_dirty_pages_pct=0, then the flushing mechanism
  will not kick in unless the percentage of dirty pages
  reaches at least 1%. For huge buffer pools even 1% of the
  buffer pool can be a huge number.
  
  FIX
  ---
  
  Flush the dirty pages in buffer pool if percentage of dirty 
  pages is greater than zero and innodb_max_dirty_pages_pct
  is set to zero.
  
  [Approved by vasil #rb4776]
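
The rule in the FIX section of the commit message can be sketched as below. It is modeled on the message text only: should_flush_56() is a hypothetical name, and passing the dirty percentage as a double is an assumption made so that "greater than zero" is observable below 1%. The message does not say how the real code represents it.

```c
#include <stdbool.h>

/* Hypothetical condition modeled on the commit message only. dirty_pct is
   a double here so that "greater than zero" is observable below 1%; the
   type used by the real code is not stated in the message. */
static bool should_flush_56(double dirty_pct, unsigned long max_pct)
{
	if (max_pct == 0) {
		/* Special case from the FIX section: any dirty pages at all. */
		return dirty_pct > 0.0;
	}

	/* Otherwise the pre-existing rule: exceeds the configured limit. */
	return dirty_pct > (double) max_pct;
}
```

In this sketch a limit of 0 flushes as soon as anything is dirty, while nonzero limits keep the strict "exceeds" comparison described at the top of the message.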
[2 Jun 2014 14:50] Laurynas Biveinis
The 5.6 fix description is wrong in RNs. The variables were not changed, they are still unsigned integers in 5.6. It only changed that setting innodb_max_dirty_pages_pct=0 works as expected, nothing else. Bug 72837.
[2 Jun 2014 18:25] Daniel Price
The 5.6.18, 5.7.5 changelog entry has been revised as follows:

With "innodb_max_dirty_pages_pct=0" buffer pool flushing would not be
initiated until the percentage of dirty pages reached at least 1%, which
would leave up to 1% of dirty pages unflushed.
[2 Jun 2014 18:26] Daniel Price
Disregard previous post. It was intended for a different bug. Thank you.
[2 Jun 2014 18:28] Daniel Price
Please disregard previous post.

The 5.6.18, 5.7.5 changelog entry has been revised as follows:

With "innodb_max_dirty_pages_pct=0" buffer pool flushing would not be
initiated until the percentage of dirty pages reached at least 1%, which
would leave up to 1% of dirty pages unflushed.