Bug #59214 increase rate limit for insert buffer prefetch reads
Submitted: 29 Dec 2010 15:45  Modified: 5 May 2011 23:48
Reporter: Mark Callaghan
Status: Closed
Category: MySQL Server: InnoDB Plugin storage engine  Severity: S5 (Performance)
Version: 5.1, 5.5  OS: Any
Assigned to: Inaam Rana  CPU Architecture: Any
Tags: buffer, full, innodb, insert

[29 Dec 2010 15:45] Mark Callaghan
Description:
On some workloads, once the insert buffer gets full it stays full. A full insert buffer hurts performance because:
1) it can no longer buffer changes, so it stops saving reads for secondary index maintenance
2) it occupies half of the buffer pool memory, which is a real problem

Prefetch reads to merge pending changes from the insert buffer use at most 5% of innodb_io_capacity. That limit is too low to drain a full insert buffer; this needs something better.

How to repeat:
Run the insert benchmark or other IO-bound loads that require a lot of secondary index maintenance. The failure of InnoDB to perform on this workload is one reason that users switch to write-optimized engines.

Suggested fix:
What I did in the Google and Facebook patches is:
1) provide a my.cnf variable for a soft limit on the max size of the insert buffer. The value is a percentage of buffer pool memory, in the range 1 to 50
2) provide a my.cnf variable that is the max percentage of innodb_io_capacity that can be used for prefetch reads when the soft limit has been exceeded

Reasonable defaults can be chosen for both parameters so that most users never need to set them and still avoid the full insert buffer problem.
[29 Dec 2010 16:09] Mark Callaghan
I wrote about this at http://www.facebook.com/notes/mysql-at-facebook/an-obscure-performance-problem-with-the-in...
[30 Dec 2010 14:01] Mikhail Izioumtchenko
I suspect that merging an insert buffer page just because we can
could also hurt performance when the database is bigger than the buffer
pool. Merging because we can results in N reads of secondary index pages,
which provokes flushing and evictions. We then end up recreating
what is essentially the same ibuf page for new insertions, and merge it
again because we can, with the same result.
If we leave the page alone we just continue inserting into it, which seems
less IO intensive.
Of course when the ibuf is left alone in such a manner it will eventually
fill up, with the consequences that Mark describes.
The solution of two extra variables looks OK to me, but it would be nice
to eventually consider some other ideas.
[25 Jan 2011 0:16] Mark Callaghan
Another note about the performance benefit of the insert buffer, assuming it is fixed as requested here

http://www.facebook.com/notes/mysql-at-facebook/something-awesome-in-innodb-the-insert-buf...
[2 Feb 2011 15:10] Inaam Rana
Mark,

Do you mean that once ibuf is full and is no longer used for buffering changes, it still keeps occupying half of the buffer pool? My understanding is that ibuf pages are treated no differently than regular data pages in the context of the buffer pool, and if the ibuf is not being used, its pages will eventually age out. Have you seen something in your benchmarks that points to possible pinning of ibuf pages in the buffer pool?
[2 Feb 2011 16:32] Mark Callaghan
To reproduce this run the insert benchmark on unmodified MySQL. I have done this many times myself. Once the ibuf is full performance drops off of a cliff.

This is fixed in the Google, Facebook and Percona versions of InnoDB. The fix is to increase the rate at which pages are flushed by srv_master_thread.

All that I know about the size is what SHOW INNODB STATUS displays. Is that the size of the on-disk tree? The number of pages in use for the buffer pool? After discussing this I assume it is the size of the on-disk structure. Regardless, performance goes bad when it gets too big whether or not those pages are in the buffer pool.

When the "size" value below gets large (50% of buffer pool pages) then performance falls off of a cliff.

-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 20823, free list len 15955, seg size 36779,

The low rate limit at which flushes are done by srv_master_thread (5% of innodb_io_capacity) is the problem.

This code in ibuf_insert_low is where it decides to stop being useful:

        if (ibuf->size >= ibuf->max_size + IBUF_CONTRACT_DO_NOT_INSERT) {
                /* Insert buffer is now too big, contract it but do not try
                to insert */

                mutex_exit(&ibuf_mutex);

#ifdef UNIV_IBUF_DEBUG
                fputs("Ibuf too big\n", stderr);
#endif
                /* Use synchronous contract (== TRUE) */
                ibuf_contract(TRUE);

                return(DB_STRONG_FAIL);
        }
[2 Feb 2011 21:02] James Day
Inaam, not pinned, just stale, I think.

Context is a workload that is continually changing pages and adding insert buffer entries at a high rate. That causes the insert buffer to expand to its maximum size, and the insert buffer merge rate isn't fast enough to keep the size under control.

Think of innodb_max_dirty_pages_pct and how it increases the work rate when the dirty page count gets large. Mark is effectively requesting the same sort of thing for the insert buffer, so the merge rate goes up when it gets near to its maximum size.

I haven't looked at the other solutions Mark referred to; they might have something neater than just a % full measure.

Normally the merge rate is too high for the servers I see and I want to make it lower, so we don't want to make it much higher all the time. Just when we know it's a stress point because the insert buffer has approached the size cap.
[7 Feb 2011 20:46] Inaam Rana
James, Mark,

I agree that no change buffering can hurt performance, particularly in the context of an insert benchmark. What I was trying to clarify was that, of the two reasons Mark mentioned, the second one ("that ibuf occupies half of the buffer pool") is perhaps not accurate over a period of time, as ibuf pages will age out of the buffer pool.

Based on Mark's suggestions I have a simple solution for this in mind.

innodb_change_buffer_max_size = 1 - 50 (default 25)

for prefetch ratio:

if (ibuf::size < ibuf::max_size / 2)
  use PCT_IO(5) // same as current; good enough for most workloads
else
  use PCT_IO(25) to PCT_IO(50), depending on how far above the LWM of ibuf::max_size/2 we are

I use PCT_IO(50) at the top end because typically an ibuf merge will involve not only a read of the page to merge but also a write once the merge takes place.

Does that sound reasonable? It has the advantage of exposing only one knob to the user, meaning fewer things to worry about. The disadvantage is that the user/DBA can't control the prefetch rate, i.e., if a user wants to keep the prefetch rate low even in the midst of high ibuf activity, he can't do that. But then why would someone want to, given that having ibuf space available for buffering is always a winner compared with plain inserts?
[9 Feb 2011 4:17] Mark Callaghan
I wasn't aware that ibuf pages could be paged out like any other pages.

Having 2 rates would be a big win over the current behavior.
[10 Feb 2011 15:04] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/131047

3495 Inaam Rana	2011-02-10
      Bug #59214 :increase rate limit for insert buffer prefetch reads
      
      rb://593
      approved by: Marko
      
      1) Make ibuf size configurable. Current behavior is to have a maximum
      on-disk size equal to half the buffer pool size. This patch introduces
      a new parameter:
      
      innodb_change_buffer_max_size = default 25, min 0, max 50. The value
      is a percentage of the buffer pool size. Dynamic.
      
      2) Make ibuf merge more aggressive:
      * while ibuf::size < ibuf::max_size / 2, merge at PCT_IO(5), same as
        current; this should be good for most workloads.
      * when ibuf::size > ibuf::max_size / 2, merge at PCT_IO(5) +
        PCT_IO(excess difference in % of max_size)
[5 May 2011 23:48] John Russell
Added to 5.6.2 changelog:

This fix introduces a new configuration option,
innodb_change_buffer_max_size, which defines the size of the change
buffer as a percentage of the size of the buffer pool. Because the
change buffer shares memory space with the buffer pool, a workload
with a high rate of DML operations could cause pages accessed by
queries to age out of the buffer pool sooner than desirable. This fix
also devotes more I/O capacity to flushing entries from the change
buffer when it exceeds 1/2 of its maximum size.