| Bug #61341 | buf_LRU_insert_zip_clean can be O(N) on LRU length | ||
|---|---|---|---|
| Submitted: | 28 May 2011 15:37 | Modified: | 5 Jan 2012 20:15 | 
| Reporter: | Mark Callaghan | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: InnoDB Plugin storage engine | Severity: | S5 (Performance) | 
| Version: | 5.1.52 | OS: | Any | 
| Assigned to: | Marko Mäkelä | CPU Architecture: | Any | 
| Tags: | compressed, innodb | ||
   [28 May 2011 15:40]
   Mark Callaghan        
  I repeated some of the tests from https://www.facebook.com/notes/mysql-at-facebook/innodb-compression-for-read-only-workload... and noticed that, for the compressed-table tests, I had been warming the buffer pool by mistake before starting the test. After changing the test to use a cold buffer pool, I noticed the problem.
   [28 May 2011 16:04]
   Mark Callaghan        
  It takes between 15 and 30 minutes to get out of this state, which might be the amount of time it takes for the buffer pool to reach a steady state with length(unzip_LRU) at about 10% of length(LRU).
   [30 May 2011 12:29]
   Marko Mäkelä        
  Good catch, Mark. The funny thing is that the buf_pool->zip_clean list turns out not to be needed for anything. It was going to be used for resizing the buffer pool, a feature that has still not been implemented. I tried removing zip_clean. The only dependency is in buf_get_latched_pages_number(), which would no longer include any clean (non-dirty) compressed-only blocks in the count. Luckily, this is only exposed in debug builds:

```c
#ifdef UNIV_DEBUG
	export_vars.innodb_buffer_pool_pages_latched
		= buf_get_latched_pages_number();
#endif /* UNIV_DEBUG */
```
   [11 Nov 2011 6:08]
   Mark Callaghan        
  Is this fixed in 5.1 or 5.5? We are using compression on a few servers, and this causes lots of stalls that are hard to debug. It would be nice for others to get an official fix.
   [15 Nov 2011 12:26]
   Marko Mäkelä        
  Sorry Mark, it looks like we forgot to update the bug status. This was fixed already in June, by putting zip_clean behind #ifdef UNIV_DEBUG or UNIV_BUF_DEBUG. It was merged in this changeset:

```
revno: 3562
revision-id: kent.boortz@oracle.com-20110703154737-d27i4ypu2a0ran21
parent: kent.boortz@oracle.com-20110630153713-9mk02181m1d70g6o
parent: georgi.kodinov@oracle.com-20110607124316-rfzy0xt3uopdbkvm
committer: Kent Boortz <kent.boortz@oracle.com>
branch nick: mysql-5.1
timestamp: Sun 2011-07-03 17:47:37 +0200
```

Judging from the revno, this narrowly missed the 5.1.59 release.
   [5 Jan 2012 20:15]
   John Russell        
  Added to the 5.1.60 changelog: The InnoDB buffer pool management code was optimized for handling pages from compressed tables. This fixes a slowdown that could occur particularly during the warmup period for the buffer pool.

Description: I have been running benchmarks using a read-only workload on compressed InnoDB tables. The database is not cached in the InnoDB buffer pool, so the test initially stores a compressed and an uncompressed copy of every page (frame?) and then begins evicting the uncompressed copies when there are no more free pages.

I noticed a performance dip during the test. In this case I am running with two user connections, and at test start I get ~300 requests/second, which is expected given the disks I use. However, throughput drops to ~60 requests/second after a few minutes, and that should not occur. The common thread stack is:

```
buf_LRU_insert_zip_clean
buf_LRU_free_block
buf_LRU_search_and_free_block
buf_LRU_get_free_block
buf_buddy_alloc_low
buf_page_init_for_read
buf_read_page_low
buf_read_page
buf_page_get_gen
btr_cur_search_to_nth_level
row_search_for_mysql
ha_innobase::index_read
handler::index_read_idx_map
join_read_const
join_read_const_table
make_join_statistics
JOIN::optimize
mysql_select
handle_select
execute_sqlcom_select
mysql_execute_command
mysql_parse
dispatch_command
```

So I looked at buf_LRU_insert_zip_clean, and it has a do...while loop that can be O(N) on the length of the LRU. That will not scale. To confirm this is a problem I added code to count the number of loop iterations per call. Right now there are ~137,000 pages on the LRU, and my instrumentation is reporting many instances of 50,000 or more iterations of the loop per call.

```c
/********************************************************************//**
Insert a compressed block into buf_pool->zip_clean in the LRU order. */
UNIV_INTERN
void
buf_LRU_insert_zip_clean(
/*=====================*/
	buf_page_t*	bpage)	/*!< in: pointer to the block in question */
{
	buf_page_t*	b;

	ut_ad(buf_pool_mutex_own());
	ut_ad(buf_page_get_state(bpage) == BUF_BLOCK_ZIP_PAGE);

	/* Find the first successor of bpage in the LRU list
	that is in the zip_clean list. */
	b = bpage;
	do {
		b = UT_LIST_GET_NEXT(LRU, b);
	} while (b && buf_page_get_state(b) != BUF_BLOCK_ZIP_PAGE);

	/* Insert bpage before b, i.e., after the predecessor of b. */
	if (b) {
		b = UT_LIST_GET_PREV(list, b);
	}

	if (b) {
		UT_LIST_INSERT_AFTER(list, buf_pool->zip_clean, b, bpage);
	} else {
		UT_LIST_ADD_FIRST(list, buf_pool->zip_clean, bpage);
	}
}
```

How to repeat: Run a read-only workload on a compressed table. Watch throughput over time, starting with a cold buffer pool. Initially length(LRU) ~= length(unzip_LRU), but once uncompressed pages must be evicted from the unzip_LRU the problem seems to begin.