MySQL Bugs: #70406: buf_pool_get_dirty_pages_count() needs not to lock the buffer pool mutex

Bug #70406	buf_pool_get_dirty_pages_count() needs not to lock the buffer pool mutex
Submitted:	24 Sep 2013 9:31	Modified:	25 Sep 2013 7:28
Reporter:	Laurynas Biveinis (OCA)	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S3 (Non-critical)
Version:	5.6	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
buf_pool_get_dirty_pages_count() locks both buffer pool and its flush list mutexes.  But it only needs the latter.  The former was added by [1] to avoid seeing BUF_BLOCK_REMOVE_HASH pages in the flush list in the middle of relocation.  But there is nothing wrong in seeing them and in fact they can be considered to be BUF_BLOCK_FILE_PAGE here, i.e. check if dirty and bump count.  This would correct I believe and not unlike how buf_flush_validate_low() handles BUF_BLOCK_REMOVE_HASH pages.

I realize that redundant locking for one code path in debug builds is not a very critical bug, but am reporting it since happened to find it.

[1]

5.6$ bzr log -r 3690.34.32
------------------------------------------------------------
revno: 3690.34.32
committer: Sunny Bains <Sunny.Bains@Oracle.Com>
branch nick: trunk
timestamp: Thu 2012-05-31 19:09:30 +1000
message:
  Bug#14135054 ASSERT IN BUF_PAGE_IN_FILE(BPAGE) DURING DROP DATABASE
              
  Needs the buffer pool mutex to be held for the duration of the fetch count
  of pages matching some space id. When we relocate a page, first we change the
  state to remove from hash and then remove the page from the flush list.
  We release the page mutex after changing the state, therefore holding it
  is not sufficient. We must hold the buffer pool mutex while scanning the
  flush list. Note: This is debug code.
  					        
  Approved by Marko over IM

How to repeat:
Code analysis

Suggested fix:
In buf_pool_get_dirty_pages_count(), remove buf_pool_mutex_enter()/buf_pool_mutex_exit() calls, replace ut_ad(buf_page_in_file(...)) with ut_ad(... || buf_page_get_state(bpage) == BUF_BLOCK_REMOVE_HASH)

Hello Laurynas,

Thank you for the report.

Thanks,
Umesh

For the suggested fix, buf_pool_get_dirty_pages_count() also needs to replace buf_page_get_space() with direct bpage->space.

Which starts to look like a lot of trouble for a very minor bug :)