Description:
buf_pool_get_dirty_pages_count() locks both buffer pool and its flush list mutexes. But it only needs the latter. The former was added by [1] to avoid seeing BUF_BLOCK_REMOVE_HASH pages in the flush list in the middle of relocation. But there is nothing wrong in seeing them and in fact they can be considered to be BUF_BLOCK_FILE_PAGE here, i.e. check if dirty and bump count. This would correct I believe and not unlike how buf_flush_validate_low() handles BUF_BLOCK_REMOVE_HASH pages.
I realize that redundant locking for one code path in debug builds is not a very critical bug, but am reporting it since happened to find it.
[1]
5.6$ bzr log -r 3690.34.32
------------------------------------------------------------
revno: 3690.34.32
committer: Sunny Bains <Sunny.Bains@Oracle.Com>
branch nick: trunk
timestamp: Thu 2012-05-31 19:09:30 +1000
message:
Bug#14135054 ASSERT IN BUF_PAGE_IN_FILE(BPAGE) DURING DROP DATABASE
Needs the buffer pool mutex to be held for the duration of the fetch count
of pages matching some space id. When we relocate a page, first we change the
state to remove from hash and then remove the page from the flush list.
We release the page mutex after changing the state, therefore holding it
is not sufficient. We must hold the buffer pool mutex while scanning the
flush list. Note: This is debug code.
Approved by Marko over IM
How to repeat:
Code analysis
Suggested fix:
In buf_pool_get_dirty_pages_count(), remove buf_pool_mutex_enter()/buf_pool_mutex_exit() calls, replace ut_ad(buf_page_in_file(...)) with ut_ad(... || buf_page_get_state(bpage) == BUF_BLOCK_REMOVE_HASH)