Bug #75503 buf_pool_watch_set would crash trying to use a second watch page per instance?
Submitted: 14 Jan 2015 11:27 Modified: 23 Jun 2015 13:52
Reporter: Laurynas Biveinis (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:all OS:Any
Assigned to: CPU Architecture:Any
Tags: innodb

[14 Jan 2015 11:27] Laurynas Biveinis
Description:
Each buffer pool instance has an array watch[] of buf_page_t. They are BUF_BLOCK_ZIP_PAGE when free and BUF_BLOCK_POOL_WATCH when used as a set watch sentinel.

Then we have the following code in buf_pool_watch_set:
...
	for (i = 0; i < BUF_POOL_WATCH_SIZE; i++) {
		bpage = &buf_pool->watch[i];
...
		switch (bpage->state) {
		case BUF_BLOCK_POOL_WATCH:
/* set watch */
			return(NULL);
		case BUF_BLOCK_ZIP_PAGE:
...
			break;
...
		}
	}
...
	ut_error;

Which shows that buf_pool_watch_set should crash when trying to set a watch on an instance with one watch already set.

How to repeat:
Code reading. Maybe some workload with innodb_buffer_pool_instances=1 innodb_purge_threads=32.

Suggested fix:
Looks like a typo: s/break/continue
[16 Jan 2015 19:38] Laurynas Biveinis
The patch uploaded to bug 75534 includes a fix for this bug.
[27 Jan 2015 13:51] MySQL Verification Team
This is indeed a bug, hence it is verified. It would be very hard to make a test case.

We here have a break within the  switch (bpage->state) {...}, but the for (;;) loop continues thereafter. So, the only way to repeat a bug would be when ALL buffer pool pages are in the state  BUF_BLOCK_ZIP_PAGE. In that case ONLY, we would have a crash.

Still, it should be fixed.
[28 Jan 2015 11:52] Laurynas Biveinis
Ah, my bad, the "break" of course belongs to switch, not for. There is no bug here then. Leaving Verified as you want to fix the case of a buffer pool instance having all watch pages used up (which is impossible right now?).
[20 Mar 2015 8:35] Marko Mäkelä
Posted by developer:
 
FWIW, originally there was only watch sentinel object. Later, when multiple purge threads were introduced, the array was introduced too, so all purge threads can set a watch concurrently if needed.
Thiru claimed that he double-checked the logic that a watch that has been set will always be reset before returning from the purge operation. So, it is not possible that the number of watches exceeds the number of purge threads.

Therefore, we can close this bug by adding a comment for clarification.
[8 Apr 2015 19:43] Daniel Price
Posted by developer:
 
Not a bug. The only change is the addition of a code comment. No changelog entry required.
[23 Jun 2015 13:52] Laurynas Biveinis
commit f8eacccf2a56e86ed5c6413481d5ea918b9eca1d
Author: Thirunarayanan Balathandayuthapani <thirunarayanan.balathandayuth@oracle.com>
Date:   Tue Mar 24 14:09:18 2015 +0530

    Bug#20422680        BUF_POOL_WATCH_SET WOULD CRASH TRYING
                        TO USE A SECOND WATCH PAGE PER INSTANCE
    
    Description:
        BUF_POOL_WATCH_SIZE is also initialized to number of purge threads.
    so BUF_POOL_WATCH_SIZE will never be lesser than number of purge threads.
    From the code, there is no scope for purge thread to skip buf_pool_watch_unset.
    So there can be at most one buffer pool watch active per purge thread.
    In other words, there is no chance for purge thread instance to hold a watch
    when setting another watch.
    
    Solution:
        Adding code comments to clarify the issue.
    
    Reviewed-by: Marko Makela <marko.makela@oracle.com>
    Approved via Bug page.