Bug #85208 A follow-up fix for buffer pool mutex split patch might be suboptimal, commit 2
Submitted: 27 Feb 2017 13:35 Modified: 1 Mar 2017 15:26
Reporter: Laurynas Biveinis (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0.0 OS:Any
Assigned to: CPU Architecture:Any

[27 Feb 2017 13:35] Laurynas Biveinis
Description:
Another follow-up commits to the buffer pool mutex split patch (bug 75534) is [1]. We have debugged what seems to be exact same bug and believe a fix with less locking is possible.

The bug is one thread storing a buffer pool block pointer for optimistic access, with all the buffer pool latches released, and another thread freeing the page and reading in another page. Then it becomes possible that the 1st thread calls buf_page_optimistic_get, buffer-fixes the page, the 2nd thread reads in a different data page and resets the buf fix count to zero, and, the 1st thread unfixes the page, resulting in the count of UINT32_MAX.

[1] avoids the above by putting buffer pool page initialisation into a block mutex critical section, serializing it with buf_page_optimistic_get. But it is enough to compare modify_clock value instead without adding any new extra serialization. It is bumped in buf_LRU_free_page too, noting the file page -> empty transition, in a buffer block mutex critical section. Thus by adding an early modify_clock check in the first block mutex critical section of buf_page_optimistic_get, we use the already-existing serialisation between buf_page_optimistic_get and buf_LRU_free_page critical sections, and whenever the latter executes before the former, we take an early exit out of buf_page_optimistic_get, avoiding any buf_fix_count manipulation altogether. If buf_page_optimistic_get runs first, then the page is buffer-fixed by leaving the critical section, and buf_LRU_free_page will not evict it.

I'd also replace the buf_fix_count reset to zero in buf_page_init_low with a release build assert checking the same.

(Also in [1], I'd check if the added assert "ut_ad(count + 1 != 0)" to catch buf_block_unfix overflow would not be subject to some type promotion rules making it always false - in my debugging I could get it to fire only by rewriting it as "ut_ad(count != UINT32_MAX);")

[1]:

commit cbc524750c486e9fde8266439979476964917bbb
Author: Shaohua Wang <shaohua.wang@oracle.com>
Date:   Wed Jun 8 10:29:10 2016 +0800

    BUG#23067038 ASSERTION FAILURE: BUF0BUF.CC:2861:BUF_PAGE_IN_FILE(BPAGE)
                 LEADS TO CORRUPT DATA
    
    It's a regression of wl#8423 InnoDB: Split the buffer pool mutex.
    
    There is a race: Thread 1, we set buf_fix_count to 0 in buf_page_init().
    Thread 2, we decrease buf_fix_count to 0xffffffff in buf_page_optimistic_get().
    Thread 2, we increase buf_fix_count to 0 in buf_page_get_gen(). Thread 3,
    we evict the page from LRU list. Thread 2, assert fails: buf_page_in_file().
    
    The root cause is missing block mutex protection for buf_page_init().
    
    Reviewed-by: Debarun Banerjee <debarun.banerjee@oracle.com>
    RB: 12456

How to repeat:
Code review
[27 Feb 2017 13:36] Laurynas Biveinis
See also bug 85205
[1 Mar 2017 15:26] Sinisa Milivojevic
Hi,

This bug is similar to #85205. Although my conclusions are the same.

Verified, based on the code analysis.