Description:
In production, we observed that the last record of the previous (left) page was
greater than the first record of the next (right) page.
DB_AdIndex_97/Tbl_AdIndex_0 is a compressed table:
```
2026-06-15T06:47:25.330933+08:00 0 [ERROR] InnoDB: btr_check_sibling_boundary:
last record on left page >= first record on right page! index
`FUId_FCreativeTemplateId` table DB_AdIndex_97/Tbl_AdIndex_0 left_page_no 14802482
right_page_no 14802487
2026-06-15T06:47:25.330960+08:00 0 [ERROR] InnoDB: left page last record:
PHYSICAL RECORD: n_fields 6; compact format; info bits 0
0: len 8; hex 0000000004fb21f9; asc ! ;;
1: len 4; hex 000002d1; asc ;;
2: len 8; hex 80000018e4c63d8d; asc = ;;
3: len 8; hex 800000191edf46c3; asc F ;;
4: len 8; hex 800000191edfc5b0; asc ;;
5: len 8; hex 800000191edfc15e; asc ^;;
2026-06-15T06:47:25.331085+08:00 0 [Note] InnoDB: n_owned: 0; heap_no: 166; next rec: 112
2026-06-15T06:47:25.331089+08:00 0 [ERROR] InnoDB: right page first record:
PHYSICAL RECORD: n_fields 6; compact format; info bits 0
0: len 8; hex 0000000004fb21f9; asc ! ;;
1: len 4; hex 000002d1; asc ;;
2: len 8; hex 80000018e4c63d8d; asc = ;;
3: len 8; hex 8000000000000000; asc ;;
4: len 8; hex 8000000000000000; asc ;;
5: len 8; hex 800000191edfcfb8; asc ;;
2026-06-15T06:47:25.331207+08:00 0 [Note] InnoDB: n_owned: 0; heap_no: 2; next rec: 174
```
Note: `btr_check_sibling_boundary` is a check we added to verify the order of
records between adjacent pages.
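To make the violation in the log concrete, here is a minimal sketch that compares the two logged records field by field. It is not the InnoDB comparison code: `Record`, `compare_records`, and `sibling_boundary_ok` are hypothetical names, and the sketch assumes every field can be compared as an unsigned big-endian value, which holds here because fields at the same position have equal length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an index record. Each field holds the value
// printed in the "hex" column of the log; fields at the same position have
// equal length, so unsigned integer order matches bytewise order.
using Record = std::vector<std::uint64_t>;

// Lexicographic comparison: negative if a < b, zero if equal, positive if a > b.
int compare_records(const Record& a, const Record& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) {
            return a[i] < b[i] ? -1 : 1;
        }
    }
    if (a.size() == b.size()) {
        return 0;
    }
    return a.size() < b.size() ? -1 : 1;
}

// Hypothetical boundary check: the ordering holds only when the last record
// on the left page is strictly smaller than the first record on the right page.
bool sibling_boundary_ok(const Record& left_last, const Record& right_first) {
    return compare_records(left_last, right_first) < 0;
}

// Field values copied from "left page last record" in the log.
Record left_last_from_log() {
    return {0x0000000004fb21f9ULL, 0x000002d1ULL, 0x80000018e4c63d8dULL,
            0x800000191edf46c3ULL, 0x800000191edfc5b0ULL, 0x800000191edfc15eULL};
}

// Field values copied from "right page first record" in the log.
Record right_first_from_log() {
    return {0x0000000004fb21f9ULL, 0x000002d1ULL, 0x80000018e4c63d8dULL,
            0x8000000000000000ULL, 0x8000000000000000ULL, 0x800000191edfcfb8ULL};
}
```

Fields 0 to 2 are identical in both records; the first difference is field 3, where the left record holds `800000191edf46c3` and the right record holds `8000000000000000`, so the left record sorts after the right one.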
In the `zip_page_handler` function, we assumed that the change buffer merge can
be skipped when `access_time != 0`. This was based on the belief that
`access_time != 0` and `IBUF_BITMAP_BUFFERED != 0` cannot both hold for the same
page; in reality, they can.
When LRU eviction discards an uncompressed frame, the page briefly disappears
from `page_hash`. During this window, another thread uses change buffering to
write an `ibuf` entry for the page. The "compressed-only" descriptor is then
re-linked with a non-zero `access_time`, so a later decompression skips the
`ibuf` merge because `access_time != 0`. The pending `ibuf` entry is therefore
applied only after the page boundary has shifted, which violates the record
order across pages.
So, the chronological order of events is:
1. Thread B (eviction) enters buf_LRU_block_remove_hashed holding LRU_list_mutex +
hash_lock.
2. Thread B removes the page via HASH_DELETE.
3. Thread B calls rw_lock_x_unlock(hash_lock) ← window opens
(still holding LRU_list_mutex).
4. Thread A (DML) at this point goes through btr0cur and calls buf_page_get_gen
(BUF_GET_IF_IN_POOL); holding only hash_lock, it finds nothing.
5. Thread A enters ibuf_insert → buf_page_get_also_watch: holding only hash_lock,
finds nothing → passes.
6. Thread A enters ibuf_insert_low → buf_page_peek: holding only hash_lock,
finds nothing → passes.
7. Thread A sets BUFFERED=1 and writes the entry into the ibuf.
8. Thread B re-acquires hash_lock in buf_LRU_free_page and HASH_INSERTs b (the
compressed descriptor carrying the stale access_time) back into the page hash.
Afterwards, the page may split, after which the page_id that the ibuf entry was
buffered against is no longer the right location for that record.
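The interleaving above can be replayed deterministically in a toy model. This is only a sketch under simplifying assumptions: there are no real threads or locks, `ToyPool`, `ZipDescriptor`, and all function names are hypothetical, and the page number and `access_time` value are arbitrary. It shows only that the eight steps leave a descriptor with a non-zero `access_time` next to a set BUFFERED bit.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical compressed-only descriptor; only access_time is modelled.
struct ZipDescriptor {
    std::uint64_t access_time;
};

// Hypothetical toy buffer pool: a page hash plus one BUFFERED bit per page.
struct ToyPool {
    std::unordered_map<std::uint32_t, ZipDescriptor> page_hash;
    std::unordered_map<std::uint32_t, bool> ibuf_buffered;
};

const std::uint32_t kPage = 42;  // arbitrary page number for illustration

// Steps 1-3: eviction removes the page from the hash and opens the window.
ZipDescriptor evict_begin(ToyPool& pool, std::uint32_t page_no) {
    ZipDescriptor b = pool.page_hash.at(page_no);
    pool.page_hash.erase(page_no);
    return b;
}

// Steps 4-7: DML buffers the change only if the page is absent from the hash.
bool try_buffer_change(ToyPool& pool, std::uint32_t page_no) {
    if (pool.page_hash.count(page_no) != 0) {
        return false;  // page is in the pool, so no change buffering
    }
    pool.ibuf_buffered[page_no] = true;
    return true;
}

// Step 8: eviction re-inserts the descriptor with its old access_time.
void evict_finish(ToyPool& pool, std::uint32_t page_no, ZipDescriptor b) {
    pool.page_hash.emplace(page_no, b);
}

// Models the gate under discussion: the merge is skipped when access_time != 0.
bool merge_skipped_on_decompress(const ToyPool& pool, std::uint32_t page_no) {
    const ZipDescriptor& d = pool.page_hash.at(page_no);
    auto it = pool.ibuf_buffered.find(page_no);
    const bool buffered = it != pool.ibuf_buffered.end() && it->second;
    return buffered && d.access_time != 0;
}

// Replays steps 1-8 in order; true means a pending entry would be skipped.
bool replay_race() {
    ToyPool pool;
    pool.page_hash.emplace(kPage, ZipDescriptor{12345});
    ZipDescriptor b = evict_begin(pool, kPage);
    const bool buffered = try_buffer_change(pool, kPage);
    evict_finish(pool, kPage, b);
    return buffered && merge_skipped_on_decompress(pool, kPage);
}

// Control case: with the page still hashed, the change is not buffered.
bool buffered_without_window() {
    ToyPool pool;
    pool.page_hash.emplace(kPage, ZipDescriptor{12345});
    return try_buffer_change(pool, kPage);
}
```

In this model `replay_race()` returns true and `buffered_without_window()` returns false, which matches the claim that the problem depends on the window between HASH_DELETE and HASH_INSERT.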
The underlying defect is that access_time == 0 is overloaded to mean "this
incarnation still needs an ibuf merge," but access_time is freely overwritten by
LRU/read-ahead/zip-access bookkeeping and inherited across uncompressed-frame
eviction, so it is not a reliable gate for the change-buffer merge.
How to repeat:
It is hard to reproduce; analyzing the code is more practical.
Suggested fix:
1. Set access_time = 0 when freeing a compressed page.
2. Call ibuf_merge_or_delete_for_page when IBUF_BITMAP_BUFFERED = 1, rather than
relying on access_time.
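As a rough illustration of the two suggestions, the sketch below contrasts the two gating predicates on a toy page state. `ToyPage` and the function names are hypothetical and do not correspond to InnoDB code; the real fix would read the bitmap bit from the ibuf bitmap page and call ibuf_merge_or_delete_for_page.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical page state holding only the two values under discussion.
struct ToyPage {
    std::uint64_t access_time;
    bool ibuf_bitmap_buffered;  // models IBUF_BITMAP_BUFFERED = 1
};

// Gate described in this report: merge only when access_time == 0.
bool should_merge_by_access_time(const ToyPage& p) {
    return p.access_time == 0;
}

// Suggestion 2: decide from the BUFFERED bit, independent of access_time.
bool should_merge_by_bitmap(const ToyPage& p) {
    return p.ibuf_bitmap_buffered;
}

// Suggestion 1: clear access_time when the compressed page is freed.
void reset_access_time_on_free(ToyPage& p) {
    p.access_time = 0;
}
```

For a page with a stale non-zero access_time and a pending entry, the first predicate skips the merge while the second requests it; after the reset, the first predicate also requests it.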