Bug #116647 Bufferpool Index Stats are not decremented on btr_page_free
Submitted: 13 Nov 2024 19:28 Modified: 14 Nov 2024 7:10
Reporter: Pranay Motupalli Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.4.2,9.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: BufferPool

[13 Nov 2024 19:28] Pranay Motupalli
Description:
Bufferpool Index Stats are not decremented on page free. The current code only decrements the counter when the page is being evicted from LRU. However, when a btree page is freed, it should be decremented. The current logic to increment on btr_page_create keeps incrementing when the freed page is reused and this can lead to wrong stats.

As per https://dev.mysql.com/worklog/task/?id=7170, 

There are two places that cause a page to enter the buffer pool:

    When an existent page is read from disk. I chose to plant the increment in buf_page_monitor(), called from buf_page_io_complete().

    When a new page is created (as a result from an INSERT). I chose to plant the increment in btr_page_create().

Removal of a page from the buffer pool can happen on two occasions too:

    When a page is not accessed for some time and is pushed away from the LRU by other pages. Then it is either flushed to the disk if there are modifications to it or just evicted.

    When a page is deleted (as a result from DELETE).   -------> This is not happening.

How to repeat:
mysql> drop table if exists test.t1;
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> create table test.t1(id varchar(200));
Query OK, 0 rows affected (0.01 sec)

mysql> insert into test.t1 values ("sdfasdfsdfsdafasdfasdsafsadfasdfsdafadsfs");
Query OK, 1 row affected (0.01 sec)

mysql> insert into test.t1 select * from test.t1;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 8 rows affected (0.01 sec)
Records: 8  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 16 rows affected (0.00 sec)
Records: 16  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 32 rows affected (0.00 sec)
Records: 32  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 64 rows affected (0.01 sec)
Records: 64  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 128 rows affected (0.00 sec)
Records: 128  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 256 rows affected (0.01 sec)
Records: 256  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 512 rows affected (0.01 sec)
Records: 512  Duplicates: 0  Warnings: 0

mysql> SELECT
    ->   tables.NAME AS table_name,
    ->   indexes.NAME AS index_name,
    ->   cached.N_CACHED_PAGES AS n_cached_pages
    -> FROM
    ->   INFORMATION_SCHEMA.INNODB_CACHED_INDEXES AS cached,
    ->   INFORMATION_SCHEMA.INNODB_INDEXES AS indexes,
    ->   INFORMATION_SCHEMA.INNODB_TABLES AS tables
    -> WHERE
    ->   cached.INDEX_ID = indexes.INDEX_ID
    ->   AND indexes.TABLE_ID = tables.TABLE_ID;
+------------+-----------------+----------------+
| table_name | index_name      | n_cached_pages |
+------------+-----------------+----------------+
| test/t1    | GEN_CLUST_INDEX |              7 |
+------------+-----------------+----------------+
1 row in set (0.00 sec)

mysql>   select count(*) from information_schema.innodb_buffer_page where table_name='`test`.`t1`';
+----------+
| count(*) |
+----------+
|        7 |
+----------+
1 row in set (0.01 sec)

mysql> delete from test.t1;
Query OK, 1024 rows affected (0.01 sec)

mysql>  select count(*) from information_schema.innodb_buffer_page where table_name='`test`.`t1`';
+----------+
| count(*) |
+----------+
|        7 |
+----------+
1 row in set (0.01 sec)

mysql> SELECT
    ->   tables.NAME AS table_name,
    ->   indexes.NAME AS index_name,
    ->   cached.N_CACHED_PAGES AS n_cached_pages
    -> FROM
    ->   INFORMATION_SCHEMA.INNODB_CACHED_INDEXES AS cached,
    ->   INFORMATION_SCHEMA.INNODB_INDEXES AS indexes,
    ->   INFORMATION_SCHEMA.INNODB_TABLES AS tables
    -> WHERE
    ->   cached.INDEX_ID = indexes.INDEX_ID
    ->   AND indexes.TABLE_ID = tables.TABLE_ID;
+------------+-----------------+----------------+
| table_name | index_name      | n_cached_pages |
+------------+-----------------+----------------+
| test/t1    | GEN_CLUST_INDEX |              7 |
+------------+-----------------+----------------+
1 row in set (0.01 sec)

mysql> insert into test.t1 values ("sdfasdfsdfsdafasdfasdsafsadfasdfsdafadsfs");
Query OK, 1 row affected (0.00 sec)

mysql> insert into test.t1 select * from test.t1;
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 8 rows affected (0.01 sec)
Records: 8  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 16 rows affected (0.00 sec)
Records: 16  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 32 rows affected (0.01 sec)
Records: 32  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 64 rows affected (0.00 sec)
Records: 64  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 128 rows affected (0.01 sec)
Records: 128  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 256 rows affected (0.00 sec)
Records: 256  Duplicates: 0  Warnings: 0

mysql> insert into test.t1 select * from test.t1;
Query OK, 512 rows affected (0.00 sec)
Records: 512  Duplicates: 0  Warnings: 0

mysql>  select count(*) from information_schema.innodb_buffer_page where table_name='`test`.`t1`';
+----------+
| count(*) |
+----------+
|        7 |
+----------+
1 row in set (0.02 sec)

mysql> SELECT
    ->   tables.NAME AS table_name,
    ->   indexes.NAME AS index_name,
    ->   cached.N_CACHED_PAGES AS n_cached_pages
    -> FROM
    ->   INFORMATION_SCHEMA.INNODB_CACHED_INDEXES AS cached,
    ->   INFORMATION_SCHEMA.INNODB_INDEXES AS indexes,
    ->   INFORMATION_SCHEMA.INNODB_TABLES AS tables
    -> WHERE
    ->   cached.INDEX_ID = indexes.INDEX_ID
    ->   AND indexes.TABLE_ID = tables.TABLE_ID;
+------------+-----------------+----------------+
| table_name | index_name      | n_cached_pages |
+------------+-----------------+----------------+
| test/t1    | GEN_CLUST_INDEX |             13 |
+------------+-----------------+----------------+
1 row in set (0.00 sec)

As you can see here, though the number of pages in buffer is same before and after re-insertion, the bufferpool per index stats denote as it they are doubled. As the current behavior increments on btr_create_page ( which can be called while reusing the existing freed page ) and not decrementing the counter during btr_free_page.

Suggested fix:
Author: Pranay Motupalli <pranay@Pranays-Mac-mini.local>
Date:   Wed Nov 13 11:05:23 2024 -0800

    Decrement Bufferpool Per Index Stats during btr_free_page

diff --git a/storage/innobase/btr/btr0btr.cc b/storage/innobase/btr/btr0btr.cc
index 061a6f11fd2..eb1fe67f21d 100644
--- a/storage/innobase/btr/btr0btr.cc
+++ b/storage/innobase/btr/btr0btr.cc
@@ -607,6 +607,10 @@ void btr_page_free(dict_index_t *index, /*!< in: index tree */
   ut_ad(fil_page_index_page_check(block->frame));
   ut_ad(level != ULINT_UNDEFINED);
   btr_page_free_low(index, block, level, mtr);
+
+  if (level == 0) {
+    buf_stat_per_index->dec(index_id_t(index->space, index->id));
+  }
 }

 /** Sets the child node file address in a node pointer. */
[14 Nov 2024 7:10] MySQL Verification Team
Hello Pranay,

Thank you for the report and feedback.

regards,
Umesh
[25 Apr 2025 16:55] Morgan Tocker
Cross-linking to https://bugs.mysql.com/bug.php?id=112075 where this was previously reported and closed as a docs bug.

(It's not a docs bug. These statistics are used by the optimizer to compute costs).
[12 Jun 12:43] Jakub Lopuszanski
Posted by developer:
 
I believe this is the same bug as previously reported as
[Bug#30164033: N_CACHED_PAGES IN INFORMATION_SCHEMA.INNODB_CACHED_INDEXES ONLY INCREASES](https://mybug.mysql.oraclecorp.com/orabugs/site/bug.php?id=30164033)
and
[Bug#35718252: innodb_cached_indexes.n_cached_pages reports crazy high numbers](https://mybug.mysql.oraclecorp.com/orabugs/site/bug.php?id=35718252)

The problem seems real, and has to do with the fact that there are several different concepts we often refer to as "freeing a page", and they got mixed up in such a way, that a page might get freed in the sense of being removed from B-tree index, but still remain in the BP's hashtable addressed by its page_id, i.e. not added to the BP's free list. Then it could be added back to the B-tree, by btr_page_create(), which would inc() the stats again(), leading to double-counting it. The dec()s happen only when a page buffer is moved to the BP's free list.

The intent from the code seems to be to properly track the number of pages from a given index which are currently in the BP's hashmap, which is in line with the old documentation, it's just that this edge case of reusing a page was missed.
I'll be fixing it using Bug#30164033 which is the oldest bug report for this issue which I could find.