MySQL Bugs: #90587: InnoDB: Assertion failure in thread 47255404480256 in file buf0lru.cc line 2213

Bug #90587	InnoDB: Assertion failure in thread 47255404480256 in file buf0lru.cc line 2213
Submitted:	24 Apr 2018 2:37	Modified:	10 Feb 2020 13:05
Reporter:	raolh rao	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S3 (Non-critical)
Version:	5.7.21	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
When server crash happened, functions call stack is like this:

/u01/mysql/bin/mysqld[0x7733bd]
/u01/mysql/bin/mysqld[0x113c302]
/u01/mysql/bin/mysqld(_Z21buf_LRU_free_one_pageP10buf_page_t+0x17b)[0x113ebab]
/u01/mysql/bin/mysqld[0x1145576]
/u01/mysql/bin/mysqld(_Z13buf_read_pageRK9page_id_tRK11page_size_tP5trx_t+0x2d3)[0x11476c3]
/u01/mysql/bin/mysqld(_Z16buf_page_get_genRK9page_id_tRK11page_size_tmP11buf_block_tmPKcmP5mtr_tb+0x3fb)[0x111997b]
/u01/mysql/bin/mysqld[0x10ff8ec]
/u01/mysql/bin/mysqld(_Z32btr_copy_externally_stored_fieldPmPKhRK11page_size_tmP16mem_block_info_t+0x197)[0x10ffe47]
/u01/mysql/bin/mysqld[0x10509a2]
/u01/mysql/bin/mysqld[0x1053ba8]
/u01/mysql/bin/mysqld(_Z15row_search_mvccPh15page_cur_mode_tP14row_prebuilt_tmm+0x1d2b)[0x105674b]
/u01/mysql/bin/mysqld(_ZN11ha_innobase10index_readEPhPKhj16ha_rkey_function+0x29f)[0xf5302f]
/u01/mysql/bin/mysqld(_ZN11ha_innobase11index_firstEPh+0x30)[0xf44d60]
/u01/mysql/bin/mysqld(_ZN7handler14ha_index_firstEPh+0x164)[0x7f6264]
/u01/mysql/bin/mysqld(_ZN7handler16read_range_firstEPK12st_key_rangeS2_bb+0xbc)[0x7fa4ac]
/u01/mysql/bin/mysqld(_ZN7handler21multi_range_read_nextEPPc+0x99)[0x7efb19]
/u01/mysql/bin/mysqld(_ZN18QUICK_RANGE_SELECT8get_nextEv+0x4b)[0xdc216b]
/u01/mysql/bin/mysqld[0xbfa26a]
/u01/mysql/bin/mysqld(_ZN14Sql_cmd_delete12mysql_deleteEP3THDy+0x1065)[0xdfb895]
/u01/mysql/bin/mysqld(_ZN14Sql_cmd_delete7executeEP3THD+0xbf)[0xdfc03f]
/u01/mysql/bin/mysqld(_Z21mysql_execute_commandP3THDb+0xa71)[0xc8a9d1]
/u01/mysql/bin/mysqld(_Z11mysql_parseP3THDP12Parser_state+0x395)[0xc907c5]
/u01/mysql/bin/mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0x92a)[0xc911aa]
/u01/mysql/bin/mysqld(_Z10do_commandP3THD+0x177)[0xc92ad7]
/u01/mysql/bin/mysqld(_Z26threadpool_process_requestP3THD+0xcf)[0xd3525f]
/u01/mysql/bin/mysqld[0xd4627e]
/u01/mysql/bin/mysqld(pfs_spawn_thread+0x1b1)[0xed5581]
/lib64/libpthread.so.0(+0x7dc5)[0x2af8fdcbadc5]
/lib64/libc.so.6(clone+0x6d)[0x2af8ff021ced]

which means that when trying to  flush a page out of buffer pool, the page is still accessed by a thread (buf_fix_count > 0).

I think that is a race condition on  bpage->buf_fix_count，in this case one functions call stack is like this:
buf_read_page_handle_error
  -> buf_LRU_free_one_page
    -> buf_LRU_block_remove_hashed

“buf_read_page_handle_error” first assert condition  "ut_ad(bpage->buf_fix_count == 0)," , then "buf_LRU_block_remove_hashed" assert again "ut_a(bpage->buf_fix_count == 0);",  but I can't find out  any lock protection can stop other thread to access the same page.

e.g. during buf_read_page_handle_error's assert,  it holds bufferpool mutex and page mutex,  but it release page mutex before enter function buf_LRU_free_one_page.

while other thread could be  in this functions call stack
buf_page_get_gen
  -> buf_pool_watch_set
    -> buf_block_fix

when calling buf_pool_watch_set in buf_page_get_gen, it hold hash mutex and page mutex, but don't  hold bufferpool lock.

So,  once a thread in buf_read_page_handle_error release page mutex before entering buf_LRU_free_one_page (only hold buffer pool mutex, and no hash lock and page lock),
 another thread could access same page  in buf_page_get_gen->buf_pool_watch_set ( hold hash mutex and page mutex,  and no buffer pool mutex), and increase page's buf_fix_count ,  which leading assert fail in buf_LRU_block_remove_hashed's "ut_a(bpage->buf_fix_count == 0);" of other thread.

That is one case I can figure out, maybe there are some other cases. I find page's buf_fix_count is changed atomic, but I wonder what can promise that when a page is flushed out from buffer pool, it cann't be accessing by other thread at the same time.

How to repeat:
Hard to repeat

Suggested fix:
No

Hi,

Thank you for your bug report.

We do have several additional questions for you ....

* This looks more like an assert then a simple crash. Can you check it out

* Could you try to repeat this with debug binary as we are missing lot's of data ??

* Do you use data encryption for InnoDB tables ??

* Do you use compression for the same table ??

* Could you be more specific as to how could that race condition occur ???

* This could be simple glitch in RAM, since thread number is likely impossible to reach. What was the uptime that would make server reach such a huge number ...

* Are you using our binaries ???

* Could you locate the DELETE that caused this ???

* If you could repeat this with debug binary and core dumping enabled, we could find out how could this occur .......

Many thanks in advance ......

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

I hit the same issue on MySQL 8.0.18. This is the commit:

commit 91a17cedb1ee880fe7915fb14cfd74c04e8d6588
Author: mysql-builder@oracle.com <>
Date:   Fri Sep 20 10:09:56 2019 +0200

The assert ut_a(bpage->buf_fix_count == 0) fails:

1983 static bool buf_LRU_block_remove_hashed(buf_page_t *bpage, bool zip,
1984                                         bool ignore_content) {
1985   const buf_page_t *hashed_bpage;
1986   buf_pool_t *buf_pool = buf_pool_from_bpage(bpage);
1987   rw_lock_t *hash_lock;
1988 
1989   ut_ad(mutex_own(&buf_pool->LRU_list_mutex));
1990   ut_ad(mutex_own(buf_page_get_mutex(bpage)));
1991 
1992   hash_lock = buf_page_hash_lock_get(buf_pool, bpage->id);
1993 
1994   ut_ad(rw_lock_own(hash_lock, RW_LOCK_X));
1995 
1996   ut_a(buf_page_get_io_fix(bpage) == BUF_IO_NONE);
1997   ut_a(bpage->buf_fix_count == 0);

The code path is the following:

buf_read_page_low->buf_page_io_complete->buf_read_page_handle_error->buf_LRU_free_one_page->buf_LRU_block_remove_hashed.

The assert fails because there is another thread wants to read the same page and put a fix_count on the page

3453 dberr_t Buf_fetch_normal::get(buf_block_t *&block) {
3454   /* Keep this path as simple as possible. */
3455   for (;;) {
3456     /* Lookup the page in the page hash. If it doesn't exist in the
3457     buffer pool then try and read it in from disk. */
3458 
3459     block = lookup();
3460 
3461     if (block != nullptr) {
3462       buf_block_fix(block);
3463 
3464       /* Now safe to release page_hash S lock. */
3465       rw_lock_s_unlock(m_hash_lock);
3466       break;
3467     }
3468 
3469     /* Page not in buf_pool: needs to be read from file */
3470     read_page();
3471   }

How to repeat:

Short of encouting an actual page corruption, you will need to use debugger to force the code into buf_page_io_complete->buf_read_page_handle_error

When thread A finishes reading a page and enters buf_page_io_complete(), use debugger to force "is_corrupted" to true. Suspend thread A

5182     /* From version 3.23.38 up we store the page checksum
5183     to the 4 first bytes of the page end lsn field */
5184     bool is_corrupted;
5185     {
5186       BlockReporter reporter =
5187           BlockReporter(true, frame, bpage->size,
5188                         fsp_is_checksum_disabled(bpage->id.space()));
5189       is_corrupted = reporter.is_corrupted();
5190     }
5191 
5192     if (compressed_page || is_corrupted) {

Let thread B read the same page, and suspect thread B after it adds the fix_count in Buf_fetch_normal::get()

Let thread A continue, it will hit the assert failure.

We can not accept changing values of variables in the runtime through debugger.

That way you can prove anything.

If you don't want to force the value in debugger, then you can instead try getting into buf_page_io_complete->buf_read_page_handle_error() code path using other methods. For example, intentionly corrupt the page on disk.

Hi,

Still no good enough. Intentionally corrupting the page would only prove that you can force MySQL server to assert if you edit the file in hex editor or similar.

We need a fully repeatable test case, which entails a set of SQL statements that always lead to the reported behaviour.

if you encounter this assertion,  can anybody remember if they had large-pages enabled for MySQL ?