Bug #116305 | row_search_mvcc process wrong page of small buffer pool | ||
---|---|---|---|
Submitted: | 6 Oct 11:05 | Modified: | 9 Oct 13:41 |
Reporter: | zkong kong | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
Version: | 9.0 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
[6 Oct 11:05]
zkong kong
[7 Oct 8:44]
zhai weixiang
Bug#116303 is a duplicate of this one.
[7 Oct 9:05]
zhai weixiang
The scenario description can be found in bug#116303. This report has a minor mistake: the page id comparison should be done before resetting the page id to UINX_MAX and adding the block to the free list.
[7 Oct 9:51]
MySQL Verification Team
Hi Mr. kong, Thank you very much for your bug report. Before we proceed, we need to clear up some issues regarding this bug report. First of all, does it relate to a single running instance of the MySQL Server or to several connected instances? Second, we accept bug reports, but we test them on our official binaries, which are built from our official, unchanged source code. Hence, we have to know whether we can repeat the problem with our official binaries. Thank you very much in advance.
[7 Oct 11:38]
zkong kong
Hi: First, it is a single instance. Second, without the patch it is difficult to reproduce the problem reliably. Our stress test hits it about once every several days.
[7 Oct 11:50]
MySQL Verification Team
Hi Mr. kong, zhai, We have received a request from Development regarding this bug report. They would like you to use the DBUG_SYNC facility to develop an MTR test that demonstrates the problem. That would make the problem easier to reason about and easier for us to fix. Thank you very much in advance.
[7 Oct 11:52]
MySQL Verification Team
Hi All, Also, as soon as we have tested that MTR test case, we shall verify the bug. Thanks in advance.
[7 Oct 12:07]
MySQL Verification Team
Hi, Here is an example on how to use DEBUG_SYNC:

    --- a/storage/innobase/ddl/ddl0par-scan.cc
    +++ b/storage/innobase/ddl/ddl0par-scan.cc
    @@ -237,6 +237,7 @@ dberr_t Parallel_cursor::scan(Builders &builders) noexcept {
                 thread_ctx->get_state() != Parallel_reader::State::THREAD) {
               thread_ctx->savepoint();
               latches_released = true;
    +          DEBUG_SYNC_C("ddl_bulk_inserter_latches_released");
             }
             return DB_SUCCESS;
           });

Build debug version of server and run the following test case for MTR framework:

    CREATE TABLE t1 (pk CHAR(5) PRIMARY KEY);
    INSERT INTO t1 VALUES ('aaaaa'), ('bbbbb'), ('bbbcc'), ('ccccc'), ('ddddd'), ('eeeee');
    set global innodb_purge_stop_now=ON;
    DELETE FROM t1 WHERE pk = 'bbbcc';
    SET DEBUG='+d,ddl_buf_add_two';
    SET DEBUG_SYNC='ddl_bulk_inserter_latches_released SIGNAL latches_released WAIT_FOR go';
    --send ALTER TABLE t1 ENGINE=InnoDB, ALGORITHM=INPLACE
    --connection default
    SET DEBUG_SYNC='now WAIT_FOR latches_released';
    SET GLOBAL innodb_purge_run_now=ON;
    --source include/wait_innodb_all_purged.inc
    SET DEBUG_SYNC='now SIGNAL go';
    --connection con1
    --reap
    SET DEBUG='-d,ddl_buf_add_two';
[7 Oct 12:09]
MySQL Verification Team
Hi, Here is another example:
```diff
--- a/sql/sql_table.cc
+++ b/sql/sql_table.cc
@@ -12791,6 +12791,8 @@ static bool mysql_inplace_alter_table(
   DEBUG_SYNC(thd, "alter_table_inplace_after_lock_downgrade");
   THD_STAGE_INFO(thd, stage_alter_inplace);
 
+  DBUG_EXECUTE_IF("alter_table_inplace_after_lock_sleep", sleep(100));
+
   if (table->file->ha_inplace_alter_table(altered_table, ha_alter_info,
                                           table_def, altered_table_def)) {
     goto rollback;
```
Step 2:
```sql
set global innodb_print_ddl_logs = ON;
```
We hope that this will be helpful.
[7 Oct 12:27]
MySQL Verification Team
Hi, You can also find many examples that use DEBUG_SYNC in the mysql-test/t/ directory. Our first comment had a mistake: the facility is called DEBUG_SYNC, not DBUG_SYNC. There are also other macros, such as DBUG_EXECUTE_IF(), that you can find there.
[7 Oct 14:55]
zkong kong
The patch and the MTR test.
Attachment: bug116305.tar.gz (application/x-gzip, text), 1.25 KiB.
[7 Oct 15:43]
MySQL Verification Team
Hi Mr. kong, Sorry, but we fail to see any DBUG* or DEBUG* commands in the test case in your attached file. That is what we explicitly asked for. Many thanks in advance.
[7 Oct 16:05]
MySQL Verification Team
Hi Mr. kong, Can you also send us a result file? Our test simply always fails.
[7 Oct 23:20]
zkong kong
Hi: Without the fix, the MTR test always fails at the assert added in Block_hint::buffer_fix_block_if_still_valid: ut_ad(m_page_id == m_block->page.id); The pcur expects to restore the block that was stored, so the assert above must hold.
[8 Oct 11:13]
zkong kong
Hi: Using zhai weixiang's patch resolves the assert failure. Is the fix OK? Please review it, thanks very much! The attachment above contains the patch and the result file of the MTR test.
[9 Oct 9:52]
MySQL Verification Team
Hi Mr. kong, Your patch for the file buf0block_hint.cc contains some changes to our source code. Are those there for repeating the test case or for fixing the bug? Many thanks in advance.
[9 Oct 11:53]
zkong kong
Hi: The problem is that Block_hint may buffer-fix a block that is no longer the expected one. The patch contains both the assert used to repeat the problem and the fix. Please confirm the changes:

    diff --git a/storage/innobase/buf/buf0block_hint.cc b/storage/innobase/buf/buf0block_hint.cc
    index 4a2500b7409..2b764a6e63f 100644
    --- a/storage/innobase/buf/buf0block_hint.cc
    +++ b/storage/innobase/buf/buf0block_hint.cc
    @@ -70,13 +70,25 @@ void Block_hint::buffer_fix_block_if_still_valid() {
       if (m_block != nullptr) {
         const buf_pool_t *const pool = buf_pool_get(m_page_id);
         rw_lock_t *latch = buf_page_hash_lock_get(pool, m_page_id);
    +    DBUG_EXECUTE_IF("bug116305", { if (m_page_id.space() == 6 && m_page_id.page_no() == 4) sleep(10); }); // ... repeating test (see mtr)
         rw_lock_s_lock(latch, UT_LOCATION_HERE);
         /* If not own buf_pool_mutex, page_hash can be changed. */
         latch = buf_page_hash_lock_s_confirm(latch, pool, m_page_id);
    -    if (buf_is_block_in_instance(pool, m_block) &&
    -        m_page_id == m_block->page.id &&
    -        buf_block_get_state(m_block) == BUF_BLOCK_FILE_PAGE) {
    -      buf_block_buf_fix_inc(m_block, UT_LOCATION_HERE);
    +    if (buf_is_block_in_instance(pool, m_block)) {
    +      buf_block_t *ptr = m_block; // ... fix the bug: add a lock to check page_id and state atomically
    +      buf_page_mutex_enter(m_block); // ... fix the bug
    +      if (m_page_id == m_block->page.id) {
    +        DBUG_EXECUTE_IF("bug116305", { if (m_page_id.space() == 6 && m_page_id.page_no() == 4) sleep(10); }); // ... repeating test (see mtr)
    +        if (buf_block_get_state(m_block) == BUF_BLOCK_FILE_PAGE) {
    +          ut_ad(m_page_id == m_block->page.id); // ... repeating test (see mtr): processing an unexpected block here causes an error
    +          buf_block_buf_fix_inc(m_block, UT_LOCATION_HERE);
    +        } else {
    +          clear();
    +        }
    +      } else {
    +        clear();
    +      }
    +      buf_page_mutex_exit(ptr); // ... fix the bug
         } else {
           clear();
         }
    diff --git a/storage/innobase/buf/buf0lru.cc b/storage/innobase/buf/buf0lru.cc
    index 3fe3a8654e4..03aedf9a0ad 100644
    --- a/storage/innobase/buf/buf0lru.cc
    +++ b/storage/innobase/buf/buf0lru.cc
    @@ -2269,7 +2269,7 @@ static bool buf_LRU_block_remove_hashed(buf_page_t *bpage, bool zip,
           ut_ad(mutex_own(&buf_pool->LRU_list_mutex));
           rw_lock_x_unlock(hash_lock);
           mutex_exit(&((buf_block_t *)bpage)->mutex);
    -
    +      DBUG_EXECUTE_IF("bug116305", { if (bpage->id.space()==6 && bpage->id.page_no()==4) sleep(10); }); // ... repeating test (see mtr)
           if (zip && bpage->zip.data) {
             /* Free the compressed page. */
             void *data = bpage->zip.data;
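Editorial illustration: the race described in this comment can be replayed deterministically in a small, single-threaded C++ sketch. This is not InnoDB code. `Block`, `evict_and_reuse`, `fix_unlocked`, `fix_locked` and `replay` are hypothetical names, the block mutex is modelled as a plain flag, and eviction plus reuse is collapsed into one step. It only assumes what the thread states: the page id and the block state are checked in two separate steps, and another session can free and reuse the block in between.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Simplified, hypothetical stand-ins for the buffer pool structures.
enum class BlockState { NotUsed, FilePage };

struct Block {
  std::uint32_t page_no = 0;
  BlockState state = BlockState::NotUsed;
  int fix_count = 0;
  bool mutex_held = false;  // models the block mutex in this replay
};

// The "other session": evict the page and reuse the block for another page.
// It needs the block mutex and may not evict a buffer-fixed block.
bool evict_and_reuse(Block &b, std::uint32_t new_page_no) {
  if (b.mutex_held || b.fix_count > 0) return false;
  b.page_no = new_page_no;
  b.state = BlockState::FilePage;
  return true;
}

// Unprotected check: page id first, state second. `window` replays what
// another session does between the two checks.
bool fix_unlocked(Block &b, std::uint32_t expected,
                  const std::function<void()> &window) {
  if (b.page_no != expected) return false;
  window();
  if (b.state != BlockState::FilePage) return false;
  ++b.fix_count;  // may now pin a block that holds a different page
  return true;
}

// Protected check: both comparisons happen while the block mutex is held,
// so the other session cannot change the block in the window.
bool fix_locked(Block &b, std::uint32_t expected,
                const std::function<void()> &window) {
  b.mutex_held = true;
  bool fixed = false;
  if (b.page_no == expected) {
    window();
    if (b.state == BlockState::FilePage) {
      ++b.fix_count;
      fixed = true;
    }
  }
  b.mutex_held = false;
  return fixed;
}

// Returns the page number that ended up buffer-fixed, or 0 if none.
std::uint32_t replay(bool use_lock) {
  Block b;
  b.page_no = 4;
  b.state = BlockState::FilePage;
  auto other_session = [&b] { evict_and_reuse(b, 9); };
  const bool fixed = use_lock ? fix_locked(b, 4, other_session)
                              : fix_unlocked(b, 4, other_session);
  return fixed ? b.page_no : 0;
}
```

In this model `replay(false)` returns 9: the hint expected page 4 but pinned a block that now holds page 9, which corresponds to the situation the added `ut_ad` is meant to catch. `replay(true)` returns 4 because the eviction attempt is refused while the mutex flag is set.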
[9 Oct 12:03]
MySQL Verification Team
Hi Mr. kong, We are slightly puzzled here. You claim that the file bug116305.tar.gz now contains the code for the test case only. However, you also claim that it contains a fix for the bug. Can you send us just the patch that is necessary for the test? Thanks in advance.
[9 Oct 12:15]
zkong kong
Hi: With the assert and without the fix, the server crashes reliably when the MTR test is run. Here is the test code only; please double-check it again, thanks very much!

    diff --git a/storage/innobase/buf/buf0block_hint.cc b/storage/innobase/buf/buf0block_hint.cc
    index 4a2500b7409..66e22a9dd01 100644
    --- a/storage/innobase/buf/buf0block_hint.cc
    +++ b/storage/innobase/buf/buf0block_hint.cc
    @@ -70,13 +70,19 @@ void Block_hint::buffer_fix_block_if_still_valid() {
       if (m_block != nullptr) {
         const buf_pool_t *const pool = buf_pool_get(m_page_id);
         rw_lock_t *latch = buf_page_hash_lock_get(pool, m_page_id);
    +    DBUG_EXECUTE_IF("bug116305", { if (m_page_id.space() == 6 && m_page_id.page_no() == 4) sleep(10); }); // wait for the other session to free the hint block
         rw_lock_s_lock(latch, UT_LOCATION_HERE);
         /* If not own buf_pool_mutex, page_hash can be changed. */
         latch = buf_page_hash_lock_s_confirm(latch, pool, m_page_id);
         if (buf_is_block_in_instance(pool, m_block) &&
    -        m_page_id == m_block->page.id &&
    -        buf_block_get_state(m_block) == BUF_BLOCK_FILE_PAGE) {
    -      buf_block_buf_fix_inc(m_block, UT_LOCATION_HERE);
    +        m_page_id == m_block->page.id) {
    +      DBUG_EXECUTE_IF("bug116305", { if (m_page_id.space() == 6 && m_page_id.page_no() == 4) sleep(10); }); // if page_id and state are not checked atomically, the page of the block may be freed
    +      if (buf_block_get_state(m_block) == BUF_BLOCK_FILE_PAGE) {
    +        ut_ad(m_page_id == m_block->page.id); // with the mtr case it must fail here and crash
    +        buf_block_buf_fix_inc(m_block, UT_LOCATION_HERE);
    +      } else {
    +        clear();
    +      }
         } else {
           clear();
         }
    diff --git a/storage/innobase/buf/buf0lru.cc b/storage/innobase/buf/buf0lru.cc
    index 3fe3a8654e4..03aedf9a0ad 100644
    --- a/storage/innobase/buf/buf0lru.cc
    +++ b/storage/innobase/buf/buf0lru.cc
    @@ -2269,7 +2269,7 @@ static bool buf_LRU_block_remove_hashed(buf_page_t *bpage, bool zip,
           ut_ad(mutex_own(&buf_pool->LRU_list_mutex));
           rw_lock_x_unlock(hash_lock);
           mutex_exit(&((buf_block_t *)bpage)->mutex);
    -
    +      DBUG_EXECUTE_IF("bug116305", { if (bpage->id.space()==6 && bpage->id.page_no()==4) sleep(10); }); // free the block_hint block so that its check logic fails
           if (zip && bpage->zip.data) {
             /* Free the compressed page. */
             void *data = bpage->zip.data;
[9 Oct 13:41]
MySQL Verification Team
Hi Mr. kong, Thank you for your bug report. After applying your patch, building a debug 9.0.1 binary and running a test, we have repeated the problem. This is now a verified bug report for version 9.0. Thanks again.