| Bug #120105 | mysql crash on fil_report_invalid_page_access_low | ||
|---|---|---|---|
| Submitted: | 19 Mar 7:42 | Modified: | 10 Apr 12:23 |
| Reporter: | yangyang wang | Email Updates: | |
| Status: | Can't repeat | Impact on me: | |
| Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
| Version: | 8.0.22 | OS: | Any |
| Assigned to: | | CPU Architecture: | Any |
[19 Mar 7:44]
yangyang wang
The data is not damaged, and I can query the data normally after the restart.
[19 Mar 7:46]
yangyang wang
Strangely, I found that the space accessed by the worker thread is not the same as the table accessed by the leader thread, and the SQL in the leader thread does not seem to need to go through the PQ logic (currently, PQ only supports count(*) and check table?).
m_leader_thd = 0x7f56fd48d800}
(gdb) p (THD*) 0x7f56fd48d800
$28 = (THD *) 0x7f56fd48d800
(gdb) p ((THD*)0x7f56fd48d800)->m_query_string
$29 = {str = 0x7f570ab9a030 "ANALYZE TABLE t33 UPDATE HISTOGRAM ON c12", length = 41}
(gdb) p page_id
$5 = (const page_id_t &) @0x7f5536fd7a68: {m_space = 74768, m_page_no = 1768842857}
(gdb) p *space
$12 = {m_last_extended = {m_start = {__d = {__r = 32791027984465177}}}, m_undo_extend = 1024, name = 0x7f570bc9c460 "db21146ye_6038_180/t33", id = 74768,
[19 Mar 8:26]
yangyang wang
Sorry, the space is the same, and the statement being executed is 'ANALYZE TABLE t33 UPDATE HISTOGRAM ON c12'. During the read, the page number is out of range.
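One way to look at the out-of-range page number is to split it into its four bytes. The following is a minimal sketch (the helper name decode_page_no is made up for this illustration). 1768842857 is 0x696e6669, and those bytes are the ASCII characters "infi". InnoDB index pages contain the literal string "infimum" in their infimum record, so this may hint that the page number was taken from record bytes rather than from a valid child pointer, but that is only a guess and is not confirmed anywhere in this report.

```bash
#!/bin/bash
# Hypothetical helper: print the four big-endian bytes of a 32-bit page
# number as hex and as ASCII characters.
decode_page_no() (
  n=$1
  ascii=""
  for s in 24 16 8 0; do
    byte=$(( (n >> s) & 255 ))
    # Emit the byte through an octal escape, which printf understands.
    ascii="$ascii$(printf "\\$(printf '%03o' "$byte")")"
  done
  printf '0x%08x %s\n' "$n" "$ascii"
)

decode_page_no 1768842857   # prints: 0x696e6669 infi
```

The same page number appears in every reported crash, which is at least consistent with a fixed byte pattern being misread rather than random corruption.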
[20 Mar 6:52]
Mayank Prasad
Hi yangyang wang, Thanks for filing the bug. However, "use ddlcheck tool and restart mysqld random" are not very clear reproduction steps. Is there some other tool you are using? Could you please provide more specific reproduction steps.
[20 Mar 7:18]
yangyang wang
We also used other tools, such as pstress. Currently, this is the only tool where this issue has been encountered, and not just once, but multiple times when the PQ scenario is enabled.
mysqld.err.2:2026-03-15T10:57:23.758574+08:00 0 139736539002624 [&Parallel_reade] [ERROR] [MY-012153] [InnoDB] Trying to access page number 1768842857 in space 207060, space name db21146ye_228_72/t3, which is outside the tablespace bounds. Byte offset 0, len 16384, i/o type read. If you get this error at mysqld startup, please check that your my.cnf matches the ibdata files that you have in the MySQL server.
mysqld.err.3:2026-03-15T10:45:50.526516+08:00 0 140205386364672 [&Parallel_reade] [ERROR] [MY-012153] [InnoDB] Trying to access page number 1768842857 in space 178665, space name db21146ye_8081_334/t52, which is outside the tablespace bounds. Byte offset 0, len 16384, i/o type read. If you get this error at mysqld startup, please check that your my.cnf matches the ibdata files that you have in the MySQL server.
mysqld.err.5:2026-03-15T09:49:36.189251+08:00 0 139924249368320 [&Parallel_reade] [ERROR] [MY-012153] [InnoDB] Trying to access page number 1768842857 in space 62312, space name db21146ye_6038_7/t24, which is outside the tablespace bounds. Byte offset 0, len 16384, i/o type read. If you get this error at mysqld startup, please check that your my.cnf matches the ibdata files that you have in the MySQL server.
mysqld.err.5:2026-03-15T09:56:27.924381+08:00 0 140003971548928 [&Parallel_reade] [ERROR] [MY-012153] [InnoDB] Trying to access page number 1768842857 in space 74768, space name db21146ye_6038_180/t33, which is outside the tablespace bounds. Byte offset 0, len 16384, i/o type read. If you get this error at mysqld startup, please check that your my.cnf matches the ibdata files that you have in the MySQL server.
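To compare occurrences like the ones above across several error logs, the MY-012153 entries can be reduced to page number, space id and space name. This is only a sketch: the function name summarize_oob_reads is hypothetical, and it assumes one log entry per line in the format quoted above.

```bash
#!/bin/bash
# Hypothetical helper: read error-log lines on stdin and print
# "page_no space_id space_name" for every MY-012153 entry.
summarize_oob_reads() {
  sed -n 's/.*\[MY-012153].*page number \([0-9][0-9]*\) in space \([0-9][0-9]*\), space name \([^,]*\),.*/\1 \2 \3/p'
}

# Sample entry copied from the log excerpt above.
sample='2026-03-15T10:57:23.758574+08:00 0 139736539002624 [&Parallel_reade] [ERROR] [MY-012153] [InnoDB] Trying to access page number 1768842857 in space 207060, space name db21146ye_228_72/t3, which is outside the tablespace bounds. Byte offset 0, len 16384, i/o type read.'
printf '%s\n' "$sample" | summarize_oob_reads
```

With real logs, something like `cat mysqld.err* | summarize_oob_reads | sort | uniq -c` would show whether the same page number recurs across different tablespaces.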
[24 Mar 5:02]
Mayank Prasad
Hi yangyang wang, I am not aware of this ddlcheck tool. Could you please provide detailed reproduction steps for the issue? Thanks!
[25 Mar 8:20]
yangyang wang
Set innodb_buffer_pool_size to 10M and run the following script; it reproduces the crash:
#!/bin/bash
export taurus_root_path=/opt/workdir/w00574625/mysql-root
export taurus_install_path=/home/ci/install
# Names of the 10 tables (adjust as needed)
TABLES=(t3 t4 t5 t6 t7 t8 t9 t10 t1 t2)
MYSQL_DB="test"
# ==================== Function definitions ====================
# Run ANALYZE TABLE in an infinite loop
analyze_table() {
while true; do
# Pick a random table
table=${TABLES[$RANDOM % ${#TABLES[@]}]}
${taurus_install_path}/sql/bin/mysql -h 127.0.0.1 -u root -P 3306 -p123456 -D"$MYSQL_DB" \
-e "ANALYZE TABLE $table UPDATE HISTOGRAM ON c7;" > /dev/null 2>&1
# To lower the frequency, uncomment the following line
# sleep 0.1
done
}
# Run SELECT COUNT(*) in an infinite loop
select_count() {
while true; do
# Pick a random table
table=${TABLES[$RANDOM % ${#TABLES[@]}]}
${taurus_install_path}/sql/bin/mysql -h 127.0.0.1 -u root -P 3306 -p123456 -D"$MYSQL_DB" \
-e "SELECT COUNT(*) FROM $table;" > /dev/null 2>&1
# To lower the frequency, uncomment the following line
# sleep 0.1
done
}
alter_table() {
while true; do
# Pick a random table
table=${TABLES[$RANDOM % ${#TABLES[@]}]}
${taurus_install_path}/sql/bin/mysql -h 127.0.0.1 -u root -P 3306 -p123456 -D"$MYSQL_DB" \
-e "optimize table $table;" > /dev/null 2>&1
# To lower the frequency, uncomment the following line
# sleep 0.1
done
}
crud_view() {
while true; do
# Pick a random table
table=${TABLES[$RANDOM % ${#TABLES[@]}]}
${taurus_install_path}/sql/bin/mysql -h 127.0.0.1 -u root -P 3306 -p123456 -D"$MYSQL_DB" \
-e "CREATE VIEW $table_v AS SELECT c2 AS ca1, c5 AS ca2, c7 AS ca3 FROM $table WHERE ((! c7)); select * from $table_v;drop view $table_v;" > /dev/null 2>&1
# To lower the frequency, uncomment the following line
# sleep 0.1
done
}
# ==================== Cleanup function ====================
cleanup() {
echo -e "\nCaught signal, killing all background jobs..."
# Kill all child processes
kill $(jobs -p) 2>/dev/null
wait
exit
}
# ==================== Main logic ====================
# Trap exit signals
trap cleanup SIGINT SIGTERM
# Start 30 ANALYZE processes
for i in {1..30}; do
analyze_table &
done
# Start 20 SELECT processes
for i in {1..20}; do
select_count &
done
# Start 5 OPTIMIZE TABLE processes
for i in {1..5}; do
alter_table &
done
for i in {1..5}; do
crud_view &
done
echo "Started 30 ANALYZE and 20 SELECT processes. Press Ctrl+C to stop."
# Wait for all background jobs (waits forever)
wait
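A side note on crud_view in the script above: bash reads `$table_v` as a reference to a variable named `table_v`, not as `$table` followed by `_v`. The sketch below only illustrates that expansion rule with a dummy statement; it does not talk to a server.

```bash
#!/bin/bash
table=t3
unset table_v

# "$table_v" names the (unset) variable table_v, so it expands to nothing.
broken="CREATE VIEW $table_v AS SELECT 1"

# Braces end the variable name after "table", giving the intended view name.
fixed="CREATE VIEW ${table}_v AS SELECT 1"

echo "$broken"   # prints: CREATE VIEW  AS SELECT 1
echo "$fixed"    # prints: CREATE VIEW t3_v AS SELECT 1
```

Because the script redirects all client output to /dev/null, a syntax error caused by the empty view name would go unnoticed while the loop keeps running.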
[26 Mar 7:22]
yangyang wang
CREATE TABLE `t3` (
`c7` float DEFAULT NULL,
`c4` decimal(10,0) unsigned zerofill DEFAULT NULL,
`c2` blob,
`c5` smallint(5) unsigned zerofill DEFAULT NULL,
UNIQUE KEY `c7` (`c7`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci STATS_PERSISTENT=1
Prepare data:
DELIMITER //
CREATE PROCEDURE insert_t3_1m()
BEGIN
DECLARE i INT DEFAULT 1;
DECLARE max_rows INT DEFAULT 1000000; -- target row count: 1 million
DECLARE batch_size INT DEFAULT 1000; -- commit once every 1000 rows
DECLARE rand_c4 DECIMAL(10,0);
DECLARE rand_c2 BLOB;
DECLARE rand_c5 SMALLINT UNSIGNED;
SET autocommit = 0; -- disable autocommit and control transactions manually
WHILE i <= max_rows DO
-- generate random data
SET rand_c4 = FLOOR(RAND() * 10000000000); -- 0 ~ 9,999,999,999
SET rand_c2 = RANDOM_BYTES(100); -- 100 bytes of random binary data
SET rand_c5 = FLOOR(RAND() * 65536); -- 0 ~ 65,535
-- insert the row; c7 uses the loop variable i to guarantee uniqueness
INSERT INTO t3 (c7, c4, c2, c5) VALUES (i, rand_c4, rand_c2, rand_c5);
-- commit once every batch_size rows to keep transactions small
IF i % batch_size = 0 THEN
COMMIT;
END IF;
SET i = i + 1;
END WHILE;
COMMIT; -- commit any remaining uncommitted rows
SET autocommit = 1; -- restore autocommit
END //
DELIMITER ;
Create the other tables (t1 to t10, except t3) as copies of t3, for example:
create table t1 select * from t3;
[10 Apr 11:06]
Mayank Prasad
I tried it multiple times with the scripts given but wasn't able to repro it. I tried both on release and debug build but no luck.
[10 Apr 12:23]
MySQL Verification Team
Hello, I've already tried repeating this on 8.0.22 and wasn't able to after more than an hour of running. Please check whether 8.4.8 or 9.6.0 is affected by the same bug, and if it is, reopen this report. Thanks,
[17 Apr 17:07]
Jean-François Gagné
I was not able to reproduce either, but I did not run for very long (45 minutes with 66 restarts): how long does it take to get a crash ? Our test environment might also be different (MySQL settings, hardware, ...). Also, this was open for 8.0.22 (and I tested this version), but we should move tests to more recent 8.0.45 and 8.4.8.
I am doing tests on a m6id.8xlarge AWS VM (local SSDs, 128 vCPU and 64 cores), starting with default MySQL settings. In addition to changing the InnoDB Buffer Pool Size, are you changing other parameters ? Because this looks related to Parallel Reads, I tried with innodb_parallel_read_threads=16, but still no crash.
> when the PQ scenario is enabled
Are you doing anything specific to enable PQ / parallel query ? How many CPUs do you have ? What type of disks do you have (local SSDs, network SSDs, ...) ?
Below are my test scripts. I modified / fixed crud_view because "CREATE VIEW $table_v" did not work (also dropped the view in another command to be resilient of a mysqld restart leaving a view present), implemented some "stop" features (touch run), and implemented some noise reduction (touch run_stop). Note that I am getting some errors, two listed below (single-part unique index for t3, and deadlocks). Also sometimes, mysqld stop hangs and I have to kill -9 (in restart_mysqld).
---
test_jfg.t3 histogram Error The column 'c7' is covered by a single-part unique index.
test_jfg.t9 histogram Error Deadlock found when trying to get lock; try restarting transaction
---
{
v=mysql_8.0.22; d=${v//./_}
dbda="-c innodb_buffer_pool_chunk_size=1048576 -c innodb_buffer_pool_instances=1"
dbdeployer deploy single $v $dbda| pv -tN dbdepl. > /dev/null
cd ~/sandboxes/msb_$d
}
{
./use <<< "CREATE DATABASE test_jfg"
./use test_jfg <<< "
CREATE TABLE t3 (
c7 float DEFAULT NULL,
c4 decimal(10,0) unsigned zerofill DEFAULT NULL,
c2 blob,
c5 smallint(5) unsigned zerofill DEFAULT NULL,
UNIQUE KEY c7 (c7)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci STATS_PERSISTENT=1;
DELIMITER //
CREATE PROCEDURE insert_t3_1m()
BEGIN
DECLARE i INT DEFAULT 1;
DECLARE max_rows INT DEFAULT 1000000;
DECLARE batch_size INT DEFAULT 1000;
DECLARE rand_c4 DECIMAL(10,0);
DECLARE rand_c2 BLOB;
DECLARE rand_c5 SMALLINT UNSIGNED;
SET autocommit = 0;
WHILE i <= max_rows DO
SET rand_c4 = FLOOR(RAND() * 10000000000); -- 0 ~ 9,999,999,999
SET rand_c2 = RANDOM_BYTES(100);
SET rand_c5 = FLOOR(RAND() * 65536); -- 0 ~ 65,535
INSERT INTO t3 (c7, c4, c2, c5) VALUES (i, rand_c4, rand_c2, rand_c5);
IF i % batch_size = 0 THEN
COMMIT;
END IF;
SET i = i + 1;
END WHILE;
COMMIT;
SET autocommit = 1;
END //"
./use test_jfg <<< "CALL insert_t3_1m()" | pv -tN call
for t in t4 t5 t6 t7 t8 t9 t10 t1 t2; do
./use test_jfg <<< "create table $t select * from t3" | pv -tN $t
done
}
TABLES=(t3 t4 t5 t6 t7 t8 t9 t10 t1 t2)
analyze_table() {
while test -e run; do
test -e run_stop && { sleep 1; continue; }
./use test_jfg -N -e "ANALYZE TABLE ${TABLES[$RANDOM % ${#TABLES[@]}]} UPDATE HISTOGRAM ON c7"
done
}
select_count() {
while test -e run; do
test -e run_stop && { sleep 1; continue; }
./use test_jfg -N -e "SELECT COUNT(*) FROM ${TABLES[$RANDOM % ${#TABLES[@]}]}"
done
}
alter_table() {
while test -e run; do
test -e run_stop && { sleep 1; continue; }
./use test_jfg -N -e "optimize table ${TABLES[$RANDOM % ${#TABLES[@]}]}"
done
}
crud_view() {
local view=view_$1
while test -e run; do
test -e run_stop && { sleep 1; continue; }
local table=${TABLES[$RANDOM % ${#TABLES[@]}]}
local sql="CREATE VIEW $view AS SELECT c2 AS ca1, c5 AS ca2, c7 AS ca3 FROM $table WHERE ((! c7));"
./use test_jfg -N -e "$sql select * from $view;"
./use test_jfg -N -e "drop view $view;"
done
}
restart_mysqld() {
local i=1 pid
while test -e run; do
sleep $(($RANDOM % 60)); test -e run || break
touch run_stop; pid=$(cat data/*.pid); pkill -9 mysqld_safe; kill $pid
{ for j in {0..9}; do sleep 1; ps -p $pid > /dev/null || break; done
ps -p $pid > /dev/null && kill -9 $pid
while ps -p $pid > /dev/null; do sleep 0.1; done
} | pv -tN stop$i
./start; rm run_stop
i=$(($i+1))
done
}
run_test() {
touch run; rm run_stop
for i in {1..30}; do analyze_table & done
for i in {1..20}; do select_count & done
for i in {1..5}; do alter_table & done
for i in {1..5}; do crud_view $i & done
restart_mysqld &
}
./use <<< "set persist innodb_buffer_pool_size = 1024*1024*10"
run_test | pv -tN run_test
./use <<< "set persist innodb_parallel_read_threads = 16"
run_test | pv -tN run_test
[17 Apr 17:11]
Jean-François Gagné
> I am doing tests on a m6id.8xlarge AWS VM (local SSDs, 128 vCPU and 64 cores)
Sorry, the above is wrong: it is an m6id.32xlarge; the other specs are correct. Note to self: this instance costs 7.5936 USD per hour (which is why I did not leave the tests running for very long).

Description:
mysql crash in Parallel_reader:
2026-03-15T10:57:23.739696+08:00 0 139736555788032 [&Parallel_reade] [System] [MY-011825] [InnoDB] &Parallel_reader::worker start.
2026-03-15T10:57:23.745280+08:00 0 139736555788032 [&Parallel_reade] [System] [MY-011825] [InnoDB] &Parallel_reader::worker start.
2026-03-15T10:57:23.752519+08:00 0 139736555788032 [&Parallel_reade] [System] [MY-011825] [InnoDB] &Parallel_reader::worker start.
2026-03-15T10:57:23.755435+08:00 0 139736539002624 [&Parallel_reade] [System] [MY-011825] [InnoDB] &Parallel_reader::worker start.
2026-03-15T10:57:23.758574+08:00 0 139736539002624 [&Parallel_reade] [ERROR] [MY-012153] [InnoDB] Trying to access page number 1768842857 in space 207060, space name db21146ye_228_72/t3, which is outside the tablespace bounds. Byte offset 0, len 16384, i/o type read. If you get this error at mysqld startup, please check that your my.cnf matches the ibdata files that you have in the MySQL server.
2026-03-15T10:57:23.758604+08:00 0 139736539002624 [&Parallel_reade] [ERROR] [MY-012154] [InnoDB] Server exits.
2026-03-15T10:57:23.758623+08:00 0 139736539002624 [&Parallel_reade] [ERROR] [MY-013183] [InnoDB] Assertion failure: fil0fil.cc:8437 thread 139736539002624
the stack is:
(gdb) bt
#0 0x00007f5c6c6f4a01 in pthread_kill () from /lib64/libpthread.so.0
#1 0x000000000140f2db in my_write_core (sig=6) at ../../../include/my_thread.h:91
#2 handle_fatal_signal (sig=6) at ../../sql/signal_handler.cc:171
#3 handle_fatal_signal (sig=6) at ../../sql/signal_handler.cc:75
#4 <signal handler called>
#5 0x00007f5c69dd64e7 in raise () from /lib64/libc.so.6
#6 0x00007f5c69dd7bd8 in abort () from /lib64/libc.so.6
#7 0x000000000251a056 in ut_dbg_assertion_failed (expr=0x0, file=0x3b71090 "../../../storage/innobase/fil/fil0fil.cc", line=8437) at ../../../storage/innobase/ut/ut0dbg.cc:127
#8 0x0000000003109684 in fil_report_invalid_page_access_low(unsigned int, unsigned int, char const*, unsigned long, unsigned long, bool, int) [clone .constprop.0] (block_offset=1768842857, space_id=74768, space_name=0x7f570bc9c460 "db21146ye_6038_180/t33", byte_offset=0, len=16384, is_read=<optimized out>, line=<optimized out>) at ../../../storage/innobase/fil/fil0fil.cc:8437
#9 0x0000000006ce05db in Fil_shard::prepare_fil_space(IORequest const&, bool, page_id_t const&, buf_page_t*, sal_read_out_t*, unsigned long, unsigned long, fil_space_t*&, fil_node_t*&) [clone .cold.0] () at ../../../storage/innobase/fil/fil0fil.cc:8681
#10 0x00000000066c2bc9 in Fil_shard::do_io (this=0x7f5c4d1fd920, type=..., sync=<optimized out>, page_id=..., page_size=..., byte_offset=byte_offset@entry=0, len=16384, buf=0x7f5c0a5dc000, buf_page=0x7f5c07253a00, page_context=0x7f5536fd7430, ndp=0x0) at ../../../storage/innobase/fil/fil0fil.cc:8954
#11 0x0000000006669033 in fil_io (ndp=0x0, page_context=0x7f5536fd7430, message=0x7f5c07253a00, buf=0x7f5c0a5dc000, len=<optimized out>, byte_offset=0, page_size=..., page_id=..., sync=<optimized out>, type=...) at ../../../storage/innobase/fsp/fsp0fsp.cc:272
#12 buf_read_page_low (err=0x7f5536fd766c, sync=<optimized out>, type=0, mode=<optimized out>, page_id=..., page_size=..., unzip=false, anchor_page_p=0x7f5536fd7b88, page_load=NORMAL, is_scan=false) at ../../../storage/innobase/buf/buf0rea.cc:182
#13 0x0000000006668aaf in Buf_fetch<Buf_fetch_other>::read_page (this=0x7f5536fd7b20) at ../../../storage/innobase/buf/buf0buf.cc:4494
#14 0x00000000066824a9 in get (block=<optimized out>, this=<optimized out>) at ../../../storage/innobase/buf/buf0buf.cc:4084
#15 Buf_fetch<Buf_fetch_other>::single_page (this=0x7f5536fd7b20) at ../../../storage/innobase/buf/buf0buf.cc:4716
#16 0x00000000066114fb in buf_page_get_gen (dirty_with_no_latch=false, mtr=<optimized out>, line=<optimized out>, file=<optimized out>, mode=<optimized out>, guess=<optimized out>, rw_latch=1, page_size=..., page_id=...) at ../../../storage/innobase/include/ibuf0ibuf.ic:126
#17 buf_page_get_gen (dirty_with_no_latch=false, mtr=<optimized out>, line=<optimized out>, file=<optimized out>, mode=SCAN, guess=<optimized out>, rw_latch=1, page_size=..., page_id=...) at ../../../storage/innobase/buf/buf0buf.cc:4890
#18 btr_cur_search_to_nth_level (index=0x7f573aa09f98, level=<optimized out>, tuple=<optimized out>, mode=<optimized out>, latch_mode=<optimized out>, cursor=<optimized out>, has_search_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, btr_ndp=<optimized out>) at ../../../storage/innobase/btr/btr0cur.cc:1482
#19 0x0000000006799475 in btr_pcur_t::open_no_init_low (this=this@entry=0x7f571d957698, index=index@entry=0x7f573aa09f98, tuple=tuple@entry=0x7f573abe6458, mode=PAGE_CUR_LE, latch_mode=latch_mode@entry=1, has_search_latch=has_search_latch@entry=0, mtr=<optimized out>, ndp_work=<optimized out>, restore=<optimized out>, file=<optimized out>, line=<optimized out>) at ../../../storage/innobase/include/btr0pcur.h:831
#20 0x0000000006608d36 in btr_pcur_t::restore_position (this=0x7f571d957698, latch_mode=1, mtr=0x7f5536fd8d60, file=0x43907a8 "../../../storage/innobase/row/row0pread.cc", line=337) at ../../../storage/innobase/btr/btr0pcur.cc:300
#21 0x00000000069e0070 in restore_position (this=0x7f5536fd8d40) at ../../../storage/innobase/row/row0pread.cc:663
#22 Parallel_reader::Ctx::traverse (this=0x7f5721ceb1a0) at ../../../storage/innobase/row/row0pread.cc:664
#23 0x00000000069a9f2b in Parallel_reader::worker (this=0x7f570d490840, thread_ctx=<optimized out>) at ../../../storage/innobase/row/row0pread.cc:893
#24 0x00000000069ef88c in __invoke_impl<void, void (Parallel_reader::*&)(Parallel_reader::Thread_ctx*), Parallel_reader*&, Parallel_reader::Thread_ctx*&> (__t=<synthetic pointer>, __f=<synthetic pointer>) at ../../../storage/innobase/include/ut0ut.h:699
#25 __invoke<void (Parallel_reader::*&)(Parallel_reader::Thread_ctx*), Parallel_reader*&, Parallel_reader::Thread_ctx*&> (__fn=<synthetic pointer>) at /data/fuxi_ci_workspace/128467191/Taurus_x86/src/tools/gcc-10/include/c++/10/bits/invoke.h:95
#26 __call<void, 0, 1> (__args=<optimized out>, this=<synthetic pointer>) at /data/fuxi_ci_workspace/128467191/Taurus_x86/src/tools/gcc-10/include/c++/10/functional:416
#27 operator()<> (this=<synthetic pointer>) at /data/fuxi_ci_workspace/128467191/Taurus_x86/src/tools/gcc-10/include/c++/10/functional:499
#28 Runnable::operator()<void (Parallel_reader::*)(Parallel_reader::Thread_ctx*), Parallel_reader*, Parallel_reader::Thread_ctx*>(void (Parallel_reader::*&&)(Parallel_reader::Thread_ctx*), Parallel_reader*&&, Parallel_reader::Thread_ctx*&&) (this=0x7f56fe93b108, f=<optimized out>) at ../../../storage/innobase/include/os0thread-create.h:116
#29 0x0000000006907dd0 in execute_native_thread_routine () at ../../../storage/innobase/include/mach0data.ic:629
#30 0x00007f5c6c6efe15 in ?? () from /lib64/libpthread.so.0
#31 0x00007f5c69ea1fed in clone () from /lib64/libc.so.6

How to repeat:
use ddlcheck tool and restart mysqld random