Bug #117691 MySQL 8.4 and 9 are not taking advantage of the Linux Page Cache (contrarily to 8.0).
Submitted: 12 Mar 19:12 Modified: 13 Mar 5:07
Reporter: Jean-François Gagné Email Updates:
Status: Analyzing Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:8.4.0, 8.4.4, 9.2.0 OS:Linux
Assigned to: MySQL Verification Team CPU Architecture:Any
Tags: regression

[12 Mar 19:12] Jean-François Gagné
Description:
Hi,

when doing IO tests with MySQL 9.2.0, I am not able to simulate fast IOs that I was able to simulate with previous versions of MySQL.  The way I simulate such fast IO is with a small InnoDB Buffer Pool and an ibd file in the Linux Page Cache.

I trace this down to MySQL 8.4.0.  In MySQL 8.0.41, I am able to simulate fast IOs, but with 9.2.0, 8.4.4 and 8.4.0, I am not.  These tests are done on the same Debian vm in AWS with gp3 disks.  See How to repeat for details.

Setting this as S2 / Serious, because a sub-optimal configuration of MySQL 8.0 (small buffer pool) on a server with a lot of RAM will lead to reduced performance after upgrade.  For the same reason, I tagged as "Regression".  There might be an argument for setting this as a S5 / Performance problem, but the regression makes me think S2 is more appropriate.

This might be caused by new InnoDB defaults in 8.4, but I have not checked in more detail.  Because of the unknown source, setting this as Category MySQL Server.

Many thanks for looking into this,

Jean-François Gagné

How to repeat:
######################
# Environment details.

# Hardware details.
$ cat /proc/cpuinfo  | grep proc | wc -l
8

$ cat /proc/meminfo  | grep MemTotal
MemTotal:       32346516 kB

# OS details.
$ cat /etc/debian_version 
12.7

$ uname -a
Linux ip-172-31-0-54 6.1.0-26-cloud-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.112-1 (2024-09-30) x86_64 GNU/Linux

##########################
# Tests with MySQL 8.0.41.

dbdeployer deploy single mysql_8.0.41

# Filling one table with data.
# Setting things for having a 3 GB table with 4 rows per 16 KB InnoDB Page.
# The pv are a trick to time command execution.
nb_rows=$((3*1024*1024*1024 / (16*1024) * 4))
{
  ./use <<< "
     CREATE DATABASE test_jfg;
     CREATE TABLE test_jfg.t (id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY)"

  seq 1 $nb_rows |
    awk '{print "(null)"}' |
    tr " " "," | paste -s -d "$(printf ',%.0s' {1..100})\n" |
    sed -e 's/.*/INSERT INTO t values &;/' |
    ./use test_jfg | pv -t

  { echo "ALTER TABLE t ADD COLUMN c0 CHAR(200) DEFAULT ''"
           seq -f " ADD COLUMN c%.0f CHAR(240) DEFAULT ''" 1 15
  } | paste -s -d "," | ./use test_jfg

  ./use test_jfg <<< "ALTER TABLE t FORCE"      | pv -t
  ./use test_jfg <<< "FLUSH TABLE t FOR EXPORT" | pv -t

  ls -lh data/test_jfg/t.ibd
}

# Below is the output of above.
0:00:39
0:03:06
0:00:00
-rw-r----- 1 jgagne jgagne 4.7G Mar 12 15:07 data/test_jfg/t.ibd

# Generating load.
while sleep 0.1; do ./use -N test_jfg <<< "SET @i = ROUND(RAND() * $nb_rows); SELECT * from t where id = @i;"; done

# Monitoring IOs (columns are: date, nb_ios, total_wait_usec, and avg_wait_usec).
sql4="select COUNT_STAR, SUM_TIMER_WAIT/1000/1000"; \
sql4="$sql4 from file_summary_by_event_name where EVENT_NAME = 'wait/io/file/innodb/innodb_data_file'"; \
while sleep 1; do date; ./use -N performance_schema <<< "$sql4"; done |
  stdbuf -oL paste -s -d " \n" | awk -W interactive '{
    aa=$7-a; bb=$8-b;
    a =$7;   b =$8;
    NF=6; print $0, aa, bb, bb/aa}' | tail -n +2

# Making sure nothing is in the page cache.
sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"

# Without a cache, we have avg_wait_usec ~600 usec.
Wed Mar 12 19:00:50 UTC 2025 16 9631.26 601.954
Wed Mar 12 19:00:51 UTC 2025 15 9172.63 611.508
Wed Mar 12 19:00:52 UTC 2025 14 8980.29 641.449

# Loading the table in cache.
pv -etbr data/test_jfg/t.ibd > /dev/null
4.63GiB 0:00:36 [ 129MiB/s]            

# When the table is in cache, we have fast IOs (~7 usec).
Wed Mar 12 19:02:05 UTC 2025 15 106.806 7.1204
Wed Mar 12 19:02:06 UTC 2025 17 107.185 6.30497
Wed Mar 12 19:02:07 UTC 2025 15 109.916 7.32773

# When repeating above with 8.4.4 (or 8.4.0 or 9.2.0),
#   I have similar uncached IOs.
Wed Mar 12 19:04:19 UTC 2025 10 6431.71 643.171
Wed Mar 12 19:04:20 UTC 2025 17 10822.1 636.595
Wed Mar 12 19:04:21 UTC 2025 12 6994.24 582.853

# But I do not have fast cached IOs.
Wed Mar 12 19:05:35 UTC 2025 13 7872.91 605.608
Wed Mar 12 19:05:36 UTC 2025 15 8442.59 562.839
Wed Mar 12 19:05:37 UTC 2025 13 7935.47 610.42

# And I know the table is in cache because below.
date; pv -etbr data/test_jfg/t.ibd > /dev/null
Wed Mar 12 19:05:35 UTC 2025
4.63GiB 0:00:00 [8.89GiB/s]

# EOF.