Description:
Hi,
when I saw Bug#107059 in action, I also saw the magic number 2147483647 (2^31-1) in I_S.INNODB_CMP. This led me to look at limit testing there, and it is relatively easy to reach the max value for uncompress_ops; see How to repeat for details.
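For reference, the saturation value is exactly the signed 32-bit maximum, which can be sanity-checked with a quick shell one-liner:

```shell
# 2147483647 is INT32_MAX, the largest value a signed 32-bit int
# (and hence an int column) can hold.
printf '%d\n' $(( (1 << 31) - 1 ))
# → 2147483647
```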
Looking at the table definition, the behavior is not surprising as uncompress_ops is declared as an int.
mysql [localhost:8028] {msandbox} ((none)) > show create table information_schema.INNODB_CMP\G
*************************** 1. row ***************************
Table: INNODB_CMP
Create Table: CREATE TEMPORARY TABLE `INNODB_CMP` (
`page_size` int NOT NULL DEFAULT '0',
`compress_ops` int NOT NULL DEFAULT '0',
`compress_ops_ok` int NOT NULL DEFAULT '0',
`compress_time` int NOT NULL DEFAULT '0',
`uncompress_ops` int NOT NULL DEFAULT '0',
`uncompress_time` int NOT NULL DEFAULT '0'
) ENGINE=MEMORY DEFAULT CHARSET=utf8mb3
1 row in set (0.00 sec)
Even though this behavior is expected for an int column, I think this qualifies as a bug because monitoring tools looking at this counter will conclude nothing is happening when the counter stops increasing because the max value has been reached. Also, as shown in How to repeat, it is relatively easy to reach the max value on a busy system. A workaround is to reset the counter by querying INNODB_CMP_RESET, but making this column a bigint would be the right way to fix this.
I guess 5.7 is also affected because it has the same table structure (see below), but I have not actually tested it (How to repeat below is for 8.0.28).
mysql [localhost:5737] {msandbox} ((none)) > show create table information_schema.INNODB_CMP\G
*************************** 1. row ***************************
Table: INNODB_CMP
Create Table: CREATE TEMPORARY TABLE `INNODB_CMP` (
`page_size` int(5) NOT NULL DEFAULT '0',
`compress_ops` int(11) NOT NULL DEFAULT '0',
`compress_ops_ok` int(11) NOT NULL DEFAULT '0',
`compress_time` int(11) NOT NULL DEFAULT '0',
`uncompress_ops` int(11) NOT NULL DEFAULT '0',
`uncompress_time` int(11) NOT NULL DEFAULT '0'
) ENGINE=MEMORY DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
Many thanks for looking into this, Jean-François Gagné
How to repeat:
# Create a sandbox.
dbdeployer deploy single mysql_8.0.28 -c skip_log_bin -c 'innodb_flush_log_at_trx_commit = 2'
# Create a compressed table.
./use <<< "
CREATE DATABASE test_jfg;
CREATE TABLE test_jfg.t (
id INTEGER AUTO_INCREMENT PRIMARY KEY,
v INTEGER NOT NULL) KEY_BLOCK_SIZE=8"
# Put data in the table.
nrow=$((35 * 1000))
seq -f "(%.0f)" 1 $nrow |
paste -s -d "$(printf ',%.0s' {1..100})\n" |
sed -e 's/.*/INSERT into t(v) values &;/' |
./use test_jfg
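The pipeline above batches rows into multi-value INSERTs: `printf ',%.0s' {1..100}` emits 100 commas, so paste cycles through 100 commas then a newline, grouping 101 values per statement. A reduced sketch of the same pipeline, with only 2 commas (3 values per INSERT), shows what is fed to `./use`:

```shell
# Reduced version of the INSERT-generating pipeline: 2 commas in the
# delimiter list make paste group 3 values per line (the original uses
# 100 commas, i.e. 101 rows per INSERT).
seq -f "(%.0f)" 1 6 |
paste -s -d ',,\n' |
sed -e 's/.*/INSERT into t(v) values &;/'
# → INSERT into t(v) values (1),(2),(3);
# → INSERT into t(v) values (4),(5),(6);
```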
# Make the table big enough to generate Buffer Pool cache misses.
{ echo "ALTER TABLE t ADD COLUMN c00 CHAR(255) DEFAULT ''";
seq -f " ADD COLUMN c%02.0f CHAR(255) DEFAULT ''" 1 15; } |
paste -s -d "," | ./use test_jfg
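The ALTER pipeline builds a single statement adding 16 CHAR(255) columns. A reduced sketch, adding only 2 extra columns, shows the generated statement:

```shell
# Reduced version of the ALTER-generating pipeline: paste -s -d ","
# joins the column clauses into one ALTER TABLE statement
# (the original adds c00 through c15).
{ echo "ALTER TABLE t ADD COLUMN c00 CHAR(255) DEFAULT ''";
  seq -f " ADD COLUMN c%02.0f CHAR(255) DEFAULT ''" 1 2; } |
paste -s -d ","
# → ALTER TABLE t ADD COLUMN c00 CHAR(255) DEFAULT '', ADD COLUMN c01 CHAR(255) DEFAULT '', ADD COLUMN c02 CHAR(255) DEFAULT ''
```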
# Table size for reference, default of 128 MB for the Buffer Pool.
ls -lh data/test_jfg/t.ibd
#-rw-r----- 1 jgagne jgagne 108M Apr 20 10:25 data/test_jfg/t.ibd
# Generate decompressions. This keeps 15 to 20 CPUs busy, so it needs to be run on a large VM to reach the limit in a reasonable time.
for i in {1..16}; do
seq 1 $(( 1000 * 1000 * 1000 )) |
awk -v s=$i -v nrow=$nrow 'BEGIN{srand(s)} {print int(rand()*nrow)}' |
paste -s -d "$(printf ',%.0s' {1..100})\n" |
sed -e 's/.*/select count(*) from t where id in (&);/' |
./use test_jfg > /dev/null &
done
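The loop above generates batched point lookups on random ids, which miss the Buffer Pool and force page decompressions. A reduced sketch (3 ids per IN() list instead of 100; the exact ids depend on the awk implementation's rand()):

```shell
# Reduced version of the load-generating pipeline: random ids batched
# into IN() lists, 3 per statement here instead of 100.
seq 1 6 |
awk -v s=1 -v nrow=35000 'BEGIN{srand(s)} {print int(rand()*nrow)}' |
paste -s -d ',,\n' |
sed -e 's/.*/select count(*) from t where id in (&);/'
# Each output line looks like: select count(*) from t where id in (N,N,N);
```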
# Look at how many uncompress_ops we have per second, and how long, in hours, it will take to reach the limit.
# In my case, I have 112k ops/s and the limit should be reached in 5.3 hours.
n=3; ./use -N information_schema <<< "select * from INNODB_CMP; do sleep($n); select * from INNODB_CMP" |
awk -v ns=$n '$1 == 8192{s=($5-n) / ns; print s, 2*1024*1024*1024/s/60/60;n=$5}' | tail -n 1
112083 5.32214
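The estimate can be sanity-checked directly: at the rate observed above, the counter hits INT32_MAX in 2147483647 / 112083 / 3600 ≈ 5.3 hours (the 112083 ops/s figure comes from the measurement above).

```shell
# Hours until a signed-int counter saturates at the observed rate.
awk 'BEGIN{printf "%.2f hours\n", 2147483647 / 112083 / 3600}'
# → 5.32 hours
```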
# Look at the stats: once it reaches 2147483647, uncompress_ops is constant even though uncompress_time keeps increasing.
while date; do ./use information_schema <<< "select * from INNODB_CMP"; sleep 10; echo; done |
tee -a is_INNODB_CMP.txt | grep -e UTC -e ^8192 | paste -s -d " \n"
Wed Apr 20 14:01:00 UTC 2022 8192 18910 18759 1 178 0
Wed Apr 20 14:01:10 UTC 2022 8192 24043 23890 1 195 0
Wed Apr 20 14:01:20 UTC 2022 8192 24043 23890 1 438601 29
Wed Apr 20 14:01:30 UTC 2022 8192 24043 23890 1 1535878 101
Wed Apr 20 14:01:40 UTC 2022 8192 24043 23890 1 2656587 174
[...]
Wed Apr 20 16:05:32 UTC 2022 8192 24043 23890 1 843000853 54566
[...]
Wed Apr 20 18:02:53 UTC 2022 8192 24043 23890 1 1640027485 106121
[...]
Wed Apr 20 19:16:50 UTC 2022 8192 24043 23890 1 2144526712 138707
Wed Apr 20 19:17:00 UTC 2022 8192 24043 23890 1 2145682755 138784
Wed Apr 20 19:17:10 UTC 2022 8192 24043 23890 1 2146862667 138861
Wed Apr 20 19:17:20 UTC 2022 8192 24043 23890 1 2147483647 138938
Wed Apr 20 19:17:30 UTC 2022 8192 24043 23890 1 2147483647 139016
Wed Apr 20 19:17:40 UTC 2022 8192 24043 23890 1 2147483647 139094
[...]
Suggested fix:
Make compress_ops, compress_ops_ok and uncompress_ops in I_S.INNODB_CMP bigint.
Apply a similar fix to the other %CMP% tables in I_S.
Note that compress_time and uncompress_time could be kept as int, because these values are in seconds and an overflow there is unlikely, but I would suggest also moving them to bigint to avoid a bug there once running MySQL on servers with 256 CPUs becomes common. :-)
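For illustration, a fixed definition could look like the sketch below (the column set mirrors the current 8.0 table; making the counters bigint is my suggestion, not an actual upstream patch):

```sql
CREATE TEMPORARY TABLE `INNODB_CMP` (
  `page_size` int NOT NULL DEFAULT '0',
  `compress_ops` bigint NOT NULL DEFAULT '0',
  `compress_ops_ok` bigint NOT NULL DEFAULT '0',
  `compress_time` bigint NOT NULL DEFAULT '0',
  `uncompress_ops` bigint NOT NULL DEFAULT '0',
  `uncompress_time` bigint NOT NULL DEFAULT '0'
) ENGINE=MEMORY DEFAULT CHARSET=utf8mb3
```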