Bug #115252 Optimize Tablename no longer updates information_schema.tables after it has run
Submitted: 7 Jun 2024 14:44 Modified: 10 Jun 2024 11:04
Reporter: IGG t Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:8.0.37 OS:Any
Assigned to: CPU Architecture:Any

[7 Jun 2024 14:44] IGG t
Description:
I have recently been running a process of optimizing various tables following a large purge of data, and there appears to be a difference between 5.7 and 8.0 that I can't see documented anywhere.

In MySQL 5.7 after the optimize completes, the DATA_FREE value in information_schema.tables is shrunk to reflect the fact that the space has been re-claimed and the DATA_LENGTH + INDEX_LENGTH (roughly) matches the size of the ibd file on the server.

However, I have also been upgrading to MySQL 8.0 recently, and have found that after I have run an optimize command on here the information_schema.tables.data_free value still shows the same as before the optimize. The actual ibd file has shrunk, so I know it has worked.

n.b. all tables are innodb

How to repeat:
Create a table
insert a large amount of data
delete a large amount of data

Check the data_free value
select table_name, table_schema, data_free FROM information_schema.TABLES where table_name = <tablename>;

optimize the table:
OPTIMIZE NO_WRITE_TO_BINLOG <tablename>;

re-check the data_free value:
select table_name, table_schema, data_free FROM information_schema.TABLES where table_name = <tablename>;

In MySQL 5.7 the data_free value is now around 7MB
In MySQL 8.0 the data_free value is still the same as it was to start with

Suggested fix:
make the information_schema.TABLES update the data_free (and other values) after an optimize command has been run.
[7 Jun 2024 15:00] MySQL Verification Team
Hi Mr. IGG,

Thank you for your bug report.

We have tried to come to the same bug, but with no success.

We need a fully repeatable test case for you.

Make sure that it also includes some DELETE's, so that the bug would be evident.

Can't repeat.
[10 Jun 2024 5:22] MySQL Verification Team
Looks related to this?

https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_information_sc...
[10 Jun 2024 11:04] MySQL Verification Team
Hi Mr. IGG,

Actually, my esteemed colleague is quite correct. This is from our Reference Manual:

______________________________________________________________________________

information_schema_stats_expiry

Command-Line Format	--information-schema-stats-expiry=#
System Variable	information_schema_stats_expiry
Scope	Global, Session
Dynamic	Yes
SET_VAR Hint Applies	No
Type	Integer
Default Value	86400
Minimum Value	0
Maximum Value	31536000
Unit	seconds
Some INFORMATION_SCHEMA tables contain columns that provide table statistics:

STATISTICS.CARDINALITY
TABLES.AUTO_INCREMENT
TABLES.AVG_ROW_LENGTH
TABLES.CHECKSUM
TABLES.CHECK_TIME
TABLES.CREATE_TIME
TABLES.DATA_FREE
TABLES.DATA_LENGTH
TABLES.INDEX_LENGTH
TABLES.MAX_DATA_LENGTH
TABLES.TABLE_ROWS
TABLES.UPDATE_TIME
Those columns represent dynamic table metadata; that is, information that changes as table contents change.

By default, MySQL retrieves cached values for those columns from the mysql.index_stats and mysql.table_stats dictionary tables when the columns are queried, which is more efficient than retrieving statistics directly from the storage engine. If cached statistics are not available or have expired, MySQL retrieves the latest statistics from the storage engine and caches them in the mysql.index_stats and mysql.table_stats dictionary tables. Subsequent queries retrieve the cached statistics until the cached statistics expire. A server restart or the first opening of the mysql.index_stats and mysql.table_stats tables do not update cached statistics automatically.

The information_schema_stats_expiry session variable defines the period of time before cached statistics expire. The default is 86400 seconds (24 hours), but the time period can be extended to as much as one year.

To update cached values at any time for a given table, use ANALYZE TABLE.

To always retrieve the latest statistics directly from the storage engine and bypass cached values, set information_schema_stats_expiry to 0.

Querying statistics columns does not store or update statistics in the mysql.index_stats and mysql.table_stats dictionary tables under these circumstances:

When cached statistics have not expired.

When information_schema_stats_expiry is set to 0.

When the server is in read_only, super_read_only, transaction_read_only, or innodb_read_only mode.

When the query also fetches Performance Schema data.

The statistics cache may be updated during a multiple-statement transaction before it is known whether the transaction commits. As a result, the cache may contain information that does not correspond to a known committed state. This can occur with autocommit=0 or after START TRANSACTION.

information_schema_stats_expiry is a session variable, and each client session can define its own expiration value. Statistics that are retrieved from the storage engine and cached by one session are available to other sessions.
______________________________________________________________________________
 
Not a bug, since it is properly documented behaviour.