Bug #104194 InnoDB on-disk internal temporary table in 8.0 uses much more space than 5.7
Submitted: 2 Jul 2021 11:28 Modified: 6 Jul 2021 21:04
Reporter: Lei Zeng Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Storage Engines Severity:S3 (Non-critical)
Version:8.0 OS:Linux
Assigned to: CPU Architecture:x86

[2 Jul 2021 11:28] Lei Zeng
Description:
When using the MEMORY storage engine for in-memory temporary tables and data overflow from memory to InnoDB on-disk internal temporary table, MySQL 8.0 uses three times as much space as 5.7. 

This change on 8.0 is unexpected, and has impact on both space allocation and query performance.

8.0.23

mysql> select @@internal_tmp_mem_storage_engine, @@max_heap_table_size, @@tmp_table_size;
+-----------------------------------+-----------------------+------------------+
| @@internal_tmp_mem_storage_engine | @@max_heap_table_size | @@tmp_table_size |
+-----------------------------------+-----------------------+------------------+
| MEMORY                            |              16777216 |         16777216 |
+-----------------------------------+-----------------------+------------------+

mysql> select pad, count(*) from sbtest1 group by pad limit 1;
+-------------------------------------------------------------+----------+
| pad                                                         | count(*) |
+-------------------------------------------------------------+----------+
| 00000019118-77834093222-84130534978-53966205572-00457639552 |        1 |
+-------------------------------------------------------------+----------+
1 row in set (2 min 31.17 sec)

mysql> select * from INFORMATION_SCHEMA.INNODB_SESSION_TEMP_TABLESPACES ;
+----+------------+----------------------------+------------+----------+-----------+
| ID | SPACE      | PATH                       | SIZE       | STATE    | PURPOSE   |
+----+------------+----------------------------+------------+----------+-----------+
| 25 | 4243767289 | ./#innodb_temp/temp_9.ibt  | 4714397696 | ACTIVE   | INTRINSIC |

5.7.33

mysql> select @@internal_tmp_disk_storage_engine, @@max_heap_table_size, @@tmp_table_size;
+------------------------------------+-----------------------+------------------+
| @@internal_tmp_disk_storage_engine | @@max_heap_table_size | @@tmp_table_size |
+------------------------------------+-----------------------+------------------+
| InnoDB                             |              16777216 |         16777216 |
+------------------------------------+-----------------------+------------------+

mysql> select pad, count(*) from sbtest1 group by pad limit 1;
+-------------------------------------------------------------+----------+
| pad                                                         | count(*) |
+-------------------------------------------------------------+----------+
| 00000019118-77834093222-84130534978-53966205572-00457639552 |        1 |
+-------------------------------------------------------------+----------+
1 row in set (1 min 7.30 sec)

mysql> SELECT FILE_NAME, TABLESPACE_NAME, ENGINE, INITIAL_SIZE AS InitialSizeBytes, TOTAL_EXTENTS*EXTENT_SIZE AS TotalSizeBytes, DATA_FREE AS FreeSizeBytes, MAXIMUM_SIZE AS MaxSizeBytes FROM information_schema.FILES WHERE TABLESPACE_NAME = 'innodb_temporary';
+-----------------------------+------------------+--------+------------------+----------------+---------------+--------------+
| FILE_NAME                   | TABLESPACE_NAME  | ENGINE | InitialSizeBytes | TotalSizeBytes | FreeSizeBytes | MaxSizeBytes |
+-----------------------------+------------------+--------+------------------+----------------+---------------+--------------+
| /rdsdbdata/db/innodb/ibtmp1 | innodb_temporary | InnoDB |         12582912 |     1488977920 |    1464860672 |         NULL |
+-----------------------------+------------------+--------+------------------+----------------+---------------+--------------+

 

How to repeat:
1. create a sysbench table of size 2GB

sysbench /usr/local/share/sysbench/oltp_write_only.lua --db-driver=mysql --mysql-user=${username} --mysql-password=${password}  --mysql-db=db1 --mysql-host=${host}  --mysql-port=${port} --tables=1 --table-size=10000000 --forced-shutdown --rand-type=uniform --db-ps-mode=disable --threads=1 prepare

ls -lart sbtest1.*
-rw-r----- 1 mysql mysql 2436890624 Jun  8 06:32 sbtest1.ibd

2. run a GROUP BY query

select pad, count(*) from sbtest1 group by pad limit 1;
[5 Jul 2021 11:33] MySQL Verification Team
Hi Mr. Zeng,

Thank you for your bug report.

However, this is not a bug.

First of all, that storage engine that you use has only fixed-length fields, no variable length one at all. This is one of the reasons why we are using new in-memory temporary tables.

Next, many changes have occurred between 5.7 and 8.0, including the default character set and many bugs associated with that engine have been fixed, resulting in much higher usage of the memory.

Not a bug.
[6 Jul 2021 21:04] Lei Zeng
Thank you very much for your confirmation. I am able to confirm that the table's char set made the difference in my test case. When the sysbench table is created, it takes the default char set that is utf8mb4 in MySQL 8, latin1 in MySQL 5.7. The char set doesn't make difference in the .ibd file and both tables are the same 2GB size. However, the GROUP BY query generates different disk storage for internal temp tables.

The following is a quick test in 8.0.23

mysql> use db2;
Database changed
mysql> CREATE TABLE `sbtest1` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   `k` int(11) NOT NULL DEFAULT '0',
    ->   `c` char(120) NOT NULL DEFAULT '',
    ->   `pad` char(60) NOT NULL DEFAULT '',
    ->   PRIMARY KEY (`id`),
    ->   KEY `k_1` (`k`)
    -> ) ENGINE=InnoDB  DEFAULT CHARSET=latin1;
Query OK, 0 rows affected, 2 warnings (0.03 sec)

mysql> insert into db2.sbtest1 select * from db1.sbtest1;
Query OK, 10000000 rows affected (2 min 8.91 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> SELECT TABLE_SCHEMA, TABLE_NAME,
    -> DATA_LENGTH/1024/1024/1024 AS "Data size in GB", 
    -> DATA_FREE/1024/1024/1024 AS "DATA FREE in GB", 
    -> INDEX_LENGTH/1024/1024/1024 AS "IndexSize in GB" 
    -> FROM INFORMATION_SCHEMA.TABLES 
    -> where TABLE_SCHEMA like 'db%';
+--------------+------------+-----------------+-----------------+-----------------+
| TABLE_SCHEMA | TABLE_NAME | Data size in GB | DATA FREE in GB | IndexSize in GB |
+--------------+------------+-----------------+-----------------+-----------------+
| db1          | sbtest1    |  2.093750000000 |  0.004882812500 |  0.145172119141 |
| db2          | sbtest1    |  2.093750000000 |  0.006835937500 |  0.228286743164 |
+--------------+------------+-----------------+-----------------+-----------------+

mysql> select pad, count(*) from sbtest1 group by pad limit 1;
+-------------------------------------------------------------+----------+
| pad                                                         | count(*) |
+-------------------------------------------------------------+----------+
| 00000019118-77834093222-84130534978-53966205572-00457639552 |        1 |
+-------------------------------------------------------------+----------+
1 row in set (1 min 9.72 sec)

mysql> select * from INFORMATION_SCHEMA.INNODB_SESSION_TEMP_TABLESPACES ;
+------+------------+----------------------------+------------+----------+-----------+
| ID   | SPACE      | PATH                       | SIZE       | STATE    | PURPOSE   |
+------+------------+----------------------------+------------+----------+-----------+
| 1575 | 4243767287 | ./#innodb_temp/temp_7.ibt  | 1417674752 | ACTIVE   | INTRINSIC |

Thank you
Lei
[7 Jul 2021 12:49] MySQL Verification Team
You are truly welcome !!!!!