Bug #108763 InnoDB Inefficient Memory Allocation for Data Dictionary
Submitted: 13 Oct 2022 12:11 Modified: 17 Oct 2022 13:29
Reporter: Vasanth S Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Data Dictionary Severity:S2 (Serious)
Version:5.7.35, 8.0.30 OS:Any
Assigned to: CPU Architecture:Any
Tags: datadictionary, Memory, memoryallocation, tabledefinitioncache

[13 Oct 2022 12:11] Vasanth S
Description:
Memory Heap Allocation in InnoDB:

MySQL version: 5.7.35

When opening a table, the table metadata along with the primary key index and the other indexes metadata are allocated memory in the data dictionary.

For indexes, the flow of memory allocation is as follows:

i) sizeof(*index)
ii) strlen(index->name)+1
iii) 1+ n_fields * sizeof(dict_field_t)  	//ends here for non-clustered indexes.
iv) dict_index_get_n_unique(new_index)* sizeof(*new_index-> stat_n_diff_key_vals) Similarly, for stat_n_sample_sizes, stat_n_non_null_key_vals.
v) sizeof(btr_search_t)

However, there is a lot of space left unused in the block when allocating memory in this order. For example, let's consider a table 'testdb.TestDataTable1' with 144 columns.

Current Approach:
	Memory is allocated in the above order.  For PK index memory allocation,
	
The first block has block->len=240 and block->free=136.

i) sizeof(*index) = 600

New block of size 2*len= 480 is created. Since 480<600, new_size= 600. len = MEM_BLOCK_HEADER_SIZE + MEM_SPACE_NEEDED(600) => 136+600 = 736. Therefore, block->len=736 and block->free=736(136+600). The new block is added to the end of the heap and block->len is added to heap->total_size i.e 976(240+736).

ii) strlen(index->name)+1 = 8

New block of size 2*len = 1472 is created. Since 1472>8, new_size=1472. len = MEM_BLOCK_HEADER_SIZE + MEM_SPACE_NEEDED(1472) => 136+1472 = 1608. Therefore, block->len= 1608 and block->free=144(136+8). The new block is added to the end of the heap and block->len is added to heap->total_size i.e 2584(976+1608).

iii) 1 + n_fields * sizeof(dict_field_t) = 1+ (148*24) = 3553

The current block has 1464 bytes remaining space. But, 3553 bytes needs to be allocated. Therefore, a new block of size 2*len= 3216 is created. Since 3216<3553, new_size=3553. len = MEM_BLOCK_HEADER_SIZE + MEM_SPACE_NEEDED(3353) => 136+3560 = 3696. Therefore, block->len= 3696 and block->free= 3696(136+3560). The new block is added to the end of the heap and block->len is added to the heap->total_size i.e 6280(2584+3696). 

iv) dict_index_get_n_unique(new_index)* sizeof(*new_index-> stat_n_diff_key_vals)

stat_n_diff_key_vals = 8

Since the current block is full, a new block of size 2*len= 7392 is created. Since 7392>8, new_size=7392. len = MEM_BLOCK_HEADER_SIZE + MEM_SPACE_NEEDED(7392) => 136 + 7392 = 7528. Therefore, block->len = 7528 and block->free = 144(136+8). The new block is added to the end of the heap and block->len is added to heap->total_size i.e 13808 (6280+7528)

stat_n_sample_sizes = 8

Memory is allocated in the current block. block->len=13808 and block->free=152(144+8)

stat_n_non_null_key_vals = 8

Memory is allocated in the current block. block->len=13808 and block->free=160(152+8)

v) sizeof(btr_search_t) = 72

Memory is allocated in the current block. block->len=13808 and block->free= 232(160+72)

Therefore, PRIMARY KEY INDEX occupies 13808 bytes of memory in the Data dictionary.

The above has been checked with version 5.7.39 as well. There doesn't appear to be any change in source code.

Observation:

Even though there are some predefined values of less size, they are not inserted in the beginning when there's enough remaining space in the block, because of which new blocks of size 2*len or greater is created thereby resulting in bigger heap size. For 104 bytes to be allocated(iv and v), an extra block of size 7528 bytes has been created. For 1 lakh such tables, Primary Key index metadata alone occupies 1.3808 GB of memory in the data dictionary. This can be reduced to 467.2MB by following the proposed approach.

Proposed Approach:
	Add all the predefined values like strlen(index->name)+1 which has a fixed value of 8 for all PRIMARY key indexes and sizeof(btr_search_t)=72 in the first block itself. If dict_index_get_n_unique(new_index) =1, we can also add the stat_n_diff_key_vals, stat_n_sample_sizes and stat_n_non_null_key_vals in the first block itself to save extra block from being created. Each of these is allocated 8 bytes.

For Primary Key Index memory allocation,

The first block has block->len=240 and block->free=136.

i) strlen(index_name)+1 = 8

Memory is allocated in the current block itself. block->len=240 and block->free=144(136+8)

ii) sizeof(btr_search_t) = 72

Memory is allocated in the current block. block->len=240 and block->free=216(144+72)

iii) dict_index_get_n_unique(new_index)* sizeof(*new_index-> stat_n_diff_key_vals) 

stat_n_diff_key_vals = 8

Memory is allocated in the current block. block->len=240 and block->free=224(216+8)

stat_n_sample_sizes = 8

Memory is allocated in the current block. block->len=240 and block->free=232(224+8)

stat_n_non_null_key_vals = 8

Memory is allocated in the current block. block->len=240 and block->free=240(232+8)
	
iv) sizeof(*index) = 600

Since the current block is full, a new block of size 2*len=480 is created. Since 480<600, new_size=600. len= MEM_BLOCK_HEADER_SIZE + MEM_SPACE_NEEDED(600) => 136+600= 736. Therefore, block->len=736 and block->free=736(136+600). The block is inserted at the end of the heap and the block->len is added to the heap->total_size variable i.e 976(240+736).

v) 1 + n_fields * sizeof(dict_field_t) = 1+ (148*24) = 3553

New block of size 2*len=1472 is created. Since 1472<3553, new_size=3553. len = MEM_BLOCK_HEADER_SIZE + MEM_SPACE_NEEDED(3553) => 136 + 3560 = 3696. Therefore, block->len = 3696 and block->free=3696(136+3560). The block is inserted at the end of the heap and the block->len is added to the heap->total_size i.e 4672. 

How to repeat:
The above can be repeated by reading any table. Table details: 
Table Name: testdb.TestDataTable1
Number of columns: 144
PK Column: ID bigint 

Suggested fix:
The above applies for non clustered indexes as well. Therefore, adding the predefined fixed values initially and then allocating space for the dynamic values would be better. This will have a huge impact in terms of system memory consumption.
[13 Oct 2022 13:33] MySQL Verification Team
Hi Mr. Vasanth,

Thank you for your bug report.

However , this is not a bug.

Cacheing data in the version 5.7 and in the version 8.0 will not be changed. Furthermore, there are no further improvements planned for 5.7, but only 8.0.

Unlike 5.7, 8.0 has a proper data dictionary, with a totally different algorithms for the memory allocation. Memory allocation for data dictionary in 8.0 is optimised for speed and not for size, since gigabytes and terabytes of RAM are now easily available.

If you have any feature requests or performance enhancements for 8.0, then, please, file a separate bug report.

Not a bug.
[17 Oct 2022 6:58] Vasanth S
Greetings team, Thank you for your immediate response. However,

"Unlike 5.7, 8.0 has a proper data dictionary, with a totally different algorithms for the memory allocation"

On checking the MySQL 8.0.30, the order of memory allocation in dict_mem_index_create(), dict_mem_fill_index_struct() methods remains the same. 

"Memory allocation for data dictionary in 8.0 is optimised for speed and not for size, since gigabytes and terabytes of RAM are now easily available"

We totally understand. However, On changing the order in which the memory is allocated, the size of the data dictionary can be reduced hugely. Our proposal is to allocate the known smaller memories initially itself so we can make use of the available RAM efficiently. 

Thank you for being kind enough to let us share our suggestions here. Please note that bringing up this small change will be beneficial for all MySQL users.
[17 Oct 2022 12:33] MySQL Verification Team
Hi Mr. S,

Please, read our responses carefully.

Memory allocation in 8.0 is optimised for speed and not size. That is what we wrote.

Hence, if we allocate smaller amount of memory at the start, then all our users would experience a significant slowdown during the running of the server. That is because additional memory would need to be allocated during the runtime, which would slow down all operations.

Not a bug.
[17 Oct 2022 13:29] Vasanth S
Hi Verification Team,

Please read our request carefully. 

Our request was not to lessen the size of the memory block created but to change the order of metadata allocation in those blocks, particularly for Dictionary memory (@0mem.c). 

"That is because additional memory would need to be allocated during the runtime, which would slow down all operations."

Even in your current flow, Memory is being allocated during run time only. A fresh heap is created only when a table is read. Only after all the metadata is loaded, the heap->total_size is added to the dict_sys->size variable. I don't see any issues that will arise because of changing the order in which this memory is allocated. 

Awaiting your response.
[17 Oct 2022 13:57] MySQL Verification Team
Hi,

Reducing memory allocation by small amount is not an objective for us. We do not see how that would improve performance.

You are discussing very small memory allocations, which are insignificant for the computers on which our server is running.

However, if you can show how can this improve performance significantly, then this could be analysed as a performance enhancement request.