Bug #7791 Memory storage requirement error
Submitted: 11 Jan 2005 4:49 Modified: 14 Jan 2005 18:53
Reporter: Shelby Moore Email Updates:
Status: Closed Impact on me: None
Category: MySQL Server: Documentation Severity: S3 (Non-critical)
Version: 4.0.17 OS: FreeBSD (FreeBSD)
Assigned to: Sergei Golubchik CPU Architecture:Any
Triage: D4 (Minor)

[11 Jan 2005 4:49] Shelby Moore
Description:
Based on my tests with real HEAP tables, I believe the manual has an error in calculating the memory storage requirements.

It should be:

SUM_OVER_ALL_KEYS(max_length_of_key) + sizeof(char*) * 2 + ALIGN(length_of_row+1, sizeof(char*))

The manual currently has:

SUM_OVER_ALL_KEYS(max_length_of_key + sizeof(char*) * 2) + ALIGN(length_of_row+1, sizeof(char*))

Note the movement of "sizeof(char*) * 2" outside the "SUM_OVER_ALL_KEYS()" loop.
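The difference between the two formulas can be checked with a quick sketch (arithmetic only; the 32-bit pointer size, key lengths, and row length below are assumptions for illustration, not values from this report):

```python
PTR = 4  # assumed sizeof(char*) on a 32-bit build


def align(n, k):
    # Round n up to the next multiple of k.
    return (n + k - 1) // k * k


key_lengths = [127, 20]  # hypothetical max_length_of_key per index
row_length = 150         # hypothetical length_of_row

# Formula as currently printed in the manual: pointers counted per key.
manual = sum(k + PTR * 2 for k in key_lengths) + align(row_length + 1, PTR)

# Proposed formula: pointers counted once per row.
proposed = sum(key_lengths) + PTR * 2 + align(row_length + 1, PTR)

print(manual, proposed, manual - proposed)  # -> 315 307 8
```

The two estimates differ by (number_of_keys - 1) * sizeof(char*) * 2 per row, so they agree exactly when a table has a single index.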

How to repeat:
As proof, here is what "show table status" reports for a real HEAP table:

+----------------+--------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+----------------+---------+
| Name           | Type   | Row_format | Rows    | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time          | Create_options | Comment |
+----------------+--------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+----------------+---------+
| chunk_heap     | HEAP   | Dynamic    |   60550 |            147 |     8978808 |         8653449 |       523088 |         0 |           NULL | NULL                | NULL                | NULL                |                |         |
+----------------+--------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+----------------+---------+

Suggested fix:
Correct the documentation.
[11 Jan 2005 4:49] Shelby Moore
The affected documentation page is:

http://dev.mysql.com/doc/mysql/en/MEMORY_storage_engine.html
[11 Jan 2005 5:39] Shelby Moore
It appears I am wrong and the documentation is correct: "show table status" is apparently not reporting the actual storage requirements.

I realized this by noting that the example table I gave was hitting the "table full" error while max_heap_table_size is 16777216. That works out to about 277 bytes per record, which agrees with the documentation.
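The arithmetic behind that check, using the Rows figure from the SHOW TABLE STATUS output above:

```python
max_heap_table_size = 16777216
rows = 60550  # Rows reported by SHOW TABLE STATUS for chunk_heap

# If the table filled up at this row count, this is the effective
# per-record cost the server is charging against the limit.
bytes_per_record = max_heap_table_size // rows
print(bytes_per_record)  # -> 277
```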

I have no idea why HEAP tables need to be so inefficient as to add 2 pointers (8 bytes) per field per record instead of per record!
[11 Jan 2005 16:14] Sinisa Milivojevic
Thank you for writing to us.

HEAP tables need a pointer per key, and not per field.

Index entries have to point to records and to the next index entry.
[11 Jan 2005 19:20] Shelby Moore
Agreed, I misread "KEYS" as "FIELDS".  My misreading was caused partly by my failure to understand why you would put a copy of the KEY field in the index!!  Read below.

I think you could optimize away the 2nd pointer when there is only 1 record per hash table bucket, which should often be the case if you want good performance from a hash table without using the "move to first" improvement below.  A table of bit flags could be used for this, so only 1 pointer + 1 bit per row minimum.  However, this may not be a wise optimization for performance reasons, and it becomes less desirable with the "move to first" algorithm, which can allow you to use much smaller hash tables and get performance similar to very large hash tables.

Unfortunately, and more importantly, I still do not see why you need to repeat the KEY field in a hash index.  It is random-access memory: you can just follow the record pointer in the index to the value.  It only costs one more memory read, and it makes the indexes more compact so they fit in the L2 cache.  In fact, with the "move to first" algorithm below, you only need a 512KB static hash table instead of the dynamic hash table you have now:

http://www.cs.rmit.edu.au/~jz/fulltext/ipl01.pdf
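For readers unfamiliar with the heuristic, here is a minimal sketch of "move to front" in a chained hash table (a toy illustration of the idea from the paper, not the MySQL code; the class and method names are made up):

```python
class MTFHash:
    """Chained hash table using the move-to-front heuristic."""

    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        # New entries go to the head of the chain.
        self._chain(key).insert(0, (key, value))

    def lookup(self, key):
        chain = self._chain(key)
        for i, (k, v) in enumerate(chain):
            if k == key:
                if i:
                    # Hit deeper in the chain: move the entry to the
                    # front so frequently queried keys stay cheap.
                    chain.insert(0, chain.pop(i))
                return v
        return None
```

Because hot entries migrate to the front of each chain, average probe lengths stay low even with a small static bucket array, which is the effect the paper argues lets a compact table compete with a much larger one.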

Since my KEY field is char(127) and the other fields are 20 bytes (and I will be removing some), this lack of memory optimization will cause my 1GB of data not to fit in the RAM of the machine.  Thus I am not able to use HEAP and will have to resort to another database engine.

Couldn't some work be done on optimizing the size of large HEAP tables?  Otherwise we have to resort to hokey things like putting a disk table in a RAM disk, using another database engine such as extremeDB, or coding our own.  Yuk!  A simple optimization to HEAP would make it much more flexible!

Thanks,
Shelby
[12 Jan 2005 23:38] Sergei Golubchik
The rest of the discussion about HEAP table optimizations is at http://bugs.mysql.com/7817
[13 Jan 2005 19:37] Shelby Moore
UPDATE:

My tests now reveal that the HEAP hashing engine is probably (see the next comment) NOT duplicating the KEY data, at least when the KEY is char(4) or greater in length.

Thus the documented memory storage requirements are wrong and should be as follows, at least when the KEY is char(4) or greater in length:

SUM_OVER_ALL_KEYS(sizeof(char*) * 2) +
ALIGN(length_of_row+1, sizeof(char*))

Can someone please re-open this bug report????

Here is an example:

mysql> show table status;
+----------------+--------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+----------------+---------+
| Name           | Type   | Row_format | Rows   | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time          | Create_options | Comment |
+----------------+--------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+----------------+---------+
| chunk9         | HEAP   | Fixed      |  18117 |             28 |      769120 |       613566156 |       197632 |         0 |           NULL | NULL                | NULL                | NULL                |                |         |
+----------------+--------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+----------------+---------+

CREATE TABLE chunk9
(
	D varchar(9) binary NOT NULL,
	N1 smallint unsigned NOT NULL default '0',
	N2 smallint unsigned NOT NULL default '0',
	N3 smallint unsigned NOT NULL default '0',
	N4 smallint unsigned NOT NULL default '0',
	N5 tinyint unsigned NOT NULL default '0',
	N6 tinyint unsigned NOT NULL default '0',
	P1 tinyint unsigned NOT NULL default '0',
	P2 tinyint unsigned NOT NULL default '0',
	P3 tinyint unsigned NOT NULL default '0',
	P4 tinyint unsigned NOT NULL default '0',
	P5 tinyint unsigned NOT NULL default '0',
	P6 tinyint unsigned NOT NULL default '0',
	P7 tinyint unsigned NOT NULL default '0',
	P8 tinyint unsigned NOT NULL default '0'
) TYPE=HEAP;
[13 Jan 2005 19:43] Shelby Moore
Either the previous comment is incorrect, or MySQL also has a bug in calculating the size of a HEAP table when comparing against "max_heap_table_size", at least when the KEY is char(4) or greater in length.

I wrote "probably" in my previous comment because my earlier comment in this report already observed that the sizes reported by "show table status" did not match the size MySQL uses to compare against "max_heap_table_size", at least when the KEY is char(4) or greater in length.

So either MySQL is using an extra copy of the KEY data for each record, in which case the documentation is correct, or MySQL is incorrectly calculating the size when it is not using an extra copy of the KEY data for each record.

So either we have a documentation error or we have an error in the code.

Can someone please look into this? 

The answer to this affects another bug report:

http://bugs.mysql.com/bug.php?id=7817
[13 Jan 2005 21:57] Shelby Moore
One correction:

"So either we have a documentation error or we have an error in the code."

Should be:

"So either we have a documentation error and an error in the code, or we have an error that show table status does not show the correct sizes for HEAP table."
[13 Jan 2005 22:09] Sergei Golubchik
Ok, this is what I can tell about the subject.
(I won't try to avoid being technical - looks like you prefer it that way :)

First. The HEAP code internally does not support a max_heap_table_size limitation; instead, it can limit on max_rows. So at create time it calculates max_rows = max_heap_table_size / memory_per_row, and then uses this limit.

Indeed, memory_per_row is calculated as described in the manual, so from the max_heap_table_size point of view, the formula in the manual is correct.

Second. I believe the formula is incorrect. A HEAP table does not keep a copy of the key value per key in a hash index (but it does in a btree index). Since you are only interested in hash indexes, one key takes two pointers only (see the P.S. for why). Thus the formula should be (ignoring b-tree keys for a moment)

  number_of_hash_keys * sizeof(char*) * 2 +
  ALIGN(length_of_row+1, sizeof(char*))

(This should be fixed both in the manual and in the code.) Because the formula currently used gives too high a memory estimate, you can put less data in the HEAP table than max_heap_table_size specifies.
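
Under the corrected formula, the max_rows derivation described above works out as follows (a sketch; the pointer size and the table parameters are assumptions for illustration):

```python
PTR = 4  # assumed sizeof(char*) on a 32-bit build


def align(n, k):
    # Round n up to the next multiple of k.
    return (n + k - 1) // k * k


number_of_hash_keys = 1
length_of_row = 24          # hypothetical fixed row length

# Corrected per-row cost: two pointers per hash key, no key-value copy.
memory_per_row = number_of_hash_keys * PTR * 2 + align(length_of_row + 1, PTR)

# The internal row limit derived from it at CREATE time.
max_heap_table_size = 16777216
max_rows = max_heap_table_size // memory_per_row

print(memory_per_row, max_rows)  # -> 36 466033
```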

Third. SHOW TABLE STATUS shows the real allocated memory, and it will show that a table that is "full" does not take max_heap_table_size.
Also it shows that Index_length/Rows = 8.

Conclusion, what should be fixed: correct the formula in the manual. Correct it in the code (or limit real allocated memory directly with max_heap_table_size; the current approach won't work anyway once HEAP supports variable-size records).

P.S. mysys/heap.c is not the hash that is used in HEAP tables; it is used by MySQL internally only. The real HEAP storage engine is in the heap/ directory. The logic of the hash code is the same (that's why I pointed to mysys/), but memory allocation is slightly more complex, not a simple dynamic_array. One key entry is a HASH_INFO structure. It only contains a pointer to a record and a pointer to the next key (in the absence of an array, key entries are all linked in a list).
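
The per-key entry described above can be mirrored in a toy sketch (a Python stand-in for the C struct; the field names follow the description in this comment, not the actual source):

```python
class HashInfo:
    """One hash-index key entry: a record pointer plus a pointer to
    the next key entry in the same bucket's linked list."""

    __slots__ = ("ptr_to_rec", "next_key")  # exactly two "pointers"

    def __init__(self, ptr_to_rec, next_key=None):
        self.ptr_to_rec = ptr_to_rec
        self.next_key = next_key


# On a 32-bit build each entry is two pointers, i.e. 8 bytes per key,
# which matches the Index_length/Rows figure mentioned above.
```
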
[14 Jan 2005 18:53] Sergei Golubchik
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

The overestimation in the code is fixed. Now it simply limits by max_heap_table_size (at the moment of table creation, just as it worked before). The manual has been corrected.

Fixed in 4.1.10
[14 Jan 2005 20:57] Shelby Moore
Thanks for the quick fix.

As well, I edited my comment in the manual to remove the mention of duplicate KEY field data, since we confirmed there is no duplication of KEY field data for HASH indexes.

I think you meant "mysys/hash.c" when you wrote "mysys/heap.c" in previous comment.  You are referring to the complementary bug report:

http://bugs.mysql.com/7817

I have not yet had time to see what changes would be needed to implement "MTF" in "heap/hash.c", since I did my source code edits in "mysys/hash.c".  Anyway, that discussion will continue in the bug report linked above.