Bug #7791 Memory storage requirement error
Submitted: 11 Jan 2005 5:49 Modified: 14 Jan 2005 19:53
Reporter: Shelby Moore
Status: Closed
Category:Server: Docs Severity:S3 (Non-critical)
Version:4.0.17 OS:FreeBSD (FreeBSD)
Assigned to: Sergei Golubchik Target Version:
Triage: D4 (Minor)

[11 Jan 2005 5:49] Shelby Moore
Description:
Based on my tests with real HEAP tables, I believe the manual has an error in calculating
the memory storage requirements.

It should be:

SUM_OVER_ALL_KEYS(max_length_of_key) + sizeof(char*) * 2 + ALIGN(length_of_row+1,
sizeof(char*))

The manual currently has:

SUM_OVER_ALL_KEYS(max_length_of_key + sizeof(char*) * 2) + ALIGN(length_of_row+1,
sizeof(char*))

Note the movement of "sizeof(char*) * 2" outside the "SUM_OVER_ALL_KEYS()" loop.

How to repeat:
As proof, here is what "show table status" reports for a real HEAP table:

+----------------+--------+------------+---------+----------------+-------------+---------
--------+--------------+-----------+----------------+---------------------+---------------
------+---------------------+----------------+---------+| Name           | Type   |
Row_format | Rows    | Avg_row_length | Data_length | Max_data_length | Index_length |
Data_free | Auto_increment | Create_time         | Update_time         | Check_time       
  | Create_options | Comment
|+----------------+--------+------------+---------+----------------+-------------+--------
---------+--------------+-----------+----------------+---------------------+--------------
-------+---------------------+----------------+---------+| chunk_heap     | HEAP   |
Dynamic    |   60550 |            147 |     8978808 |         8653449 |       523088 |    
    0 |           NULL | NULL                | NULL                | NULL                |
               |         |

Suggested fix:
Correct the documentation.
[11 Jan 2005 5:49] Shelby Moore
The affected documentation page is:

http://dev.mysql.com/doc/mysql/en/MEMORY_storage_engine.html
[11 Jan 2005 6:39] Shelby Moore
Apparently I am wrong and the documentation is correct.  Apparently "show table status" is
not reporting the actual storage requirements.

I realized this by noting that the example table I gave was giving "table full" error and
the max_heap_table_size is 16777216.  Thus 277 bytes per record, and thus in agreement
with the documentation.

I have no idea why HEAP tables need to be so inefficient to add 2 pointers (8 bytes) per
field per record instead of per record!
[11 Jan 2005 17:14] Sinisa Milivojevic
Thank you for writting to us.

HEAP tables need a pointer per key, and not per field.

Index entries have to point to records and to the next index entry.
[11 Jan 2005 20:20] Shelby Moore
Agreed I misread "KEYS" to be "FIELDS".  My misread is caused partially by failure to
logically understand why you would put a copy of the KEY field in the index!!  Read
below.

I think you can optimize to get rid of 2nd pointer when there is only 1 record per hash
table bucket, which should often be the case if you want good performance from hash table
without using the "move to first" improvement below.  A table of bit flags could be used
for this, so only 1 pointer + 1 bit per row minimum.  However, this may not be a wise
optimization for performance reasons.  And becomes less desireable with the "move to
first" algorithm which can allow you to use much smaller hash tables and get similar
performance to very large hash tables.

Unfortunately and more importantly, I still do not see why you need to repeat the KEY
field in a hash index.  It is random access memory, you can just follow the record pointer
in index to the value.  It only costs you one more memory read and it makes the indexes
more compact to fit into L2 cache.  In fact, with the "move to first" algorithm below, you
only need 512KB static hash table instead of the dynamic hash table you have now:

http://www.cs.rmit.edu.au/~jz/fulltext/ipl01.pdf

Since my KEY field is char(127) and the other fields are 20 bytes (and will be removing
some), this lack of memory optimization will cause my 1GB data to not fit in the ram of
the machine.  And thus I am not able to use HEAP and will have to resort to another
database engine.

Couldn't some work be done on optimizing the size of large HEAP tables?  Else we have to
resort to hokey thing like trying to put a disk table in a RAMdisk or another database
engine such as extremeDB, or coding our own.  Yuk!  A simple optimization to HEAP would
make it much more flexible!

Thanks,
Shelby
[13 Jan 2005 0:38] Sergei Golubchik
This rest of the discussion about HEAP table optimizations is at
http://bugs.mysql.com/7817
[13 Jan 2005 20:37] Shelby Moore
UPDATE:

My tests now reveal that HEAP hashing engine is probably (see next comment) NOT
duplicating the KEY data, at least when the KEY is char(4) or greater in length.

Thus the documented memory storage requirements are wrong and should be as follows, at
least when the KEY is char(4) or greater in length:

SUM_OVER_ALL_KEYS(sizeof(char*) * 2) +
ALIGN(length_of_row+1, sizeof(char*))

Can someone please re-open this bug report????

Here is an example:

mysql> show table status;
+----------------+--------+------------+--------+----------------+-------------+----------
-------+--------------+-----------+----------------+---------------------+----------------
-----+---------------------+----------------+---------+
| Name           | Type   | Row_format | Rows   | Avg_row_length | Data_length |
Max_data_length | Index_length | Data_free | Auto_increment | Create_time         |
Update_time         | Check_time          | Create_options | Comment |
+----------------+--------+------------+--------+----------------+-------------+----------
-------+--------------+-----------+----------------+---------------------+----------------
-----+---------------------+----------------+---------+
| chunk9         | HEAP   | Fixed      |  18117 |             28 |      769120 |      
613566156 |       197632 |         0 |           NULL | NULL                | NULL        
       | NULL                |                |         |
+----------------+--------+------------+--------+----------------+-------------+----------
-------+--------------+-----------+----------------+---------------------+----------------
-----+---------------------+----------------+---------+

CREATE TABLE chunk9
(
	D varchar(9) binary NOT NULL,
	N1 smallint unsigned NOT NULL default '0',
	N2 smallint unsigned NOT NULL default '0',
	N3 smallint unsigned NOT NULL default '0',
	N4 smallint unsigned NOT NULL default '0',
	N5 tinyint unsigned NOT NULL default '0',
	N6 tinyint unsigned NOT NULL default '0',
	P1 tinyint unsigned NOT NULL default '0',
	P2 tinyint unsigned NOT NULL default '0',
	P3 tinyint unsigned NOT NULL default '0',
	P4 tinyint unsigned NOT NULL default '0',
	P5 tinyint unsigned NOT NULL default '0',
	P6 tinyint unsigned NOT NULL default '0',
	P7 tinyint unsigned NOT NULL default '0',
	P8 tinyint unsigned NOT NULL default '0'
) TYPE=HEAP;
[13 Jan 2005 20:43] Shelby Moore
Either the previous comment is incorrect, or MySQL also has a bug in calculating the size
of a HEAP table when comparing against "max_heap_table_size", at least when the KEY is
char(4) or greater in length.

I wrote "probably" in my previous comment, because it was already observed in my earlier
comment in this report that the reported sizes in "show table status" did not match the
size that MySQL is using to compare against "max_heap_table_size", at least when the KEY
is char(4) or greater in length.

So either MySQL is using an extra copy of the KEY data for each record, in which case the
documentation is correct, or MySQL is incorrectly calculating the size when it is not
using an extra copy of the KEY data for each record.

So either we have a documentation error or we have an error in the code.

Can someone please look into this? 

The answer to this affects another bug report:

http://bugs.mysql.com/bug.php?id=7817
[13 Jan 2005 22:57] Shelby Moore
One correction:

"So either we have a documentation error or we have an error in the code."

Should be:

"So either we have a documentation error and an error in the code, or we have an error
that show table status does not show the correct sizes for HEAP table."
[13 Jan 2005 23:09] Sergei Golubchik
Ok, this is what I can tell about the subject.
(I won't try to avoid being technical - looks like you prefer it that way :)

First. HEAP code internally does not support max_heap_table_size limitation, instead it
can limit on max_rows. So at create time it calculates
max_rows=max_heap_table_size/memory_per_row, and then uses this limit.

Indeed, memory_per_row is calculated as described in the manual, so from the
max_heap_table_size point of view, the formula in the manual is correct.

Second. I believe the formula is incorrect. HEAP table does not have a copy of the key
value per key in hash index (but it does in btree index). As you're only interested in
hash indexes, one key takes two pointers only. (see P.S. to know why). Thus the formula
should be (ignoring b-tree keys for a moment)

  number_of_hash_keys * sizeof(char*) * 2 +
  ALIGN(length_of_row+1, sizeof(char*))

(should be fixed both in the manual and in the code). As the used formula gives too high
memory estimation, you can put less data in the HEAP table than max_heap_table_size
specifies.

Third. SHOW TABLE STATUS shows real allocated memory, and it'll show that the table that
is "full" does not take max_heap_table_size.
Also it shows that Index_length/Rows = 8.

Conclusion, what should be fixed: correct the formula in the manual. Correct it in the
code (or limit real allocated memory directly with max_heap_table_size - current approach
won't work anyway when HEAP will support variable-size records).

P.S. mysys/heap.c is not the hash that is used in HEAP tables, it's used by MySQL
internally only. real HEAP storage engine is in heap/ directory. Logic of the hash code is
the same (that's why I pointed to mysys/) but memory allocation is slightly more complex -
not a simple dynamic_array. One key entry is HASH_INFO structure. It only contains a
pointer to a record and a pointer to the next key (in the absense of an array, key entries
are all linked in a list).
[14 Jan 2005 19:53] Sergei Golubchik
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

overestimation in the code fixed. Now it simply limits by max_heap_table_size (at the the
moment of table creation - just as it worked before). Manual corrected.

Fixed in 4.1.10
[14 Jan 2005 21:57] Shelby Moore
Thanks for the quick fix.

As well, I edited my comment in the manual to remove the mention of duplicate KEY field
data, since we confirmed there is no duplication of KEY fiel data for HASH indexes.

I think you meant "mysys/hash.c" when you wrote "mysys/heap.c" in previous comment.  You
are referring to the complementary bug report:

http://bugs.mysql.com/7817

I have not yet had time to see what changes would be needed to implement "MTF" in the
"heap/hash.c", since I did my source code edits to "mysys/hash.c".  Any way, that
discussion will continue in the bug report linked above.