Bug #29042 Need documentation page on how InnoDB handles blobs
Submitted: 12 Jun 2007 4:53 Modified: 30 Apr 2009 15:46
Reporter: Alexander Rubin Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S4 (Feature request)
Version:any OS:Any
Assigned to: Paul DuBois CPU Architecture:Any
Tags: blobs, documentation, innodb

[12 Jun 2007 4:53] Alexander Rubin
Description:
Original test case:

CREATE table ib_16k (
       i int primary key not null auto_increment,
       c1 varchar(2000) , c2 varchar(2000), c3 varchar(2000), c4
 varchar(2000),
       c5 varchar(2000), c6 varchar(2000), c7 varchar(2000), c8
 varchar(2000) )
 ENGINE=innodb;

DO @str2k := repeat ('2000chars_',200) ; 

INSERT INTO ib_16k VALUES (NULL, @str2k, @str2k, @str2k, @str2k, @str2k, @str2k, @str2k, @str2k);

-- this will insert 8 2000bytes chars in the table

SHOW TABLE STATUS like 'ib_16k_1row' \G

           Name: ib_16k_1row
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 1
 Avg_row_length: 131072
    Data_length: 131072
Max_data_length: 0
   Index_length: 0
      Data_free: 0
 Auto_increment: 2
    Create_time: 2007-06-06 19:56:37
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 
We have added one 16K row in the table, but get 128K data.

However, if we will add only 8000 bytes with the same large varchars or texts or blobs, it will get only one 16K data.

According to the documentation, InnoDB put first 768 chars of each text/blob in the page and will allocate some space outside of the page. 

The following things are missing in the doc:
* if the total size of the row is less than 8100 bytes InnoDB will not allocate additional space for blobs, even if each blob is larger than 768.
* InnoDB will allocate additional space outside of the page for _each_ blob (if we have 8 blobs 8xN bytes will be allocated).
* How much space InnoDB allocates for each blob.

How to repeat:
http://dev.mysql.com/doc/refman/5.0/en/innodb.html page does not contain complete  information about how InnoDB handles blobs/texts fields

Suggested fix:
Create doc page "How InnoDB handles blobs/text"
[13 Jun 2007 9:58] Valeriy Kravchuk
Thank you for a reasonable documentation request.
[28 Jul 2008 14:12] MC Brown
Assigning to Paul
[15 Apr 2009 16:43] Paul DuBois
I ran some additional tests, checking the Data_length value for the
table for various amounts of data inserted into the 8 table string
columns.  I did not try to find the exact cutoff when the Data_length
values changed, but the results show a pattern:

Bytes inserted    Data_length
8 x 1000 = 8000   16K
8 x 1010 = 8080   32k
8 x 1050 = 8400   48K
8 x 1100 = 8800   64K
8 x 1150 = 9200   80K
8 x 1250 = 10000  96K
8 x 1400 = 11200  112K
8 x 1700 = 13600  128K
[24 Apr 2009 19:18] Paul DuBois
re:

"The following things are missing in the doc:
* if the total size of the row is less than 8100 bytes InnoDB will not allocate additional
space for blobs, even if each blob is larger than 768."

Heikki says:

"The old InnoDB file format always stored 768 first bytes of every column locally and was only able to store non-locally the remaining part."

(Here, "old" is in comparison with the Barracuda format available with the InnoDB plugin.)

re:

"* InnoDB will allocate additional space outside of the page for _each_ blob (if we have 8
blobs 8xN bytes will be allocated)."

For now I have updated the description in the manual as follows:

The maximum row length, except for VARBINARY, VARCHAR, BLOB and TEXT
columns, is slightly less than half of a database page. That is, the
maximum row length is about 8000 bytes. LONGBLOB and LONGTEXT columns
must be less than 4GB, and the total row length, including BLOB and
TEXT columns, must be less than 4GB. InnoDB stores the first 768 
bytes of a VARBINARY, VARCHAR, BLOB, or TEXT column in the row, and
the rest into separate overflow pages. Each such column has its own
list of overflow pages. The 768-byte prefix is accompanied by a 
20-byte value that stores the true length of the column and points
into the overflow list where the rest of the value is stored.
[30 Apr 2009 15:46] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

http://dev.mysql.com/doc/refman/5.1/en/innodb-file-space.html
describes page allocation to table segments. I've added this to the
section as information about BLOB and other variable-length column
storage allocation:

"The maximum row length, except for variable-length columns
(VARBINARY, VARCHAR, BLOB and TEXT), is slightly less than half of a
database page. That is, the maximum row length is about 8000 bytes.
LONGBLOB and LONGTEXT columns must be less than 4GB, and the total
row length, including BLOB and TEXT columns, must be less than 4GB.

If a row is less than half a page long, all of it is stored locally
within the page. If it exceeds half a page, variable-length columns
are chosen for external off-page storage until the row fits within
half a page. For a column chosen for off-page storage, InnoDB stores
the first 768 bytes locally in the row, and the rest externally into
overflow pages. Each such column has its own list of overflow pages.
The 768-byte prefix is accompanied by a 20-byte value that stores the
true length of the column and points into the overflow list where the
rest of the value is stored."

Pages are allocated to a segment as described elsewhere within the
section (page at a time up to 32 pages, extent (64 pages) at a time
after that).