Bug #27239 Relationship #chars and #bytes unclear for utf8 varchars
Submitted: 17 Mar 2007 14:32 Modified: 16 Jun 2007 23:32
Reporter: Roland Bouman Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:all OS:Any
Assigned to: Paul DuBois CPU Architecture:Any
Tags: storage requirements, utf8, varchar

[17 Mar 2007 14:32] Roland Bouman
Description:
In http://dev.mysql.com/doc/refman/5.1/en/storage-requirements.html, the word 'length' is used sometimes to indicate a number of bytes used to store a particular value, and at other times to indicate the number of characters stored in a string data type.

"The actual length of the column value"

bytes, or number of characters?

"The column's maximum possible length"

bytes, or number of characters?

"For example, a VARCHAR(10) column can hold a string with a maximum length of 10. "

"For a breakdown of the storage used for different categories of utf8 characters, see Section 10.7, “Unicode Support”."

Here, nor in the referred document is explained that at DDL time, MySQL assumes worst case, and things each varchar utf8 character should be able to require 3 bytes. The consequence is that the maximum utf8 varchar is only 21844 characters.

"Note: The effective maximum length for a VARCHAR or VARBINARY column is 65,532."

this should be:

"Note: The effective maximum number of bytes that can be stored in a VARCHAR or VARBINARY column is 65,532."

"For example, a VARCHAR(10) column can hold a string of 10 characters or less." 

"Two bytes rather than one are required for the prefix because the length of the column is greater than 255 characters."

Prefix? what is that, the prefix? this is not mentioned before. Maybe what is meant is the bytes that are used to store the number of characters in the varchar. If so, specify.

How to repeat:
NA

Suggested fix:
NA
[17 Mar 2007 14:36] Roland Bouman
I forgot to mention that this

"The maximum size of a row in a MyISAM table is 65,534 bytes. Each BLOB and TEXT column accounts for only five to nine bytes toward this size."

seems to imply this limitation is only in effect for MyISAM tables. however, I'm equally unable to create InnoDB, ARCHIVE, CSV, MEMORY, BLACKHOLE etc tables:

"create table t_char(     t varchar(21845) character set utf8) engine= archive;
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. You have to change some columns to TEXT or BLOBs
"

"create table t_char(     t varchar(21845) character set utf8) engine= innodb;
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. You have to change some columns to TEXT or BLOBs"
[17 Mar 2007 19:58] Valeriy Kravchuk
Thank you for a reasonable documentation request.
[16 Jun 2007 23:32] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.