Bug #17918 TINYTEXT (and other TEXTs) are byte-limited rather than char-limited
Submitted: 4 Mar 2006 16:17
Modified: 10 Nov 2007 0:07
Reporter: Domas Mituzas
Status: Closed
Category: MySQL Server: Data Types
Severity: S3 (Non-critical)
Version: 4.1-bk, 5.0-bk, 5.1-bk
OS: Linux (Linux, MacOSX)
Assigned to: Paul DuBois
CPU Architecture: Any

[4 Mar 2006 16:17] Domas Mituzas
Description:
According to the documentation, a TINYTEXT column holds up to 255 characters, not bytes.
In practice, however, the limits of TEXT columns are enforced in bytes rather than characters:

How to repeat:
mysql> create table tttest (a tinytext charset utf8);
Query OK, 0 rows affected (0.01 sec)

mysql> set names utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into tttest values (repeat('\304\205',254));
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1265 | Data truncated for column 'a' at row 1 |
+---------+------+----------------------------------------+
1 row in set (0.00 sec)

mysql> select char_length(a),length(a) from tttest;
+----------------+-----------+
| char_length(a) | length(a) |
+----------------+-----------+
|            127 |       254 |
+----------------+-----------+
1 row in set (0.00 sec)

Suggested fix:
Either make the documentation consistent with the actual behavior, or change the limit so that it is counted in characters rather than bytes.
[4 Mar 2006 16:18] Domas Mituzas
Verified at ChangeSet@1.2214
[4 Mar 2006 16:24] Paul DuBois
The insert statement as shown does not specify a literal
single utf8 character.  It can be written like this instead, which
also demonstrates the problem (assuming that set names utf8
has been executed):

insert into tttest values (repeat(char(196,172),254));
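
As a quick sanity check outside MySQL, the following Python sketch decodes the two byte sequences discussed in this thread. It only assumes that the bytes are interpreted as UTF-8; the variable names are illustrative and not taken from the report.

```python
# Bytes 196, 172 (0xC4 0xAC), as passed to CHAR() in the statement above.
raw = bytes([196, 172])
ch = raw.decode("utf-8")
print(len(ch), len(raw))  # one character, two bytes

# The octal escapes \304\205 from the original report are bytes 0xC4 0x85.
orig_raw = bytes([0o304, 0o205])
orig = orig_raw.decode("utf-8")
print(len(orig), len(orig_raw))  # again one character, two bytes
```

Both sequences decode to a single two-byte character, so repeating either one 254 times yields 254 characters but 508 bytes.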
[6 Apr 2006 7:00] KimSeong Loh
So, is this considered a software bug or documentation error?
[10 Nov 2007 0:07] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

The documentation was in error. The length prefix stored for variable-length string types stores the length in *bytes* (not characters), even for character types (VARCHAR and the TEXT types).
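
The arithmetic behind the reported result can be sketched in Python. This is only a model, not MySQL's implementation: it assumes the TEXT types use length prefixes of 1, 2, 3, and 4 bytes, and that truncation keeps whole characters (which is consistent with the 254 bytes, rather than 255, shown in the report). The names `PREFIX_BYTES`, `max_bytes`, and `truncate_to_bytes` are hypothetical.

```python
# Assumed length-prefix sizes, in bytes, for each TEXT type.
PREFIX_BYTES = {"TINYTEXT": 1, "TEXT": 2, "MEDIUMTEXT": 3, "LONGTEXT": 4}


def max_bytes(type_name):
    """Largest byte length that fits in the type's length prefix."""
    return 2 ** (8 * PREFIX_BYTES[type_name]) - 1


def truncate_to_bytes(text, limit, encoding="utf-8"):
    """Keep whole characters while the encoded size stays within limit."""
    kept = []
    used = 0
    for ch in text:
        size = len(ch.encode(encoding))
        if used + size > limit:
            break
        kept.append(ch)
        used += size
    return "".join(kept)


# 254 two-byte characters: 508 bytes, well over the TINYTEXT limit.
value = "\u0105" * 254
stored = truncate_to_bytes(value, max_bytes("TINYTEXT"))
print(max_bytes("TINYTEXT"))                      # 255
print(len(stored), len(stored.encode("utf-8")))   # 127 254
```

Under these assumptions the model reproduces the char_length(a) = 127 and length(a) = 254 values from the original report: a 255-byte limit leaves room for 127 two-byte characters, and the last byte stays unused.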

The documentation has been corrected. The primary affected sections are:

http://dev.mysql.com/doc/refman/5.0/en/string-type-overview.html
http://dev.mysql.com/doc/refman/5.0/en/storage-requirements.html