Bug #30619 Length for CHAR and VARCHAR supposed to be a number of chars, not bytes
Submitted: 24 Aug 2007 9:20 Modified: 24 Aug 2007 10:19
Reporter: Roland Bouman Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Data Types Severity:S2 (Serious)
Version:5.0.45 OS:Linux (kubuntu feistt)
Assigned to: CPU Architecture:Any

[24 Aug 2007 9:20] Roland Bouman
Description:
For Char(x) and Varchar(x), x is supposed to be the number of characters that can be stored in the columns of that type. In reality, it will not allow storage of x multi-byte characters

How to repeat:
create table strings (c char(255) character set utf8, vc varchar(255) character set utf8);

insert into strings (c,vc) values (repeat('c',255),repeat('全',255));
ERROR 1406 (22001): Data too long for column 'vc' at row 1

insert into strings (c,vc) values (repeat('全',255),repeat('v',255));
ERROR 1406 (22001): Data too long for column 'c' at row 1

mysql> select version();
+------------+
| version()  |
+------------+
| 5.0.45-log |
+------------+

(the test gives me the same result for InnoDB as well as for MyISAM)

Suggested fix:
Please store number of characters, not number of bytes.
[24 Aug 2007 10:04] Sveta Smirnova
Thank you for the report.

Please provide output of SHOW VARIABLES LIKE '%char%';
[24 Aug 2007 10:09] Roland Bouman
mysql> show variables like '%char%';
+--------------------------+-----------------------------------------------------------------------------+
| Variable_name            | Value                                                                       |
+--------------------------+-----------------------------------------------------------------------------+
| character_set_client     | latin1                                                                      |
| character_set_connection | latin1                                                                      |
| character_set_database   | latin1                                                                      |
| character_set_filesystem | binary                                                                      |
| character_set_results    | latin1                                                                      |
| character_set_server     | latin1                                                                      |
| character_set_system     | utf8                                                                        |
| character_sets_dir       | /opt/mysql/5.0/45/mysql-5.0.45-linux-i686-icc-glibc23/share/mysql/charsets/ |
+--------------------------+-----------------------------------------------------------------------------+
8 rows in set (0.00 sec)
[24 Aug 2007 10:19] Sveta Smirnova
Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://dev.mysql.com/doc/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

Provided output shows values of character_set_client, character_set_connection and character_set_results are latin1, but character_set_system is utf8. So truncation of values is expected. You have to use SET NAMES before running such queries.