Bug #20095 Changing length of VARCHAR field with UTF8 collation does not truncate values
Submitted: 26 May 2006 18:27 Modified: 24 Apr 2007 17:04
Reporter: Andrew Hanna Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.23-BK, 5.0.21 OS:Linux (Linux, OSX 10.4.6 x86)
Assigned to: Alexander Barkov CPU Architecture:Any

[26 May 2006 18:27] Andrew Hanna
Description:
I setup an InnoDB table with a primary key field defined as VARCHAR(26) and then another plain field with the same definition.  Both fields have a UTF8 collation.  After inserting sample values, I changed the length of the fields to 20, but when browsing the data, the data was still 26 long!  From that point, if I dump the data and re-import it, the data is now truncated at 20 chars long.  I also tried this on a MyISAM table and I got the same problem.

I tried my best at searching for a dup bug but did not find one, sorry if it is a dupe.

How to repeat:
Recreate table:

SET FOREIGN_KEY_CHECKS=0;

SET AUTOCOMMIT=0;
START TRANSACTION;

DROP TABLE IF EXISTS `testing`;
CREATE TABLE IF NOT EXISTS `testing` (
  `primary_key` varchar(26) NOT NULL,
  `other_field` varchar(26) NOT NULL,
  PRIMARY KEY  (`primary_key`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `testing` (`primary_key`, `other_field`) VALUES ('abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'),
('zyxwvutsrqponmlkjihgfedcba', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ');

SET FOREIGN_KEY_CHECKS=1;

COMMIT;

Now change the length of both fields:

ALTER TABLE `testing` CHANGE `primary_key` `primary_key` VARCHAR( 20 ) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ,
CHANGE `other_field` `other_field` VARCHAR( 20 ) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL;

SELECT * FROM `testing`;

At first I thought it was a problem with just the primary key, but notice that both fields still have the full length of 26 in there.

Suggested fix:
The only workaround I know of is to either dump data and reimport, or do a SUBSTR() UPDATE type of thing on the fields changed.
[26 May 2006 18:27] Andrew Hanna
Typo in OSX version...
[26 May 2006 20:33] Valeriy Kravchuk
Thank you for a bug report. Verified just as described with 5.0.23-BK on Linux:

mysql> CREATE TABLE IF NOT EXISTS `testing` (
    ->   `primary_key` varchar(26) NOT NULL,
    ->   `other_field` varchar(26) NOT NULL,
    ->   PRIMARY KEY  (`primary_key`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.12 sec)

mysql> INSERT INTO `testing` (`primary_key`, `other_field`) VALUES
    -> ('abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'),
    -> ('zyxwvutsrqponmlkjihgfedcba', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ');
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from `testing`;
+----------------------------+----------------------------+
| primary_key                | other_field                |
+----------------------------+----------------------------+
| abcdefghijklmnopqrstuvwxyz | ABCDEFGHIJKLMNOPQRSTUVWXYZ |
| zyxwvutsrqponmlkjihgfedcba | ABCDEFGHIJKLMNOPQRSTUVWXYZ |
+----------------------------+----------------------------+
2 rows in set (0.01 sec)

mysql> ALTER TABLE `testing` CHANGE `primary_key` `primary_key` VARCHAR( 20 ) CHARACTER
    -> SET utf8 COLLATE utf8_general_ci NOT NULL ,
    -> CHANGE `other_field` `other_field` VARCHAR( 20 ) CHARACTER SET utf8 COLLATE
    -> utf8_general_ci NOT NULL;
Query OK, 2 rows affected (0.02 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from `testing`;
+----------------------------+----------------------------+
| primary_key                | other_field                |
+----------------------------+----------------------------+
| abcdefghijklmnopqrstuvwxyz | ABCDEFGHIJKLMNOPQRSTUVWXYZ |
| zyxwvutsrqponmlkjihgfedcba | ABCDEFGHIJKLMNOPQRSTUVWXYZ |
+----------------------------+----------------------------+
2 rows in set (0.00 sec)

mysql> desc `testing`;
+-------------+-------------+------+-----+---------+-------+
| Field       | Type        | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+-------+
| primary_key | varchar(20) | NO   | PRI |         |       |
| other_field | varchar(20) | NO   |     |         |       |
+-------------+-------------+------+-----+---------+-------+
2 rows in set (0.01 sec)

I think, it is a bug.
[6 Apr 2007 6:21] Alexander Barkov
A simplier test demonstrating the same problem:

DROP TABLE IF EXISTS t1;
CREATE TABLE IF NOT EXISTS t1 (
  a varchar(26) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO t1 (a) VALUES
('abcdefghijklmnopqrstuvwxyz'),
('zyxwvutsrqponmlkjihgfedcba');

ALTER TABLE t1 CHANGE a a VARCHAR( 20 ) CHARACTER
SET utf8 COLLATE utf8_general_ci NOT NULL;

SELECT * FROM t1;

Result:

a
abcdefghijklmnopqrstuvwxyz
zyxwvutsrqponmlkjihgfedcba
[6 Apr 2007 6:23] Alexander Barkov
The same problem happens with CHAR() column type.
[6 Apr 2007 6:29] Alexander Barkov
Additional info to the previous comment: convert from CHAR to CHAR

DROP TABLE IF EXISTS t1;
CREATE TABLE IF NOT EXISTS t1 (
  a char(26) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO t1 (a) VALUES
('abcdefghijklmnopqrstuvwxyz'),
('zyxwvutsrqponmlkjihgfedcba');

ALTER TABLE t1 CHANGE a a CHAR( 20 ) CHARACTER
SET utf8 COLLATE utf8_general_ci NOT NULL;

SELECT * FROM t1;

abcdefghijklmnopqrst
zyxwvutsrqponmlkjihg
[6 Apr 2007 6:30] Alexander Barkov
The same problem with convert CHAR to VARCHAR:

DROP TABLE IF EXISTS t1;
CREATE TABLE IF NOT EXISTS t1 (
  a char(26) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO t1 (a) VALUES
('abcdefghijklmnopqrstuvwxyz'),
('zyxwvutsrqponmlkjihgfedcba');

ALTER TABLE t1 CHANGE a a VARCHAR( 20 ) CHARACTER
SET utf8 COLLATE utf8_general_ci NOT NULL;

SELECT * FROM t1;

abcdefghijklmnopqrst
zyxwvutsrqponmlkjihg
[6 Apr 2007 6:31] Alexander Barkov
The same problem when convert from VARCHAR to CHAR:

DROP TABLE IF EXISTS t1;
CREATE TABLE IF NOT EXISTS t1 (
  a VARCHAR(26) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO t1 (a) VALUES
('abcdefghijklmnopqrstuvwxyz'),
('zyxwvutsrqponmlkjihgfedcba');

ALTER TABLE t1 CHANGE a a CHAR( 20 ) CHARACTER
SET utf8 COLLATE utf8_general_ci NOT NULL;

SELECT * FROM t1;

a
abcdefghijklmnopqrst
zyxwvutsrqponmlkjihg
[6 Apr 2007 7:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/23968
[9 Apr 2007 7:50] Sergei Glukhov
Please add to do_varstring2_mb the following check:

+  if (length < from_length)
+  {
+    if (current_thd->count_cuted_fields)
+      copy->to_field->set_warning(MYSQL_ERROR::WARN_LEVEL_WARN,
+                                  WARN_DATA_TRUNCATED, 1);
+  }

and test case for this code,
ok to push with changes above.
[10 Apr 2007 11:24] Pekka Nousiainen
approved (with the warning added)
[13 Apr 2007 5:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/24445
[13 Apr 2007 6:21] Alexander Barkov
Pushed into 5.0.40-rpl and 5.1.18-rpl
[20 Apr 2007 17:16] Bugs System
Pushed into 5.0.42
[20 Apr 2007 17:16] Bugs System
Pushed into 5.1.18-beta
[24 Apr 2007 17:04] Paul DuBois
Noted in 5.0.42, 5.1.18 changelogs.

Changing a utf8 column in an InnoDB table to a shorter length did not
shorten the data values.