Bug #90265 Inconsistency in size truncating when converting to a smaller string data_type
Submitted: 30 Mar 2018 19:05 Modified: 6 Apr 2018 11:01
Reporter: Carlos Tutte
Status: Verified
Category: MySQL Server: DML Severity: S3 (Non-critical)
Version: 5.7.21, 5.6.39 OS: Any
Assigned to: CPU Architecture: Any

[30 Mar 2018 19:05] Carlos Tutte
Description:
When a column is converted to a smaller string data type, values longer than the new maximum size are truncated inconsistently (with some character set and collation combinations the truncation is correct).
When data <= max-size(data_type), there is no size modification.
When data > max-size(data_type), data is truncated, but the size kept is equal to size mod (max-size(data_type) + 1) instead of max-size(data_type).

Example converting TEXT -> TINYTEXT (max size 255):
When data <= 255, there is no size modification.
When data > 255, data is truncated, but the size kept is size mod 256 (for example, a length of 300 becomes 44).
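
The arithmetic described above can be checked directly in SQL. This is only an illustrative sketch: the sample lengths (200, 300, 537) and the column aliases are assumptions chosen for the example, and the `reported_length` column simply applies the "size mod 256" rule as described in this report, not any confirmed server internals.

```sql
-- Compare the expected truncation (cap at 255) with the rule described in
-- this report (length wraps modulo 256 once it exceeds 255).
SELECT n                            AS original_length,
       LEAST(n, 255)                AS expected_length,
       IF(n <= 255, n, MOD(n, 256)) AS reported_length
FROM (SELECT 200 AS n UNION ALL SELECT 300 UNION ALL SELECT 537) AS lengths;
-- 200 -> expected 200, reported 200
-- 300 -> expected 255, reported 44
-- 537 -> expected 255, reported 25
```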

How to repeat:
------- Non-working example -------
FROM TEXT CHARSET=utf8, NO COLLATION, length=300
TO TINYTEXT CHARSET=utf8
RESULT length=44
EXPECTED length=255

use test;
DROP TABLE IF EXISTS table1;
CREATE TABLE `table1` (
  id tinyint PRIMARY KEY,
  `summary` text
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;

insert into table1 values (1, 'these are the first chars 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111');
insert into table1 values (2, 'these are the first chars 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111');
insert into table1 values (3, 'these are the first chars
insert into table1 values (4, 'these are the first chars 111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111');

select id, summary, length(summary), mod(length(summary), 256) from table1 \G
ALTER TABLE `table1` CHANGE `summary` `summary` TINYTEXT CHARACTER SET utf8;
SHOW WARNINGS;
select id, summary, length(summary), mod(length(summary), 256) from table1 \G

--------- Working example ------------
FROM TEXT CHARSET=utf8, NO COLLATION, length=537
TO TINYTEXT CHARSET=utf8 COLLATE=utf8_unicode_ci
RESULT length=255
EXPECTED length=255

use test;
DROP TABLE IF EXISTS table1;
CREATE TABLE `table1` (
  id tinyint PRIMARY KEY,
  `summary` text
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;

insert into table1 values (1, 'these are the first chars 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111');
insert into table1 values (2, 'these are the first chars 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111');
insert into table1 values (3, 'these are the first chars 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111');
insert into table1 values (4, 'these are the first chars

select summary, length(summary), mod(length(summary), 256) from table1\G
ALTER TABLE `table1` CHANGE `summary` `summary` TINYTEXT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
SHOW WARNINGS;
select summary, length(summary), mod(length(summary), 256) from table1\G
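
A more compact way to build equivalent test data is to generate the long values with REPEAT() instead of pasting literals. The following is a sketch only: the table name `table1_compact` is hypothetical, the lengths 300 and 537 are taken from the example headers above, and 200 is an arbitrary value below the 255 limit.

```sql
-- Hypothetical compact variant of the test case (not the reporter's script).
USE test;
DROP TABLE IF EXISTS table1_compact;
CREATE TABLE table1_compact (
  id TINYINT PRIMARY KEY,
  summary TEXT
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- The prefix is 26 bytes long, so REPEAT('1', n - 26) gives a value of n bytes.
INSERT INTO table1_compact VALUES
  (1, CONCAT('these are the first chars ', REPEAT('1', 200 - 26))),
  (2, CONCAT('these are the first chars ', REPEAT('1', 300 - 26))),
  (3, CONCAT('these are the first chars ', REPEAT('1', 537 - 26)));

-- Lengths before the conversion.
SELECT id, LENGTH(summary), MOD(LENGTH(summary), 256) FROM table1_compact;

-- Same conversion as in the non-working example above.
ALTER TABLE table1_compact CHANGE summary summary TINYTEXT CHARACTER SET utf8;
SHOW WARNINGS;

-- Lengths after the conversion.
SELECT id, LENGTH(summary) FROM table1_compact;
```

To try the working variant, add `COLLATE utf8_unicode_ci` to the ALTER TABLE statement, as in the second example above.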

Suggested fix:
Make the truncation behavior consistent: the resulting length should be the maximum possible for the destination data type, regardless of collation.
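
Until the behavior is consistent, one possible way to get the expected result is to truncate explicitly before changing the column type. This is a sketch and not part of the verified test case; it assumes single-byte characters (as in the test data above), so that the character count used by LEFT() equals the byte count reported by LENGTH().

```sql
-- List the rows that would not fit in TINYTEXT (LENGTH() counts bytes).
SELECT id, LENGTH(summary) AS bytes
FROM table1
WHERE LENGTH(summary) > 255;

-- Truncate those rows to 255 characters. With multi-byte data, 255 characters
-- can still exceed 255 bytes, so this is only safe under the assumption above.
UPDATE table1
SET summary = LEFT(summary, 255)
WHERE LENGTH(summary) > 255;

-- Every value now fits, so the conversion has nothing left to truncate.
ALTER TABLE `table1` CHANGE `summary` `summary` TINYTEXT CHARACTER SET utf8;
```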
[6 Apr 2018 11:01] MySQL Verification Team
Hello Carlos Tutte,

Thank you for the report and test case.

Thanks,
Umesh