Bug #15377 Valid multibyte sequences are truncated on INSERT
Submitted: 1 Dec 2005 8:19 Modified: 2 Feb 2006 0:53
Reporter: Alexander Barkov
Status: Closed
Category: MySQL Server Severity: S3 (Non-critical)
Version: 4.1 OS:
Assigned to: Alexander Barkov CPU Architecture: Any
Tags: corruption, myisam

[1 Dec 2005 8:19] Alexander Barkov
Description:
0xA2E6 is a valid EUC-KR double-byte sequence.
However, it is truncated on INSERT, and an empty
string is stored in the column.

How to repeat:
mysql> drop table if exists t1;
mysql> create table t1 (a varchar(10) character set euckr);
mysql> insert into t1 values (0xA2E6);
Query OK, 1 row affected, 1 warning (0.00 sec)
mysql> show warnings;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1265 | Data truncated for column 'a' at row 1 |
+---------+------+----------------------------------------+
1 row in set (0.00 sec)
mysql> select * from t1;
+------+
| a    |
+------+
|      |
+------+
1 row in set (0.01 sec)

Suggested fix:
Allow this character, and the 528 other characters
in the range 0xA2E6..0xFEF7, to be inserted.
[1 Dec 2005 8:28] Alexander Barkov
The same happens with 733 valid gb2312 codes
in the range 0xA2A1..0xD7FE:

mysql> drop table t1;
mysql> create table t1 (a varchar(10) character set gb2312);
mysql> insert into t1 values (0xA2A1);
Query OK, 1 row affected, 1 warning (0.00 sec)
mysql> show warnings;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1265 | Data truncated for column 'a' at row 1 |
+---------+------+----------------------------------------+
1 row in set (0.00 sec)
mysql> select * from t1;
+------+
| a    |
+------+
|      |
+------+
1 row in set (0.00 sec)
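
Both sequences used above are structurally well-formed two-byte EUC codes, which is why truncating them is unexpected. A minimal sketch in Python of that structural check, assuming the standard EUC layout in which the lead byte and the trail byte both fall in 0xA1..0xFE. This is not the server's own validation code, the function name is hypothetical, and the check says nothing about whether a code has a Unicode mapping:

```python
def is_wellformed_euc_pair(seq: bytes) -> bool:
    """Return True if seq is exactly two bytes, each in 0xA1..0xFE.

    Assumption: plain EUC layout as used by EUC-KR and GB2312 (EUC-CN).
    This only checks byte ranges, not whether the code is assigned.
    """
    return len(seq) == 2 and all(0xA1 <= b <= 0xFE for b in seq)


# The two values inserted in the test cases above.
for code in (0xA2E6, 0xA2A1):
    raw = code.to_bytes(2, "big")
    print(f"0x{code:04X}: {is_wellformed_euc_pair(raw)}")
```

A byte pair that passes this check but is still stored as an empty string matches the behaviour reported here.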
[9 Dec 2005 12:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/54
[13 Jan 2006 18:54] Alexander Barkov
Fixed in 4.1.17 and 5.0.19
[2 Feb 2006 0:53] Mike Hillyer
Documented in 4.1.17 and 5.0.19 changelogs:

      <listitem>
        <para>
          Characters in the <literal>gb2312</literal> and <literal>euckr</literal> character sets which did
          not have Unicode mappings were truncated. (Bug #15377)
        </para>
      </listitem>