Bug #11987 | mysql will truncate the text when the text contain GBK char:"0xA3A0" and "0xA1" | ||
---|---|---|---|
Submitted: | 17 Jul 2005 3:56 | Modified: | 3 Aug 2005 20:46 |
Reporter: | haka haka | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | 4.1 .. 5.0 | OS: | Any (*) |
Assigned to: | Alexander Barkov | CPU Architecture: | Any |
Tags: | corruption, myisam |
[17 Jul 2005 3:56]
haka haka
[17 Jul 2005 12:21]
Aleksey Kishkin
Could this be a duplicate of http://bugs.mysql.com/bug.php?id=10903 ?
[18 Jul 2005 10:56]
Aleksey Kishkin
Test case:
DROP TABLE IF EXISTS `test`.`gbtest`;
CREATE TABLE `gbtest` ( `content` longtext ) ENGINE=InnoDB DEFAULT CHARSET=gb2312;
INSERT INTO gbtest VALUES(_gb2312 0xB0B0B0B0A3C1B0C0B0C0);
INSERT INTO gbtest VALUES(_gb2312 0xB0B0B0B0A3A0B0C0B0C0);
As a result, gbtest should contain two rows of the same length. (In fact, the first row is longer.)
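To make the expectation in the test case concrete, here is a minimal Python sketch that looks only at the two hex literals themselves, outside MySQL. The helper name `byte_pairs` is made up for illustration. It shows that both values are 10 bytes long and differ only in the third two-byte group.

```python
# The two values inserted by the test case, as raw bytes.
row1 = bytes.fromhex("B0B0B0B0A3C1B0C0B0C0")
row2 = bytes.fromhex("B0B0B0B0A3A0B0C0B0C0")


def byte_pairs(data):
    """Split a byte string into two-byte groups, shown as upper-case hex."""
    return [data[i:i + 2].hex().upper() for i in range(0, len(data), 2)]


pairs1 = byte_pairs(row1)
pairs2 = byte_pairs(row2)

# Positions (0-based) of the two-byte groups where the values differ.
differing = [i for i, (a, b) in enumerate(zip(pairs1, pairs2)) if a != b]

print(len(row1), len(row2))              # 10 10
print(differing, pairs1[2], pairs2[2])   # [2] A3C1 A3A0
```

Since the inputs have equal byte length, any difference in the stored lengths has to come from how the server handles the A3A0 pair.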
[19 Jul 2005 6:47]
Alexander Barkov
I cannot reproduce this problem with 0xA1A1, but I can with 0xA3A0:
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a longtext) CHARSET=gb2312;
INSERT INTO t1 VALUES(_gb2312 0xB0B0B0B0A1A1B0C0B0C0);
INSERT INTO t1 VALUES(_gb2312 0xB0B0B0B0A3A0B0C0B0C0);
SELECT hex(a) from t1;
This is the result:
hex(a)
B0B0B0B0A1A1B0C0B0C0
B0B0B0B0
So A1A1 does not cut the string; A3A0 does. However, according to this page, A3A0 is an undefined character in GBK: http://www.microsoft.com/globaldev/reference/dbcs/936/936_A1.mspx
Can you please confirm that A3A0 is an undefined character in GBK? If yes, what is the reason for storing undefined characters? If not, can you please provide some URLs showing that this character is defined? Thanks!
[20 Jul 2005 14:19]
haka haka
On this page, http://www.microsoft.com/globaldev/reference/dbcs/936/936_A3.mspx , MS says that A3A0 means nothing. But in the GBK standard, the encoding rule is: 0x81 <= ch1 <= 0xFE; 0x40 <= ch2 <= 0x7E or 0x80 <= ch2 <= 0xFE (0x7F is not used). I have downloaded a GBK character definition file, and 0xA3A0 is defined in it. I can also get the "A1A1" character by using MS PINGYIN2003. You can get the GBK character definition file in the attachment.
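The byte ranges quoted in this comment, read as inclusive bounds, can be sketched as a small structural check in Python. The function names below are hypothetical and this is not MySQL's implementation. It tests only whether a two-byte code falls inside the GBK double-byte ranges, not whether any particular code table assigns a character to it.

```python
def is_gbk_lead(byte):
    """First byte of a GBK double-byte code: 0x81..0xFE."""
    return 0x81 <= byte <= 0xFE


def is_gbk_trail(byte):
    """Second byte: 0x40..0x7E or 0x80..0xFE (0x7F is excluded)."""
    return 0x40 <= byte <= 0x7E or 0x80 <= byte <= 0xFE


def in_gbk_double_byte_range(code):
    """Check a 16-bit code such as 0xA3A0 against both byte ranges."""
    lead, trail = code >> 8, code & 0xFF
    return is_gbk_lead(lead) and is_gbk_trail(trail)


for code in (0xA3A0, 0xA1A1, 0xB0B0, 0xA37F):
    print(hex(code), in_gbk_double_byte_range(code))
```

Under this rule, 0xA3A0 and 0xA1A1 both fall inside the double-byte ranges, while 0xA37F does not because of the excluded 0x7F trail byte.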
[22 Jul 2005 16:05]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/27483
[22 Jul 2005 16:19]
Alexander Barkov
Fixed in 4.1.14 and 5.0.11.
[3 Aug 2005 20:46]
Mike Hillyer
Documented in 5.0.11 and 4.1.14 changelogs: <listitem><para>Character data truncated when GBK characters <literal>0xA3A0</literal> and <literal>0xA1</literal> are present. (Bug #11987)</para></listitem>