Bug #36418 | Character sets: crash if char(256 using utf32) | ||
---|---|---|---|
Submitted: | 30 Apr 2008 0:26 | Modified: | 28 Jul 2008 20:50 |
Reporter: | Peter Gulutzan | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S3 (Non-critical) |
Version: | 6.0.6-alpha-debug | OS: | Linux (SUSE 10, 32-bit) |
Assigned to: | Alexander Barkov | CPU Architecture: | Any |
Tags: | regression |
[30 Apr 2008 0:26]
Peter Gulutzan
[30 Apr 2008 6:18]
Valeriy Kravchuk
This is a regression bug. There is no crash with 6.0.4:

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -proot test -P3311
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 6.0.4-alpha-community MySQL Community Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create table t (s1 varchar(1) character set utf32, s2 text character set utf32)
    -> engine=falcon;
Query OK, 0 rows affected (0.59 sec)

mysql> create index i on t (s1);
Query OK, 0 rows affected (0.11 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> insert into t values (char(256 using utf32), char(256 using utf32) );
ERROR 1300 (HY000): Invalid utf32 character string: '010000'
[30 Apr 2008 11:17]
MySQL Verification Team
Thank you for the bug report.

Server version: 6.0.6-alpha-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create table t (s1 varchar(1) character set utf32, s2 text character set utf32)
    -> engine=falcon;
Query OK, 0 rows affected (0.12 sec)

mysql> create index i on t (s1);
Query OK, 0 rows affected (0.20 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> insert into t values (char(256 using utf32), char(256 using utf32) );
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql>
[19 May 2008 11:00]
Alexander Barkov
Peter, what is expected result for CHAR(256) ?

http://dev.mysql.com/doc/refman/6.0/en/string-functions.html#function_char says that CHAR(256) is equivalent to CHAR(1,0), which is 0x0100 and is too short to be a UTF-32 sequence.

Should a warning / error be generated? Or should it try to zero-pad the numbers automatically, if they use fewer than "mbminlen" bytes for the output character set?

I'd prefer leading zero padding. I.e. (assuming utf32):

- Every number is auto-extended to 4 bytes:
  CHAR(0x010203 using utf32) -> 0x00010203
  CHAR(0x0203 using utf32) -> 0x00000203
  CHAR(0x03 using utf32) -> 0x00000003
- Every integer argument is padded separately; different arguments do not interfere with each other with respect to padding:
  CHAR(0x01, 0x02 using utf32) -> 0x0000000100000002
  not 0x00000102
  i.e. "pad all numbers, then concat" vs. "concat all numbers, then pad".
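The two padding strategies contrasted above can be sketched in a few lines of Python. This is only an illustration of the proposal, not the server code: the helper names (`int_to_bytes`, `left_pad`, `pad_then_concat`, `concat_then_pad`) are hypothetical, and the sketch assumes each integer argument is first turned into its shortest big-endian byte string, as the manual describes for CHAR().

```python
UTF32_MBMINLEN = 4  # minimum bytes per character for utf32 (assumed constant)


def int_to_bytes(n: int) -> bytes:
    """Shortest big-endian byte string for a non-negative integer (0 -> one zero byte)."""
    length = max(1, (n.bit_length() + 7) // 8)
    return n.to_bytes(length, "big")


def left_pad(raw: bytes, width: int) -> bytes:
    """Prepend zero bytes so that len(raw) becomes a multiple of width."""
    remainder = len(raw) % width
    if remainder == 0:
        return raw
    return b"\x00" * (width - remainder) + raw


def pad_then_concat(args, width):
    """Proposed rule: pad every argument separately, then concatenate."""
    return b"".join(left_pad(int_to_bytes(n), width) for n in args)


def concat_then_pad(args, width):
    """Rejected alternative: concatenate the raw bytes, then pad once."""
    return left_pad(b"".join(int_to_bytes(n) for n in args), width)


print(pad_then_concat([0x010203], UTF32_MBMINLEN).hex())    # 00010203
print(pad_then_concat([0x01, 0x02], UTF32_MBMINLEN).hex())  # 0000000100000002
print(concat_then_pad([0x01, 0x02], UTF32_MBMINLEN).hex())  # 00000102
```

With a width of 2 instead of 4, `pad_then_concat([0x01, 0x02], 2)` gives `00010002`, which is the two-byte analogue of the same rule.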
[22 May 2008 22:44]
Peter Gulutzan
Bar asked:
> Peter, what is expected result for CHAR(256) ?

For "CHAR(256 USING UTF32)" I expect rules like for "CHAR(256 USING UCS2)". I think that means that I think you are right. I believe that the rules you propose for UTF32 are like the rules that we have now for UCS2. That is:
* leading zero padding
* every number is auto-extended to 2 bytes
* select hex( CHAR(0x01, 0x02 using ucs2)) yields 00010002, so apparently integer arguments are padded separately

It's true, the manual says that CHAR(1,0) is the same as CHAR(256). But I don't care, the manual doesn't say that CHAR(1,0 USING UTF32) is the same as CHAR(256 USING UTF32). This doesn't, in my opinion, change documented behaviour. And it's already the case that
select hex( CHAR(1, 0 using ucs2)) yields 00010000
while
select hex( CHAR(256 using ucs2)) yields 0100
[26 May 2008 13:04]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/47055

ChangeSet@1.2643, 2008-05-26 17:59:08+05:00, bar@mysql.com +6 -0
Bug#36418 Character sets: crash if char(256 using utf32)

Problem: CHAR(256 USING utf32) could generate a result with incorrect length, which resulted into server crash.
Fix: CHAR() now generates results with correct lengths, taking into account "mbminlen" of the character set.
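To see why the unpadded result counts as having an "incorrect length", here is a small Python sketch. It is not the server's validation logic: `is_well_formed_utf32` is a hypothetical helper, and it leans on Python's own `utf-32-be` codec, whose rules may differ in detail from MySQL's utf32 character set.

```python
UTF32_MBMINLEN = 4  # assumed minimum bytes per utf32 character


def is_well_formed_utf32(raw: bytes) -> bool:
    """True if raw has a length that is a multiple of 4 and decodes as big-endian UTF-32."""
    if len(raw) % UTF32_MBMINLEN != 0:
        return False
    try:
        raw.decode("utf-32-be")
    except UnicodeDecodeError:
        return False
    return True


short = bytes.fromhex("0100")                    # CHAR(256) without padding
padded = short.rjust(UTF32_MBMINLEN, b"\x00")    # leading zero padding to 4 bytes

print(is_well_formed_utf32(short))               # False
print(is_well_formed_utf32(padded))              # True
print(hex(ord(padded.decode("utf-32-be"))))      # 0x100
```

The two-byte value 0x0100 cannot hold even one utf32 character, while the padded value 0x00000100 is exactly one character, U+0100.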
[3 Jul 2008 9:23]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/48948

2672 Alexander Barkov 2008-07-03
Bug#36418 Character sets: crash if char(256 using utf32)

Problem: CHAR(256 USING utf32) could generate a result with incorrect length, which resulted into server crash.
Fix: CHAR() now generates results with correct lengths, taking into account "mbminlen" of the character set.

mysql-test/r/ctype_ucs.result:
mysql-test/r/ctype_utf32.result:
mysql-test/t/ctype_ucs.test:
mysql-test/t/ctype_utf32.test:
  Adding tests
sql/item_strfunc.cc
  Fixing to append all multi-byte characters as a single buffer, instead of appending one-by-one. This is important for "real" multi-byte character sets like UCS2 and UTF32.
sql/sql_string.cc
  Handling correctly a case when a UCS2 or UTF32 string is appended with a binary string: zero-pad the binary argument before concatenation, to make it have correct length (e.g. 0x01 -> 0x00000001 in case of UTF32).
[16 Jul 2008 10:45]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/49816

2719 Alexander Barkov 2008-07-16
Bug#36418 Character sets: crash if char(256 using utf32)

Problem: CHAR(256 USING utf32) could generate a result with incorrect length, which resulted into server crash.
Fix: CHAR() now generates results with correct lengths, taking into account "mbminlen" of the character set.

mysql-test/r/ctype_ucs.result:
mysql-test/r/ctype_utf32.result:
mysql-test/t/ctype_ucs.test:
mysql-test/t/ctype_utf32.test:
  Adding tests
sql/item_strfunc.cc
  Fixing to append all multi-byte characters as a single buffer, instead of appending one-by-one. This is important for "real" multi-byte character sets like UCS2 and UTF32.
sql/sql_string.cc
  Handling correctly a case when a UCS2 or UTF32 string is appended with a binary string: zero-pad the binary argument before concatenation, to make it have correct length (e.g. 0x01 -> 0x00000001 in case of UTF32).
[17 Jul 2008 7:19]
Alexander Barkov
Pushed into mysql-6.0.6-bugteam.
[18 Jul 2008 9:37]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/50013 2725 Georgi Kodinov 2008-07-18 Bug#36418 addendum: fixed a C++ specific construct in a C file.
[18 Jul 2008 9:38]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/50014 2725 Georgi Kodinov 2008-07-18 Bug#36418 addendum: fixed a C++ specific construct in a C file.
[18 Jul 2008 11:40]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/50026 2726 Sven Sandberg 2008-07-18 [merge] automerge
[28 Jul 2008 14:45]
Bugs System
Pushed into 6.0.7-alpha (revid:alik@mysql.com-20080725172155-fnc73o50e4tgl23k) (version source revid:alik@mysql.com-20080725172155-fnc73o50e4tgl23k) (pib:3)
[28 Jul 2008 20:50]
Paul DuBois
Noted in 6.0.7 changelog. CHAR(256 USING utf32) could generate a result with an incorrect length and result in a server crash.
[13 Sep 2008 23:38]
Bugs System
Pushed into 6.0.7-alpha (revid:kgeorge@mysql.com-20080718093719-5927sojsbr8s73nw) (version source revid:john.embretsen@sun.com-20080808091208-ht48kyzsk7rim74g) (pib:3)