Bug #26058 | Falcon: warnings with yen sign and overline in ujis | ||
---|---|---|---|
Submitted: | 4 Feb 2007 1:08 | Modified: | 21 Dec 2007 10:29 |
Reporter: | Peter Gulutzan | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S3 (Non-critical) |
Version: | 5.2.2-falcon-alpha-debug-log | OS: | Linux (SUSE 10.0 / 64-bit) |
Assigned to: | Kevin Lewis | CPU Architecture: | Any |
[4 Feb 2007 1:08]
Peter Gulutzan
[5 Feb 2007 16:29]
MySQL Verification Team
Thank you for the bug report. Verified as described.
[3 May 2007 19:17]
Hakan Küçükyılmaz
Added test case falcon_bug_26058.test.
[10 May 2007 20:11]
Kevin Lewis
I am not sure that this is a valid character in the ujis collation. Please verify. When I run the test against a MyISAM file, no errors occur. The test case references the characters in ASCII format like this:

    # '‾' == 0xE2 0x80 0xBE == U+00E2 U+20AC U+00BE
    # '¥' == 0xC2 0xA5 == U+00E2 U+20AC

The MySQL results are like this:

    SET NAMES utf8;
    CREATE TABLE tm (a varchar(5) character set ujis) engine=myisam;
    INSERT INTO tm VALUES ('¥');
    INSERT INTO tm VALUES ('‾');
    SELECT hex(a) FROM tm WHERE a IS NOT NULL;
    hex(a)
    8E5C
    8E7E
    SELECT hex(a) FROM tm;
    hex(a)
    8E5C
    8E7E

Everything seems to work without error, but these byte sequences are never validated in the function my_well_formed_len_ujis() in ctype-ujis.c. When this same SQL is run against a Falcon table, the code path calls my_well_formed_len_ujis() via well_formed_copy_nchars() in sql_string.cc, via Field_varstring::store() in field.cc. The warning originates in the following code in my_well_formed_len_ujis(), line 8276:

    if (ch == 0x8E) /* [x8E][xA0-xDF] */
    {
      if (*b >= 0xA0 && *b <= 0xDF)
        continue;
      *error= 1;
      return (uint) (chbeg - beg); /* invalid sequence */
    }

According to this code, 0x8E5C and 0x8E7E are invalid ujis characters. I do not understand how '¥' and '‾' got converted to 0x8E5C and 0x8E7E, but if that is correct, then this seems to be an invalid test case.
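To make the check described above easier to follow, here is a minimal Python sketch of a ujis (EUC-JP) well-formedness scan. Only the 0x8E branch is taken from the C fragment quoted in this comment; the ASCII, 0x8F and two-byte branches are assumptions based on the general EUC-JP layout, and the function name `ujis_well_formed_len` is made up for illustration. It is not the MySQL implementation.

```python
from typing import Tuple


def ujis_well_formed_len(data: bytes) -> Tuple[int, bool]:
    """Return (byte length of the leading well-formed prefix, error flag).

    Illustrative model only; branch layout is an assumption except for
    the 0x8E case, which mirrors the quoted C fragment.
    """
    i = 0
    n = len(data)
    while i < n:
        ch = data[i]
        if ch <= 0x7F:
            # Single-byte ASCII range.
            i += 1
            continue
        if ch == 0x8E:
            # [x8E][xA0-xDF], same range as the quoted C code.
            if i + 1 < n and 0xA0 <= data[i + 1] <= 0xDF:
                i += 2
                continue
            return i, True
        if ch == 0x8F:
            # Assumed three-byte form: [x8F][xA1-xFE][xA1-xFE].
            if i + 2 < n and all(0xA1 <= b <= 0xFE for b in data[i + 1:i + 3]):
                i += 3
                continue
            return i, True
        if 0xA1 <= ch <= 0xFE:
            # Assumed two-byte form: [xA1-xFE][xA1-xFE].
            if i + 1 < n and 0xA1 <= data[i + 1] <= 0xFE:
                i += 2
                continue
            return i, True
        # Any other lead byte is treated as invalid.
        return i, True
    return n, False


for hex_value in ("8E5C", "8E7E", "8EB1"):
    length, error = ujis_well_formed_len(bytes.fromhex(hex_value))
    print(hex_value, "well-formed prefix:", length, "error:", error)
```

Under these assumptions the two stored values 8E5C and 8E7E are rejected at offset 0, because 0x5C and 0x7E fall outside the A0-DF range, while a sequence such as 8EB1 passes.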
[14 May 2007 15:18]
Peter Gulutzan
First, I must apologize for a small error in the "how to repeat" section. I should have used 'character set ujis' in both examples. But the error does not affect the results or the bug description. This is the corrected "how to repeat":

    mysql> set names utf8;
    Query OK, 0 rows affected (0.00 sec)

    mysql> create table tujis (s1 varchar(5) character set ujis) engine=falcon;
    Query OK, 0 rows affected (0.01 sec)

    mysql> insert into tujis values ('¥'),('‾');
    Query OK, 2 rows affected (0.00 sec)
    Records: 2  Duplicates: 0  Warnings: 0

    mysql> select count(*) from tujis where s1 is not null;
    +----------+
    | count(*) |
    +----------+
    |        2 |
    +----------+
    1 row in set, 2 warnings (0.00 sec)

    mysql> show warnings;
    +---------+------+----------------------------------------------------------+
    | Level   | Code | Message                                                  |
    +---------+------+----------------------------------------------------------+
    | Warning | 1366 | Incorrect string value: '\x8E\' for column 's1' at row 0 |
    | Warning | 1366 | Incorrect string value: '\x8E~' for column 's1' at row 1 |
    +---------+------+----------------------------------------------------------+
    2 rows in set (0.00 sec)

    mysql> drop table tujis;
    Query OK, 0 rows affected (0.01 sec)

    mysql> set names utf8;
    Query OK, 0 rows affected (0.00 sec)

    mysql> create table tujis (s1 varchar(5) character set ujis) engine=myisam;
    Query OK, 0 rows affected (0.01 sec)

    mysql> insert into tujis values ('¥'),('‾');
    Query OK, 2 rows affected (0.01 sec)
    Records: 2  Duplicates: 0  Warnings: 0

    mysql> select count(*) from tujis where s1 is not null;
    +----------+
    | count(*) |
    +----------+
    |        2 |
    +----------+
    1 row in set (0.00 sec)

    mysql> show warnings;
    Empty set (0.00 sec)

Second, I must emphasize what I said in the original: "this is junk". It is not proper to do these insertions. The complaint is solely: if I do these insertions anyway, I should get the same results for Falcon and MyISAM. I do not.

Third, I acknowledge that this is a "Linux only" bug.
If I can't type in these UTF8 characters, I can't reproduce.
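For anyone who cannot type the two characters directly, the following Python sketch writes the statements from the corrected "how to repeat" into a UTF-8 file, using Unicode escapes for the yen sign and the overline. The helper names and the output file name `bug26058_repro.sql` are hypothetical and not part of the original report; the SQL statements themselves are the ones shown in the session above.

```python
YEN = "\u00a5"       # the yen sign, UTF-8 bytes C2 A5
OVERLINE = "\u203e"  # the overline, UTF-8 bytes E2 80 BE


def repro_script(engine: str) -> str:
    """Build the statements from the corrected 'how to repeat' for one engine."""
    return "\n".join([
        "set names utf8;",
        "create table tujis (s1 varchar(5) character set ujis) engine=%s;" % engine,
        "insert into tujis values ('%s'),('%s');" % (YEN, OVERLINE),
        "select count(*) from tujis where s1 is not null;",
        "show warnings;",
        "drop table tujis;",
        "",
    ])


def write_repro(path: str, engines=("falcon", "myisam")) -> None:
    """Write one repro section per engine, encoded as UTF-8."""
    with open(path, "wb") as out:
        for engine in engines:
            out.write(repro_script(engine).encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical file name; adjust as needed.
    write_repro("bug26058_repro.sql")
```

The resulting file can then be fed to the mysql command-line client from a shell, which avoids depending on the terminal's input encoding.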
[14 May 2007 16:11]
Kevin Lewis
Well, I must apologize for not being clear myself. I think that the warning should occur in MyISAM! I get the exact same warning running the testcase on Windows, even though I cannot reproduce it within MYSQL.EXE. So the real question is: should values ('¥'),('‾') get converted to 0x8E5C and 0x8E7E? These are the byte sequences that the MySQL engine is warning about when Falcon calls Field_varstring::store(), which MyISAM does not call.
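One possible way to read the two byte sequences in question, offered here as an assumption and not confirmed anywhere in this thread: in EUC-JP, 0x8E is the single-shift lead byte for the JIS X 0201 katakana range, and JIS X 0201 Roman places the yen sign at 0x5C and the overline at 0x7E. The Python sketch below classifies a sequence on that basis; `describe_ss2_sequence` is a hypothetical helper written for this note.

```python
SS2 = 0x8E  # EUC-JP single-shift-2 lead byte

# The two positions where JIS X 0201 Roman differs from ASCII.
JIS_X_0201_ROMAN = {0x5C: "\u00a5", 0x7E: "\u203e"}


def describe_ss2_sequence(seq: bytes) -> str:
    """Classify a two-byte sequence that starts with 0x8E (illustrative only)."""
    if len(seq) != 2 or seq[0] != SS2:
        return "not a two-byte SS2 sequence"
    trail = seq[1]
    if 0xA1 <= trail <= 0xDF:
        # JIS X 0201 katakana 0xA1..0xDF correspond to U+FF61..U+FF9F.
        # (The C check quoted earlier in the thread also lets 0xA0 through.)
        return "half-width katakana U+%04X" % (0xFF61 + trail - 0xA1)
    if trail in JIS_X_0201_ROMAN:
        return "SS2 + JIS X 0201 Roman 0x%02X (U+%04X), outside [xA0-xDF]" % (
            trail, ord(JIS_X_0201_ROMAN[trail]))
    return "SS2 + unexpected trail byte 0x%02X" % trail


for hex_value in ("8E5C", "8E7E", "8EB1"):
    print(hex_value, "->", describe_ss2_sequence(bytes.fromhex(hex_value)))
```

If that reading holds, the stored values look like a single-shift prefix followed by a JIS X 0201 Roman code, which is exactly the shape the [x8E][xA0-xDF] check rejects.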
[14 Jun 2007 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[21 Jun 2007 20:40]
Peter Gulutzan
I expect that the symptoms will disappear after the fix to Bug#28600 "Yen sign and overline ujis conversion change". Then it should be possible to close this bug with no further work.
[19 Oct 2007 18:00]
Ann Harrison
Miguel, Now that bug 28600 has been fixed, this bug should also go away. Would you retest it please? Thanks, Ann
[20 Nov 2007 0:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[21 Dec 2007 0:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[21 Dec 2007 10:29]
MySQL Verification Team
I was no longer able to repeat this with the current source.