Bug #3906 | CREATE TABLE does not save the Japanese table names correctly in UTF8. | ||
---|---|---|---|
Submitted: | 27 May 2004 4:40 | Modified: | 26 Jan 2006 2:51 |
Reporter: | Shuichi Tamagawa | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | 5.0 | OS: | Windows (Window XP Professional) |
Assigned to: | Alexander Barkov | CPU Architecture: | Any |
[27 May 2004 4:40]
Shuichi Tamagawa
[28 May 2004 14:06]
Alexander Barkov
Shuichi, I could not reproduce this bug, and successfully created all these tables simultaniously. Can you please try the follwing: SELECT HEX(COVERT(x USING utf8)); where x in 0x82a0, 0x82a2, 0x82a4, etc. Does conversion return the same result for them, or different?
[28 May 2004 22:56]
Shuichi Tamagawa
When I use the convert function, it returns the different resuts for these characters. But when I create tables, it looks these are treated as the same characters. See the attached file. Note that these tables are created successfully in Version 4.1.18.
[28 May 2004 22:59]
Shuichi Tamagawa
Test result - 4.1.1a-alpha-nt
Attachment: example4.1.1a.JPG (image/pjpeg, text), 73.52 KiB.
[28 May 2004 22:59]
Shuichi Tamagawa
Test result - 4.0.18
Attachment: example4.0.18.JPG (image/pjpeg, text), 35.88 KiB.
[3 Jun 2004 8:22]
Alexander Barkov
Shuichi, It looks like Windows doesn't allow to create files with these names at the same time. Can you please try to create files with these names: 0xE38182 0xE38184 0xE38186 0xE38188 0xE3818A Does Windows allow to do it? Thank you! Sorry, I cannot try it, I don't have a Windows box to try on.
[3 Jun 2004 10:37]
Shuichi Tamagawa
Bar, I think this issue can be explained as follows: 1.Windows File system uses Shift-JIS. 2.When MySQL set the file name '0xE38182' in UTF8, Windows interprets it in Shift-JIS. 3.In Shift-JIS '0xE38182' is interpreted as two characters, '0xE381' and '0x82'. 4.Since there are no corresponding character for code point '0x82' in Shift-JIS, it is expressed in another character(It looks '0x8145'). 5.Since there are no corresponding character for code point from '0x80' to '0x9E' these codes are expressed in the same character, which result in file name confliction. When I tried some other characters like '0xEFBC91', '0xEFBC92', '0xEFBCA1' it behaved in the same way. If my understanding is right, the file name of table data need to be set in Shift-JIS on Windows environment to avoid this problem.
[3 Jun 2004 11:44]
Alexander Barkov
I agree that it is wrong that Japanese names are stored in UTF8 on a SJIS system. Using SJIS for file system is not good too, we want table files to be copied between boxes, including between boxes with different localization. The only way is to store names in unicode, but not in UTF8. It could be a pseudo character set which encodes ASCII as is, but all national characters, including latin letters with accents, Japanes, Cyrillic, etc, in their Unicode code in textual representation: For example: ab12.frm -> ab12, pure ASCII name ab12%3062 -> ab12 + Japanese character corresponding to SJIS 0x82C0 ab12%00E5.frm -> ab12 + LATIN LETTER A WITH RING ABOVE (Swedish letter) ab12%0430.frm -> ab12 + CYRILLIC LETTER C What do you think?
[3 Jun 2004 15:53]
Sergei Golubchik
see WL#1324 by the way, you cannot give % a special meaning, as old tables with % in the name will become unreadable (consider, e.g. CREATE TABLE `ab12%3062` (...))
[4 Jun 2004 14:53]
Alexander Barkov
Serge, what is your erstimation of the priority of this task?
[7 Jun 2004 13:17]
Alexander Barkov
Serge, what should I do with this bug? Should i close it with a reference to the worklog item?
[27 Aug 2004 13:35]
Alexander Barkov
A relative problem happens on MAC OSX. This indicates that it's currently not possible to use non-ASCII characters for identifiers on Mac OS X, as the file system would mangle the file name. See http://developer.apple.com/documentation/MacOSX/Conceptual/SystemOverview/FileSy stem/chapter_8_section_6.html
[24 Sep 2004 14:31]
Alexander Barkov
We won't fix this problem in 4.1. It's to be fixed in 5.0 or 5.1. We should add a note about this problem into the manual.
[7 May 2005 8:20]
Alexander Barkov
See also bug#5566
[12 Jan 2006 7:54]
Alexander Barkov
This problem was fixed in 5.1 after pushing WL#1324 "table name to file name encoding". We will not fix this problem in 4.1 and 5.0.
[18 Jan 2006 19:14]
Mike Hillyer
Need version number for the fix.
[19 Jan 2006 7:03]
Alexander Barkov
Fixed in 5.1.6. P.S. Please make sure "table name to file name encoding" is documenting together with this bug. Stefan told he assigned it to Paul.
[26 Jan 2006 2:51]
Mike Hillyer
Documented in 5.1.6 changelog: <listitem> <para> Certain Japanese table names were not properly saved during a <literal>CREATE TABLE</literal> statement. (Bug #3906) </para> </listitem> Will followup with Paul.