Bug #5439 | mysql_server_init() crashes if ShiftJIS path is passed | ||
---|---|---|---|
Submitted: | 7 Sep 2004 2:59 | Modified: | 12 Aug 2005 19:55 |
Reporter: | Miguel Solorzano | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Embedded Library ( libmysqld ) | Severity: | S2 (Serious) |
Version: | 4.0.20 | OS: | Windows (Windows 2000) |
Assigned to: | Alexander Barkov | CPU Architecture: | Any |
[7 Sep 2004 2:59]
Miguel Solorzano
[14 Sep 2004 8:09]
Alexander Barkov
This is another bug related to #1405 and #1406.
[24 Sep 2004 14:25]
Alexander Barkov
This is a very serious problem. In the long term it could be fixed, but a fix would affect very critical places in the code. Monty said we won't fix this problem in the near future.
[13 Jul 2005 9:34]
Alexander Barkov
We think we've found a simple solution for this problem. The source of the problem is that the '/' character can be part of a multibyte sequence on a Japanese cp932 machine. Btw, Chinese cp936 machines should have the same problem, but no one has complained so far.

Imagine we have the path "\a/\b\\c", where:
- "a/" is a multibyte character (i.e. "a" is some byte with code > 128).
- "b" and "c" are normal Latin letters.

Internally we convert all slashes into Unix style, then we normalize the path in Unix format, processing things like "~monty", "/a/../b/", and "/a//b//c", and then we convert back to Windows style. As a result, the path above is packed into "\a\b\c". If "a" is the first byte of a multibyte character (i.e. with code > 128), then the path gets garbled.

The idea is to replace slashes which are parts of a multibyte sequence with some control character, say 0x01, before converting into Unix style. 0xFF could possibly also work as a replacement character, as it is not used in cp932.

In order to do that correctly we need to know the file system character set. It can be fetched using the GetLocaleInfo() Windows API function. Then, in a loop, we go character by character and replace slashes which are multibyte parts with the replacement character.

Using "\a/\b\\c" as an example, and assuming again that "a" is part of a multibyte character (with code > 128) and "b" and "c" are normal Latin letters (with code < 128), processing will look like this:

"\" -> "\"
"a/" -> "a" + @
"\" -> "\"
"b" -> "b"
"\" -> "\"
"\" -> "\"
"c" -> "c"

Now we have the string "\a@\b\\c", where @ is the replacement character (a byte with code 0x01 or 0xFF).

After this preparation we can safely convert to Unix style: "/a@/b//c". Then we execute path normalization, which among other things (see above) removes double slashes: "/a@/b//c" -> "/a@/b/c". Now we convert to Windows style again: "\a@\b\c". And finally, we replace the replacement characters back with slashes: "\a/\b\c". This is exactly what we need, instead of the "\a\b\c" we get currently.

The change looks very simple:

1. We add a "file_system_character_set" global variable and initialize it during mysqld startup using GetLocaleInfo() and a mapping from Windows character set names to MySQL names. We don't even need to map all character sets; we only need to detect cp932 and cp936.
2. In intern_filename() we add a loop to replace each "/" that is part of a multibyte sequence with the replacement character.
3. In unpack_filename() we add a loop to replace 0x01 back with "/".
[13 Jul 2005 9:38]
Alexander Barkov
Brian, PeterG and Trudy think it should work. We all agreed we also want Serg to take a look. I'm setting Serg as reviewer.
[27 Jul 2005 11:13]
Alexander Barkov
There is a mistake in the previous comment: "/" (0x2F) cannot be part of a multibyte sequence, but "\" (0x5C) can.
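With that correction applied, the proposed round trip can be sketched roughly as follows. This is only an illustration, not the committed patch: all function names here are hypothetical, is_cp932_lead() is a simplified lead-byte test (0x81-0x9F and 0xE0-0xFC), collapse_slashes() merely stands in for the real Unix-style normalization, and 0x01 is used as the replacement byte as suggested above.

```c
#include <stddef.h>

#define MASK_BYTE 0x01  /* replacement byte proposed in this report */

/* Simplified cp932 lead-byte test (assumption: 0x81-0x9F, 0xE0-0xFC). */
static int is_cp932_lead(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Replace every 0x5C that is the trail byte of a two-byte character. */
static void mask_trail_backslashes(char *path)
{
    unsigned char *p = (unsigned char *) path;
    while (*p) {
        if (is_cp932_lead(p[0]) && p[1] != '\0') {
            if (p[1] == 0x5C)
                p[1] = MASK_BYTE;
            p += 2;                 /* skip the whole two-byte character */
        } else {
            p++;
        }
    }
}

/* Replace every occurrence of one byte with another, in place. */
static void replace_byte(char *s, char from, char to)
{
    for (; *s; s++)
        if (*s == from)
            *s = to;
}

/* Collapse runs of '/' into one '/'; a stand-in for real normalization. */
static void collapse_slashes(char *s)
{
    char *in = s, *out = s;
    while (*in) {
        if (!(*in == '/' && out > s && out[-1] == '/'))
            *out++ = *in;
        in++;
    }
    *out = '\0';
}

/* Hypothetical round trip: mask, to Unix, normalize, to Windows, unmask. */
void normalize_windows_path(char *path, int protect_multibyte)
{
    if (protect_multibyte)
        mask_trail_backslashes(path);
    replace_byte(path, '\\', '/');              /* Windows -> Unix style */
    collapse_slashes(path);
    replace_byte(path, '/', '\\');              /* Unix -> Windows style */
    if (protect_multibyte)
        replace_byte(path, MASK_BYTE, '\\');    /* restore trail bytes */
}
```

For a path whose directory name is the two-byte character 0x95 0x5C, calling normalize_windows_path() with protect_multibyte set to 0 drops the separator that follows the character, while calling it with 1 returns the path unchanged.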
[8 Aug 2005 14:57]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/27998
[9 Aug 2005 4:22]
Alexander Barkov
Approved by Serg. Pushed into 4.1.14, queued to be merged into 5.0.12.
[12 Aug 2005 19:55]
Paul DuBois
Noted in 4.1.14, 5.0.12 changelogs.