Bug #12448 LOAD DATA / SELECT INTO OUTFILE doesn't work with multibyte path name
Submitted: 9 Aug 2005 1:56 Modified: 2 Feb 2006 4:37
Reporter: Shuichi Tamagawa
Status: Closed
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.10,4.1.15 OS:Windows (Windows XP)
Assigned to: Alexander Barkov CPU Architecture:Any

[9 Aug 2005 1:56] Shuichi Tamagawa
Description:
'LOAD DATA' and 'SELECT ... INTO OUTFILE' don't handle multi-byte path names properly.

Example on a Japanese Windows environment:

[LOAD DATA]
When loading the data file 'C:\\mysql\\XXX\\test.txt', where XXX consists of multi-byte characters, MySQL tries to read a file named 'YYYtest.txt' from the parent directory 'C:\\mysql'. (YYY is XXX converted to UTF-8 and displayed in cp932.)

[SELECT ... INTO OUTFILE]
When exporting data with a SELECT ... INTO OUTFILE statement to the file 'C:\\mysql\\XXX\\test.txt', where XXX consists of multi-byte characters, MySQL creates a file named 'YYYtest.txt' in the parent directory 'C:\\mysql'. (YYY is XXX converted to UTF-8 and displayed in cp932.)
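
One plausible reading of the symptom, not confirmed anywhere in this report, is that the last UTF-8 byte of the directory name falls in the cp932 lead-byte range, so a cp932-aware path scanner pairs it with the following backslash (0x5C) and the separator is lost. The Python sketch below illustrates that idea only. The pairing rule is the generic Shift-JIS/cp932 lead/trail byte layout, and the helper names are hypothetical; none of this is MySQL's or Windows' actual code.

```python
from typing import List


def is_lead_byte(b: int) -> bool:
    """First byte of a double-byte cp932 character."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def is_trail_byte(b: int) -> bool:
    """Second byte of a double-byte cp932 character (0x5C is in range)."""
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFC


def separator_offsets(path: bytes) -> List[int]:
    """Offsets of backslashes that a cp932-aware scanner treats as separators."""
    offsets = []
    i = 0
    while i < len(path):
        if (is_lead_byte(path[i]) and i + 1 < len(path)
                and is_trail_byte(path[i + 1])):
            # One double-byte character: a 0x5C trail byte is not a separator.
            i += 2
            continue
        if path[i] == 0x5C:
            offsets.append(i)
        i += 1
    return offsets


PREFIX = b"C:\\mysql\\"
SUFFIX = b"\\test.txt"
DIRECTORY = "テスト"

# The same path with the directory name encoded two different ways.
cp932_path = PREFIX + DIRECTORY.encode("cp932") + SUFFIX
utf8_path = PREFIX + DIRECTORY.encode("utf-8") + SUFFIX

print(separator_offsets(cp932_path))
print(separator_offsets(utf8_path))
```

With the cp932 bytes the scanner finds three separators. With the UTF-8 bytes it finds only two, which would leave the garbled directory name fused to 'test.txt' under 'C:\mysql', as described above.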

How to repeat:
[LOAD DATA]
mysql> load data local infile 'C:\\mysql\\テスト\\test.txt' into table t1;
ERROR 2 (HY000): File 'C:\mysql\繝・せ繝・test.txt' not found (Errcode: 2)

[SELECT ... INTO OUTFILE]
mysql> select * from t1 into outfile 'C:\\mysql\\テスト\\test.txt';
Query OK, 1 row affected (0.01 sec)

Look in the directory 'C:\mysql\'. It contains a file with a garbled name.

Suggested fix:
This is related to http://bugs.mysql.com/bug.php?id=5439
[31 Aug 2005 15:55] Jorge del Conde
I verified this bug using 5.0.11 under WinXP
[24 Oct 2005 7:43] Alexander Barkov
The problem is not very simple:

1. Most Unixes do not have a character set associated
with the file system. A file name is just a binary string. A file
name specified in "LOAD DATA INFILE" should probably
not be converted.

2. The MacOSX filesystem uses utf-8. Thus, a file name
specified in "LOAD DATA INFILE" should
probably be converted to utf-8. (This is what happens now.)

3. On Windows, there is always a character set associated
with the filesystem, which depends on localization and can be
read using the GetLocaleInfo Windows API function. A file name
given in "LOAD DATA INFILE" should probably be converted
to the locale character set.

All of the above cases can be solved by introducing a new
session variable character_set_filesystem.
All file names specified in LOAD DATA INFILE should be
converted from character_set_client to character_set_filesystem.

character_set_filesystem can be "binary" by default, which means
no file name conversion. On MacOSX and Windows one would
be able to write character_set_filesystem=charset in the server's my.cnf
file to activate character set conversion of file names given in
"LOAD DATA INFILE".
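
The proposed rule can be sketched in Python roughly as follows. This is only an illustration of the idea, not the server code: the function name is hypothetical, and Python codec names stand in for MySQL character set names on the assumption that they map closely enough for the example.

```python
def convert_file_name(raw: bytes, client_charset: str,
                      filesystem_charset: str = "binary") -> bytes:
    """Convert a file name from the client charset to the filesystem charset.

    "binary" (the proposed default) means the bytes are passed through
    untouched, which matches the "just a binary string" view of most Unixes.
    """
    if filesystem_charset == "binary":
        return raw
    # Interpret the bytes as sent by the client, then re-encode them
    # in the character set the filesystem expects.
    return raw.decode(client_charset).encode(filesystem_charset)


# A client working in utf-8 sends the path from this report.
sent = "C:\\mysql\\テスト\\test.txt".encode("utf-8")

unchanged = convert_file_name(sent, "utf-8")           # default: no conversion
for_cp932 = convert_file_name(sent, "utf-8", "cp932")  # Japanese Windows

print(unchanged == sent)
print(for_cp932 == "C:\\mysql\\テスト\\test.txt".encode("cp932"))
```

With the default the name reaches the filesystem exactly as the client sent it. With cp932 configured, it becomes the byte sequence a Japanese Windows filesystem would expect for the same characters.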

For example, on a Western Windows machine one would write
character_set_filesystem=latin1. "LOAD DATA INFILE" would
work in all cases:
- from "mysql.exe" command line client working in cp850
- from MySQL GUI tools working in utf-8
- from other tools working in the locale charset: latin1
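
A small Python sketch of why this works in all three cases: each client sends different bytes for the same name, but after conversion from its own character_set_client to latin1 the bytes agree. The file name and the helper below are hypothetical, and Python's "latin1" codec (ISO 8859-1) is used as a stand-in for the MySQL character set of that name.

```python
FILE_NAME = "C:\\data\\café.txt"  # hypothetical name with one non-ASCII letter

# The three clients from the list above and the charset each one works in.
CLIENT_CHARSETS = {
    "mysql.exe command line client": "cp850",
    "MySQL GUI tools": "utf-8",
    "other tools (locale charset)": "latin1",
}


def to_filesystem(raw: bytes, client_charset: str,
                  filesystem_charset: str = "latin1") -> bytes:
    """Convert from the client's charset to character_set_filesystem."""
    return raw.decode(client_charset).encode(filesystem_charset)


def sent_bytes() -> dict:
    """Bytes each client would send for FILE_NAME."""
    return {who: FILE_NAME.encode(cs) for who, cs in CLIENT_CHARSETS.items()}


def converted_bytes() -> dict:
    """Bytes reaching the filesystem after conversion to latin1."""
    return {who: to_filesystem(raw, CLIENT_CHARSETS[who])
            for who, raw in sent_bytes().items()}


for who, raw in sent_bytes().items():
    print(who, raw, "->", converted_bytes()[who])
```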
[18 Jan 2006 9:06] Alexander Barkov
A patch has been committed:

http://lists.mysql.com/commits/1243
[18 Jan 2006 9:46] Alexander Barkov
There is a typo in the commit comment:

  sql_yacc.yy:
    Adding TEXT_STRING_filesystem, which
    converts from character_set_client to
    character_set_conversion.

It should have been:

  sql_yacc.yy:
    Adding TEXT_STRING_filesystem, which
    converts from character_set_client to
    character_set_filesystem.
[19 Jan 2006 10:03] Alexander Barkov
Fixed in 5.1.6 by introducing a new system variable: character_set_filesystem.

Please document the bug fix as well as the new variable.
See details in the bug report.
[2 Feb 2006 4:37] Mike Hillyer
Documented in 5.1.6 changelog:

    <listitem>
      <para>
        Multi-byte path names for <literal>LOAD DATA</literal> and
        <literal>SELECT ... INTO OUTFILE</literal> caused errors. (Bug
        #12448)
      </para>
    </listitem>

Variable already documented in manual by Paul.
[15 Feb 2006 17:20] Paul DuBois
This fix has been backported to 5.0.19 as well.