Description:
The LOAD DATA command reads file content using the default database's character set, and the mysqlimport command sets that character set to "binary". As a result, file content is read with the "binary" character set and collation, whatever its real encoding is. This is not a problem unless the real encoding is sjis or cp932. Because sjis and cp932 characters may have "\" (0x5C) as their second byte, sjis or cp932 strings are unescaped incorrectly when they are read as "binary" escaped strings. This is the well-known "5C problem".
IMHO, this bad escaping cannot be avoided as long as the "binary" charset is used to read sjis/cp932 strings. So mysqlimport should have an option that sets the database character set to sjis/cp932 rather than binary.
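The byte-level effect can be sketched in Python. This is only an illustration, not MySQL's actual parser: both functions are hypothetical, and escape handling is simplified to "drop the backslash and keep the next byte", whereas the real reader also maps sequences such as \n and \t. The lead-byte ranges used for sjis/cp932 are an assumption of this sketch.

```python
def naive_unescape(data: bytes) -> bytes:
    """Hypothetical reader: treats every 0x5C byte as an escape introducer."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0x5C and i + 1 < len(data):
            out.append(data[i + 1])  # drop the backslash, keep the next byte
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def sjis_aware_unescape(data: bytes) -> bytes:
    """Hypothetical reader: copies two-byte sjis/cp932 characters untouched."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        # Assumed lead-byte ranges for sjis/cp932 two-byte characters.
        is_lead = 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC
        if is_lead and i + 1 < len(data):
            out += data[i:i + 2]  # lead byte plus trail byte, even if it is 0x5C
            i += 2
        elif b == 0x5C and i + 1 < len(data):
            out.append(data[i + 1])  # a real escape: drop the backslash
            i += 2
        else:
            out.append(b)
            i += 1
    return bytes(out)


# U+8868 is the character stored as 0x955c in sjis.
hyou = "\u8868".encode("shift_jis")
line = hyou + b"\n"  # the character followed by a line terminator

print(hyou.hex())
print(naive_unescape(line).hex())
print(sjis_aware_unescape(line).hex())
```

With the byte-wise reader, the trailing 0x5C of the character is consumed as an escape and the character is damaged; the character-set-aware reader leaves the two bytes intact.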
How to repeat:
1) Create a table and populate it with 5C character(s)
mysql> use test
mysql> create table sjis_load(a char(100) character set sjis);
mysql> insert into sjis_load values(0x955c); # 表
2) Dump it to a file and truncate the table
mysql> select * into outfile 'sjis_load.txt' from sjis_load;
mysql> truncate sjis_load;
3) Load the table content using mysqlimport command
shell> mysqlimport -uuser -ppassword test /var/lib/mysql/test/sjis_load.txt
Suggested fix:
I see three options.
1. Add a new option to mysqlimport to specify the charset to use.
2. Let mysqlimport use the --default-character-set option.
3. Allow SELECT ... INTO OUTFILE to dump sjis/cp932 columns in hex format. (Columns can be read correctly if they are not enclosed.) Or let mysqldump do so.
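To illustrate why option 3 sidesteps the problem, here is a minimal Python sketch. The helper name to_hex_literal is hypothetical and is not part of any MySQL tool. It only shows that the hex form of an sjis value consists of ASCII hex digits, so it cannot contain a 0x5C byte; how the server would interpret such a field on load is not shown here.

```python
def to_hex_literal(text: str, encoding: str = "shift_jis") -> str:
    """Hypothetical helper: render a string as a 0x... hex literal."""
    return "0x" + text.encode(encoding).hex().upper()


# U+8868 is the 5C character used in the test case above.
value = to_hex_literal("\u8868")
print(value)

# The dumped form contains no backslash, so byte-wise unescaping cannot damage it.
print("\\" in value)
```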