Description:
Cannot load data in utf-8 character set from a file using LOAD DATA INFILE command. It ends up in the following error:
ERROR 1366 (HY000): Incorrect string value: '\xC4\xB1\xC4\x9F\xC3\xBC...' for column 'Z1' at row 1
The file contains the single line given below (formatted as utf-8 web for bug submission, normally it is in utf8 binary in file)
ŊĸĕŸďĀʼnķõŷïà|ŊĸĕŸďĀʼnķõŷïà|ŊĸĕŸďĀʼnķõŷïà|ŊĸĕŸďĀʼnķõŷïà|ŊĸĕŸďĀʼnķõŷïà
The details of character set related variables are (SHOW VARIABLES LIKE 'character%';):
| character_set_client | utf8
| character_set_connection | utf8
| character_set_database | utf8
| character_set_filesystem | binary
| character_set_results | utf8
| character_set_server | utf8
| character_set_system | utf8
| character_sets_dir | J:\Program Files\MySQL\MySQL Server 5.0\share\charsets\
How to repeat:
Create a database using the following commands (note: this uses the default engine innodb):
CREATE SCHEMA ABC;
CREATE TABLE ZXY
( Z1 VARCHAR(30) NOT NULL,
Z2 VARCHAR(150),
Z3 VARCHAR(250),
Z4 VARCHAR(250),
Z5 VARCHAR(30),
PRIMARY KEY (Z1));
ALTER DATABASE ABC CHARACTER SET 'utf8' COLLATE 'utf8_turkish_ci';
ALTER TABLE ZXY CHARACTER SET 'utf8' COLLATE 'utf8_turkish_ci';
Create a utf-8 formatted, unsigned file and put the line above in the description into the file (or I guess any bunch of non-latin1 characters will do). UTF-8 signed files also create error.
Try loading it using the command below. If necessary, modify the line ending. This is tested on windows, and the file is one line anyway.
LOAD DATA INFILE '<<THE PATH TO YOUR FILE>>' INTO TABLE ZXY CHARACTER SET 'utf8' FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n';
Then the error in the description is produced.
Suggested fix:
Is there a filter that rejects non-printable part of ascii that gets applied to all char or varchar fields when loading data from file?
Description: Cannot load data in utf-8 character set from a file using LOAD DATA INFILE command. It ends up in the following error: ERROR 1366 (HY000): Incorrect string value: '\xC4\xB1\xC4\x9F\xC3\xBC...' for column 'Z1' at row 1 The file contains the single line given below (formatted as utf-8 web for bug submission, normally it is in utf8 binary in file) ŊĸĕŸďĀʼnķõŷïà|ŊĸĕŸďĀʼnķõŷïà|ŊĸĕŸďĀʼnķõŷïà|ŊĸĕŸďĀʼnķõŷïà|ŊĸĕŸďĀʼnķõŷïà The details of character set related variables are (SHOW VARIABLES LIKE 'character%';): | character_set_client | utf8 | character_set_connection | utf8 | character_set_database | utf8 | character_set_filesystem | binary | character_set_results | utf8 | character_set_server | utf8 | character_set_system | utf8 | character_sets_dir | J:\Program Files\MySQL\MySQL Server 5.0\share\charsets\ How to repeat: Create a database using the following commands (note: this uses the default engine innodb): CREATE SCHEMA ABC; CREATE TABLE ZXY ( Z1 VARCHAR(30) NOT NULL, Z2 VARCHAR(150), Z3 VARCHAR(250), Z4 VARCHAR(250), Z5 VARCHAR(30), PRIMARY KEY (Z1)); ALTER DATABASE ABC CHARACTER SET 'utf8' COLLATE 'utf8_turkish_ci'; ALTER TABLE ZXY CHARACTER SET 'utf8' COLLATE 'utf8_turkish_ci'; Create a utf-8 formatted, unsigned file and put the line above in the description into the file (or I guess any bunch of non-latin1 characters will do). UTF-8 signed files also create error. Try loading it using the command below. If necessary, modify the line ending. This is tested on windows, and the file is one line anyway. LOAD DATA INFILE '<<THE PATH TO YOUR FILE>>' INTO TABLE ZXY CHARACTER SET 'utf8' FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; Then the error in the description is produced. Suggested fix: Is there a filter that rejects non-printable part of ascii that gets applied to all char or varchar fields when loading data from file?