Bug #14477 LOAD DATA fails if non-ascii character in data
Submitted: 30 Oct 2005 0:11 Modified: 30 Oct 2005 14:22
Reporter: David Fuess
Status: Closed
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.15-nt OS:Windows (Windows XP)
Assigned to: CPU Architecture:Any

[30 Oct 2005 0:11] David Fuess
Description:
When loading a CSV file in which some fields contain extended characters, such as umlauts or other accented characters, the load fails with:

Data too long for column 'city' at row 265

This happens even though the length of the field in the file is far less than the field width. I have tried this with utf8, latin1, and ascii, with the same result every time. I have also tried two different data sets.

How to repeat:
The table is:

CREATE TABLE city (
CityId INTEGER NOT NULL,
CountryID INTEGER NOT NULL,
RegionID INTEGER NOT NULL,
City VARCHAR(200) NOT NULL DEFAULT '',
Latitude DECIMAL(12,7) NOT NULL DEFAULT '0.0000000',
Longitude DECIMAL(12,7) NOT NULL DEFAULT '0.0000000',
TimeZone VARCHAR(10) NOT NULL DEFAULT '',
DmaId VARCHAR (10) NULL DEFAULT 0,
Code CHAR(4)
) ENGINE=InnoDB DEFAULT CHARSET 'latin1';

The load statement:

LOAD DATA
INFILE 'C:\\Temp\\GeoMap\\Cities.txt'
INTO TABLE city
FIELDS
TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '\\'
LINES
STARTING BY ''
TERMINATED BY '\r\n'
IGNORE 1 LINES;

and line 265 in the data file is:

14766,91,1912,"Neuenbürg","53.383","7.95","+01:00",,"NEUE"
[30 Oct 2005 10:06] Valeriy Kravchuk
Thank you for the problem report. Please send the results of the following command:

SHOW VARIABLES like 'character%';

executed from the mysql client session where you try to execute that LOAD DATA statement. 

Can you repeat the same behaviour with only this line in the file?
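
A minimal sketch of the two checks requested above, assuming the single data line has been saved to a file of its own. The file name OneCity.txt is hypothetical, and because that file has no header line, the IGNORE 1 LINES clause is dropped. The final SELECT is an extra step that was not requested: HEX() and LENGTH() show the bytes actually stored, which may help tell a latin1 ü (one byte, FC) from a UTF-8 ü (two bytes, C3BC).

```sql
-- Character-set settings of the session that runs LOAD DATA.
SHOW VARIABLES LIKE 'character%';

-- Load a file that contains only line 265 of Cities.txt.
-- OneCity.txt is a hypothetical file name; it has no header line,
-- so the IGNORE 1 LINES clause is omitted.
LOAD DATA
INFILE 'C:\\Temp\\GeoMap\\OneCity.txt'
INTO TABLE city
FIELDS
TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '\\'
LINES
TERMINATED BY '\r\n';

-- Extra check (an assumption, not part of the request): inspect the stored bytes.
SELECT City, HEX(City), LENGTH(City), CHAR_LENGTH(City)
FROM city
WHERE CityId = 14766;
```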
[30 Oct 2005 14:14] David Fuess
Yikes, I went to verify using a one-line input source and it loaded. I then tried the bulk file and it loaded as well! The only change is that I had to reboot because of a crash in mysqld. There could be an issue of interference between 5.0.13-RC and 5.0.15 if you don't reboot in between. I will continue to load data files and check other character sets as well, but it appears to be functioning nominally right now. You can close this one, and I'll open a new report if I uncover anything repeatable.
[30 Oct 2005 14:22] Valeriy Kravchuk
Closed as requested by reporter.

Please add a comment to this issue if you encounter a similar repeatable problem in the future.