Bug #17308 mysqlimport on linux doesn't always handle windows-style line-endings sanely
Submitted: 10 Feb 2006 17:08 Modified: 26 Jul 2006 17:58
Reporter: [ name withheld ] Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Command-line Clients Severity:S3 (Non-critical)
Version:5.022-BK, 5.0.18 OS:Any (all)
Assigned to: Magnus Blåudd CPU Architecture:Any

[10 Feb 2006 17:08] [ name withheld ]
Description:
Importing a CSV file that was created on Windows (using non-MySQL tools), mysqlimport correctly separated records based on CRLF. 

theblackbox:~# file tblio_definition_2006.txt
tblio_definition_2006.txt: ASCII English text, with very long lines, with CRLF line terminators
theblackbox:~# mysqlimport --fields-terminated-by=, ccsp3_prelim ~/tblio_definition_2006.txt
ccsp3_prelim.tblio_definition_2006: Records: 361  Deleted: 0  Skipped: 0  Warnings: 249

There were, indeed, 361 records in the file.

But some fields were double-quote enclosed. When specifying optional enclosures, mysqlimport read every other line as a new record.

theblackbox:~# mysqlimport --fields-terminated-by=, --fields-optionally-enclosed-by=\" ccsp3_prelim ~/tblio_definition_2006.txt
ccsp3_prelim.tblio_definition_2006: Records: 181  Deleted: 0  Skipped: 0  Warnings: 217

Every other line was appended to the final field in the previous record.

After converting the file to unix-style line endings, everything worked fine.

theblackbox:~# dos2unix tblio_definition_2006.txt
theblackbox:~# file tblio_definition_2006.txt
tblio_definition_2006.txt: ASCII English text, with very long lines
theblackbox:~# mysqlimport --fields-terminated-by=, --fields-optionally-enclosed-by=\" ccsp3_prelim ~/tblio_definition_2006.txt
ccsp3_prelim.tblio_definition_2006: Records: 361  Deleted: 0  Skipped: 0  Warnings: 0

How to repeat:
Build a CSV file with CRLF record terminators, enclose some fields in quotes, then try to load with mysqlimport on linux.

Suggested fix:
Correct handling of CRLF-terminated records by mysqlimport (on the linux platform) when using --fields-optionally-enclosed-by.
[14 Feb 2006 10:12] Valeriy Kravchuk
Thank you for a problem report. Please, upload that your original tblio_definition_2006.txt file (before dos2unix conversion).
[14 Feb 2006 17:27] [ name withheld ]
CSV file with CRLF line terminations.

Attachment: tblio_definition_2006.txt (text/plain), 35.59 KiB.

[15 Mar 2006 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[15 Mar 2006 8:25] [ name withheld ]
Requested file has been uploaded.
[6 Apr 2006 15:15] Valeriy Kravchuk
Sorry, forgot to ask... Please, send the SHOW CREATE TABLE ccsp3_prelim results also.
[6 Apr 2006 16:18] [ name withheld ]
Here's the CREATE TABLE:

mysql> show create table tblio_definition_2006;
+-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table                 | Create Table                                                                                                                                                                                                                                   |
+-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tblio_definition_2006 | CREATE TABLE `tblio_definition_2006` (
  `ID` int(11) NOT NULL,
  `Name` varchar(100) default NULL,
  `CountyList` varchar(1500) default NULL,
  `GUID` varchar(64) default NULL,
  PRIMARY KEY  (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
[28 Apr 2006 18:18] Valeriy Kravchuk
Verified just as described, with your table and data uploaded, with 5.0.22-BK on Linux.
[21 Jul 2006 12:49] Magnus Blåudd
Use the option --lines-terminated-by="\r\n" for mysqlimport when importing windows-style files. This is unfoirtunately NOT documented in the manual. :)

The mysqlimport tool uses the "LOAD DATA INFILE" function of MySQL Server, one of it's features is to specify what line terminator the file has, this is done with the "LINES TERMINATED BY" option, read more here http://dev.mysql.com/doc/refman/5.0/en/load-data.html
[23 Jul 2006 3:24] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

Adding --lines-terminated-by="\r\n" for Windows file as an
example of the use of this option.
[26 Jul 2006 17:58] [ name withheld ]
Thanks, all.